MICROSCORE-4348 Micros API outage to upgrade RDSs from db.t4g.large to db.m8g.large

Resolved Scheduled Maintenance

Started

Wed, Apr 30, 2025, 06:43 AM UTC

Last Updated

Wed, Apr 30, 2025, 06:43 AM UTC

Resolved

Wed, Apr 30, 2025, 07:00 AM UTC

This maintenance window has been completed (Total duration: 16m)

AI-Powered Analysis

Impact Severity

minor

Incident History

Apr 30, 06:43 UTC
Completed - okie dokie, RDSs have been upgraded as quickly as expected, nothing related seems to have exploded in the time since, so i think we're in the clear, yay

Apr 30, 06:31 UTC
Update - update: from the AWS console for commercial production:
- April 30, 2025, 16:09 (UTC+10:00) Multi-AZ instance failover completed
- April 30, 2025, 16:09 (UTC+10:00) The RDS instance was modified by customer.
- April 30, 2025, 16:08 (UTC+10:00) DB instance restarted
- April 30, 2025, 16:08 (UTC+10:00) The parameter max_wal_senders was set to a value incompatible with replication. It has been adjusted from 20 to 65.
- April 30, 2025, 16:08 (UTC+10:00) Multi-AZ instance failover started.
- April 30, 2025, 16:01 (UTC+10:00) Applying modification to database instance class
so it looks like the failover took about 2 minutes from the RDS perspective (16:08 to 16:09 UTC+10:00, i.e. 06:08 to 06:09 UTC), and we did see some 5XX responses from the Micros API during this time
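
The console entries above are in UTC+10:00 while the rest of this page is in UTC, so here is a minimal sketch for lining the two up. It only uses timestamps copied from the log above; the names `CONSOLE_TZ`, `EVENTS`, `to_utc` and `elapsed_minutes` are made up for illustration, and the console only shows minute granularity, so the durations are approximate.

```python
from datetime import datetime, timedelta, timezone

# Offset shown next to each AWS console entry quoted above.
CONSOLE_TZ = timezone(timedelta(hours=10))

# (console timestamp, message) pairs copied from the log above, oldest first.
EVENTS = [
    ("April 30, 2025, 16:01", "Applying modification to database instance class"),
    ("April 30, 2025, 16:08", "Multi-AZ instance failover started."),
    ("April 30, 2025, 16:09", "Multi-AZ instance failover completed"),
]


def to_utc(stamp: str) -> datetime:
    """Parse a console timestamp (UTC+10:00) and convert it to UTC."""
    local = datetime.strptime(stamp, "%B %d, %Y, %H:%M").replace(tzinfo=CONSOLE_TZ)
    return local.astimezone(timezone.utc)


def elapsed_minutes(start: str, end: str) -> int:
    """Whole minutes between two console timestamps."""
    return int((to_utc(end) - to_utc(start)).total_seconds() // 60)


if __name__ == "__main__":
    for stamp, message in EVENTS:
        print(to_utc(stamp).strftime("%H:%M UTC"), "-", message)
    print(
        "modification start to failover completed:",
        elapsed_minutes(EVENTS[0][0], EVENTS[-1][0]),
        "min",
    )
```

Read this way, the instance class modification started at 06:01 UTC and the failover ran from 06:08 to 06:09 UTC, which sits between the 06:04 and 06:31 updates on this page.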

Apr 30, 06:04 UTC
Update - update: we've kicked off the deployments:
- commercial production: https://deployment-bamboo.internal.atlassian.com/deploy/viewDeploymentResult.action?deploymentResultId=3456381285
- FedRAMP-moderate production: https://deployment-bamboo.internal.atlassian.com/deploy/viewDeploymentResult.action?deploymentResultId=3456381287
- it's hard to predict exactly when, but there should be separate 2-3 minute outages, and those should start 10-15 minutes from now

Apr 30, 05:32 UTC
Scheduled - 2 separate outage periods, each expected to last 2-3 minutes:
- Micros API for commercial production from db.t4g.large to db.m8g.large
- Micros API for FedRAMP-moderate from db.t4g.large to db.m8g.large
- https://hello.atlassian.net/wiki/spaces/MCORE/pages/5218781089/LDR+UA-13147+AWS+deadline+versus+micros-server+RDSs
- https://hello.jira.atlassian.cloud/browse/MICROSCORE-4348
