GitHub
Last updated: 3/6/2026, 7:02:51 PM
Recently Resolved Incidents
Incident with Webhooks
Mar 6, 18:39 UTC
Update - We continue working on mitigations to restore service.
Mar 6, 18:07 UTC
Update - We continue working on mitigations to restore full service.
Mar 6, 17:43 UTC
Update - Our engineers have identified the root cause and are actively implementing mitigations to restore full service.
Mar 6, 17:19 UTC
Update - This problem is impacting less than 1% of UI and webhook API calls.
Mar 6, 17:12 UTC
Update - We are investigating an issue affecting a subset of customers experiencing errors when viewing webhook delivery histories and retrying webhook deliveries. Both the UI and the webhook API are impacted. Engineers have identified the cause and are actively working on mitigation.
Mar 6, 16:58 UTC
Investigating - We are investigating reports of degraded performance for Webhooks
Actions is experiencing degraded availability
Mar 5, 23:55 UTC
Resolved - This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.
Mar 5, 23:40 UTC
Update - We are close to full recovery. Actions and dependent services should be functioning normally now.
Mar 5, 23:37 UTC
Update - Actions is experiencing degraded performance. We are continuing to investigate.
Mar 5, 23:15 UTC
Update - Actions and dependent services, including Pages, are recovering.
Mar 5, 23:00 UTC
Update - We applied a mitigation and we should see a recovery soon.
Mar 5, 22:54 UTC
Update - Actions is experiencing degraded availability. We are continuing to investigate.
Mar 5, 22:53 UTC
Investigating - We are investigating reports of degraded performance for Actions
Multiple services affected by service degradation
Mar 5, 19:30 UTC
Resolved - On Mar 5, 2026, between 16:24 UTC and 19:30 UTC, Actions was degraded. During this time, 95% of workflow runs failed to start within 5 minutes, with an average delay of 30 minutes, and 10% of workflow runs failed with an infrastructure error. This was due to Redis infrastructure updates that were being rolled out to production to improve our resiliency. These changes introduced a set of incorrect configuration changes into our Redis load balancer, causing internal traffic to be routed to an incorrect host and leading to two incidents.
We mitigated this incident by correcting the misconfigured load balancer. Actions jobs were running successfully starting at 17:24 UTC. The remaining time before we closed the incident was spent working through the queue of jobs that had accumulated.
We immediately rolled back the updates that were a contributing factor and have frozen all changes in this area until we have completed the follow-up work from this incident. We are working to improve our automation to ensure incorrect configuration changes are not able to propagate through our infrastructure. We are also working on improved alerting to catch misconfigured load balancers before they cause an incident. Additionally, we are updating the Redis client configuration in Actions to improve resiliency to brief cache interruptions.
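The report does not include the client configuration itself. Purely as an illustration, and assuming a redis-py-style client (the host name and every value below are hypothetical), resiliency to brief cache interruptions usually comes down to short timeouts, bounded retries, and treating the cache as optional:

```python
# Illustrative sketch only: GitHub has not published its Redis client settings.
# This shows the kind of timeout/retry configuration (using the redis-py
# library) that lets a service ride out brief cache interruptions.
import redis
from redis.backoff import ExponentialBackoff
from redis.retry import Retry

client = redis.Redis(
    host="actions-cache.internal.example",  # hypothetical host
    port=6379,
    socket_connect_timeout=0.25,   # fail fast on a bad or unreachable host
    socket_timeout=0.5,            # never let a slow cache call block a job
    retry=Retry(ExponentialBackoff(cap=1.0, base=0.05), retries=3),
    retry_on_timeout=True,
    health_check_interval=30,      # detect dead connections between commands
)

def get_cached(key: str) -> bytes | None:
    """Treat the cache as optional: any Redis error is handled as a miss."""
    try:
        return client.get(key)
    except redis.RedisError:
        return None
```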
Mar 5, 19:17 UTC
Update - Webhooks is operating normally.
Mar 5, 19:05 UTC
Update - Actions is operating normally.
Mar 5, 18:59 UTC
Update - Actions is now fully recovered.
Mar 5, 18:15 UTC
Update - The queue of requested Actions jobs continues to make progress. Job delays are now approximately 6 minutes and continuing to decrease.
Mar 5, 17:48 UTC
Update - We are back to queueing Actions workflow runs at nominal rates, and we are monitoring the clearing of the runs that were queued during the incident.
Mar 5, 17:25 UTC
Update - We have applied mitigations for connection failures across backend resources and we are observing a recovery in queueing Actions workflow runs.
Mar 5, 16:52 UTC
Update - We are observing delays in queuing Actions workflow runs. We’re still investigating the causes of these delays.
Mar 5, 16:47 UTC
Update - Webhooks is experiencing degraded availability. We are continuing to investigate.
Mar 5, 16:41 UTC
Update - Actions is experiencing degraded availability. We are continuing to investigate.
Mar 5, 16:35 UTC
Investigating - We are investigating reports of degraded performance for Actions
Disruption with some GitHub services
Mar 5, 01:30 UTC
Resolved - This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.
Mar 5, 01:30 UTC
Update - Copilot coding agent mission control is fully restored. Tasks are now listed as expected.
Mar 5, 01:21 UTC
Update - Users were temporarily unable to see tasks listed in mission control surfaces. The ability to submit new tasks, view existing tasks via direct link, or manage tasks was unaffected throughout. A revert is currently being deployed and we are seeing recovery.
Mar 5, 01:13 UTC
Investigating - We are investigating reports of impacted performance for some GitHub services.
Some OpenAI models degraded in Copilot
Mar 5, 01:13 UTC
Resolved - This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.
Mar 5, 01:13 UTC
Update - The issues with our upstream model provider have been resolved, and gpt-5.3-codex is once again available in Copilot Chat and across IDE integrations. We will continue monitoring to ensure stability, but mitigation is complete.
Mar 5, 00:53 UTC
Update - We are experiencing degraded availability for the gpt-5.3-codex model in Copilot Chat, VS Code and other Copilot products. This is due to an issue with an upstream model provider. We are working with them to resolve the issue.
Mar 5, 00:47 UTC
Investigating - We are investigating reports of degraded performance for Copilot
Claude Opus 4.6 Fast not appearing for some Copilot users
Mar 3, 21:11 UTC
Resolved - On March 3, 2026, between 19:44 UTC and 21:05 UTC, some GitHub Copilot users reported that the Claude Opus 4.6 Fast model was no longer available in their IDE model selection. After investigation, we confirmed that this was caused by enterprise administrators adjusting their organization's model policies, which correctly removed the model for users in those organizations. No users outside the affected organizations lost access.
We confirmed that the Copilot settings were functioning as designed, and all expected users retained access to the model. The incident was resolved once we verified that the change was intentional and no platform regression had occurred.
Mar 3, 21:05 UTC
Update - We believe that all expected users still have access to Claude Opus 4.6. We confirm that no users have lost access.
Mar 3, 20:31 UTC
Investigating - We are investigating reports of degraded performance for Copilot
Incident with all GitHub services
Mar 3, 20:09 UTC
Resolved - On March 3, 2026, between 18:46 UTC and 20:09 UTC, GitHub experienced a period of degraded availability impacting GitHub.com, the GitHub API, GitHub Actions, Git operations, GitHub Copilot, and other dependent services. At the peak of the incident, GitHub.com request failures reached approximately 40%. During the same period, approximately 43% of GitHub API requests failed. Git operations over HTTP had an error rate of approximately 6%, while SSH was not impacted. GitHub Copilot requests had an error rate of approximately 21%. GitHub Actions experienced less than 1% impact.
This incident shared the same underlying cause as an incident in early February, where we saw a large volume of writes to the user settings caching mechanism. While we were deploying a change to reduce the burden of these writes, a bug caused every user’s cache to expire, get recalculated, and get rewritten. The increased load caused replication delays that cascaded down to all affected services. We mitigated this issue by immediately rolling back the faulty deployment.
We understand these incidents disrupted the workflows of developers. While we have made substantial, long-term investments in how GitHub is built and operated to improve resilience, we acknowledge we have more work to do. Getting there requires deep architectural work that is already underway, as well as urgent, targeted improvements. We are taking the following immediate steps:
- We have added a killswitch and improved monitoring to the caching mechanism to ensure we are notified before there is user impact and can respond swiftly.
- We are moving the cache mechanism to a dedicated host, ensuring that any future issues will solely affect services that rely on it.
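The first of these steps mentions a killswitch on the caching mechanism. The report does not describe its implementation; as a hedged sketch (the flag name, flag store, and cache stand-ins below are all hypothetical), a killswitch typically gates the expensive cache-refresh path behind a dynamically togglable flag so operators can stop the writes without a deploy:

```python
# Illustrative sketch only: GitHub has not published its killswitch design.
# The flag store, flag name, and cache stand-ins below are hypothetical.

# Stand-in for a dynamic feature-flag service backed by a config store.
FLAGS = {"user_settings_cache_refresh": True}

CACHE: dict[str, dict] = {}  # stand-in for the real cache tier


def compute_settings(user_id: int) -> dict:
    """Placeholder for the expensive recalculation of a user's settings."""
    return {"user_id": user_id, "theme": "dark"}


def refresh_user_settings_cache(user_id: int) -> None:
    # Killswitch: if the flag is off, skip the recalculation and rewrite
    # entirely; readers fall back to existing cache entries or the database.
    if not FLAGS.get("user_settings_cache_refresh", False):
        return
    CACHE[f"user_settings:{user_id}"] = compute_settings(user_id)
```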
Mar 3, 20:06 UTC
Update - We're seeing recovery across all services. We're continuing to monitor for full recovery.
Mar 3, 19:55 UTC
Update - Actions is operating normally.
Mar 3, 19:54 UTC
Update - Git Operations is operating normally.
Mar 3, 19:36 UTC
Update - Git Operations is experiencing degraded availability. We are continuing to investigate.
Mar 3, 19:33 UTC
Update - We are seeing recovery across multiple services. Impact is mostly isolated to Git operations at this point; we continue to investigate.
Mar 3, 19:31 UTC
Update - Copilot is operating normally.
Mar 3, 19:31 UTC
Update - Pull Requests is operating normally.
Mar 3, 19:28 UTC
Update - Pull Requests is experiencing degraded performance. We are continuing to investigate.
Mar 3, 19:27 UTC
Update - Issues is operating normally.
Mar 3, 19:25 UTC
Update - Webhooks is operating normally.
Mar 3, 19:25 UTC
Update - Codespaces is operating normally.
Mar 3, 19:24 UTC
Update - Webhooks is experiencing degraded performance. We are continuing to investigate.
Mar 3, 19:23 UTC
Update - Issues is experiencing degraded performance. We are continuing to investigate.
Mar 3, 19:17 UTC
Update - We've identified the issue and have applied a mitigation. We're seeing recovery of services. We continue to monitor for full recovery.
Mar 3, 19:15 UTC
Update - API Requests is operating normally.
Mar 3, 19:14 UTC
Update - API Requests is experiencing degraded performance. We are continuing to investigate.
Mar 3, 19:11 UTC
Update - Codespaces is experiencing degraded performance. We are continuing to investigate.
Mar 3, 19:05 UTC
Update - Pull Requests is experiencing degraded availability. We are continuing to investigate.
Mar 3, 19:04 UTC
Update - Webhooks is experiencing degraded availability. We are continuing to investigate.
Mar 3, 19:03 UTC
Update - We're seeing some service degradation across GitHub services. We're currently investigating impact.
Mar 3, 19:02 UTC
Update - Webhooks is experiencing degraded performance. We are continuing to investigate.
Mar 3, 19:00 UTC
Update - Pull Requests is experiencing degraded performance. We are continuing to investigate.
Mar 3, 19:00 UTC
Update - API Requests is experiencing degraded availability. We are continuing to investigate.
Mar 3, 18:59 UTC
Investigating - We are investigating reports of degraded availability for Actions, Copilot and Issues
Delayed visibility of newly added issues on project boards
Mar 3, 05:54 UTC
Resolved - Between March 2, 21:42 UTC and March 3, 05:54 UTC, project board updates, including adding new issues, PRs, and draft items to boards, were delayed by 30 minutes to over 2 hours as a large backlog of messages accumulated in the Projects data denormalization pipeline.
The incident was caused by an anomalously large event that required longer processing time than expected. Processing this message exceeded the Kafka consumer heartbeat timeout, triggering repeated consumer group rebalances. As a result, the consumer group was unable to make forward progress, creating head-of-line blocking that delayed processing of subsequent project board updates.
We mitigated the issue by deploying a targeted fix that safely bypassed the offending message and allowed normal message consumption to resume. Consumer group stability recovered at 04:10 UTC, after which the backlog began draining. All queued messages were fully processed by 05:53 UTC, returning project board updates to normal processing latency.
We have identified several follow-up improvements to reduce the likelihood and impact of similar incidents in the future, including improved monitoring and alerting, as well as introducing limits for unusually large project events.
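The report does not include the targeted fix itself. As a rough illustration only (the topic name, group id, size limit, and use of the confluent-kafka Python client are all assumptions, not GitHub's actual pipeline), bypassing a poison message generally means detecting it, committing past its offset without processing it, and giving slow messages enough headroom that the group stops rebalancing:

```python
# Illustrative sketch only: topic, group id, and threshold are hypothetical,
# and GitHub's denormalization pipeline is not necessarily built this way.
# The idea matches the mitigation above: skip a record that cannot be
# processed safely and commit its offset so head-of-line blocking clears.
from confluent_kafka import Consumer

MAX_EVENT_BYTES = 1_000_000  # hypothetical limit on project event size

consumer = Consumer({
    "bootstrap.servers": "kafka.internal.example:9092",
    "group.id": "projects-denormalization",
    "enable.auto.commit": False,
    # Extra headroom before the broker treats a slow consumer as failed and
    # triggers yet another rebalance.
    "max.poll.interval.ms": 600_000,
})
consumer.subscribe(["project-events"])


def handle_project_event(payload: bytes) -> None:
    """Placeholder for the real project board denormalization work."""


while True:
    msg = consumer.poll(timeout=1.0)
    if msg is None or msg.error():
        continue
    if len(msg.value() or b"") > MAX_EVENT_BYTES:
        # Bypass the offending message: commit its offset without processing
        # it so subsequent project board updates can drain.
        consumer.commit(message=msg)
        continue
    handle_project_event(msg.value())
    consumer.commit(message=msg)
```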
Mar 3, 05:53 UTC
Update - This incident has been resolved. Project board updates are now processing in near-real-time.
Mar 3, 04:36 UTC
Update - The backlog of delayed updates is expected to fully clear within approximately 1 hour, after which project board updates will return to near-real-time.
Mar 3, 04:17 UTC
Update - The fix has been deployed and processing speeds have returned to normal. There is a backlog of delayed updates that will continue to be worked through — we're estimating how long that will take and will provide an update in the next 60 minutes.
Mar 3, 03:22 UTC
Update - The fix is still building and is expected to deploy within 60 minutes. The current delay for GitHub Projects updates has increased to up to 5 hours.
Mar 3, 02:27 UTC
Update - We're deploying a fix targeting the increased delay in GitHub Projects updates. The rollout should complete within 60 minutes. If successful, the current delay of up to 4 hours should begin to decrease.
Mar 3, 01:40 UTC
Update - The delay for project board updates has increased to up to 3 hours. We've identified a potential cause and are working on remediation.
Mar 3, 00:52 UTC
Update - Project board updates — including adding issues, pull requests, and changing fields such as "Status" — are currently delayed by 1–2 hours. Normal behavior is near-real-time. We're actively investigating the root cause.
Mar 3, 00:05 UTC
Update - The impact extends beyond adding issues to project boards. Adding pull requests and updating fields such as "Status" may also be affected. We're continuing to investigate the root cause.
Mar 2, 23:46 UTC
Update - Newly added issues are taking 30–60 minutes to appear on project boards, compared to the normal near-real-time behavior. We're investigating the root cause and possible mitigations.
Mar 2, 23:12 UTC
Update - Newly added issues can take up to 30 minutes to appear on project boards. We're investigating the cause of this delay.
Mar 2, 23:11 UTC
Update - Issues is experiencing degraded performance. We are continuing to investigate.
Mar 2, 23:10 UTC
Investigating - We are investigating reports of impacted performance for some GitHub services.
Incident with Pull Requests /pulls
Mar 2, 22:04 UTC
Resolved - On March 2, 2026, between 07:10 UTC and 22:04 UTC, the Pull Requests service was degraded. Users navigating between tabs on the pull requests dashboard were met with 404 errors or blank pages.
This was due to a configuration change deployed on February 27 at 23:03 UTC. We mitigated the incident by reverting the change.
We’re working to improve monitoring for the page to automatically detect and alert us to routing failures.
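GitHub has not described what that monitoring will look like. As a loose sketch only (a simplified external probe, whereas real monitoring would likely run inside the serving infrastructure; the tab URLs and failure criteria are assumptions), a synthetic check for this class of routing failure might look like:

```python
# Illustrative sketch only: a simplified synthetic check that flags dashboard
# tabs returning 404s or empty pages. URLs and failure criteria are assumptions.
import requests

PULLS_TABS = [
    "https://github.com/pulls",
    "https://github.com/pulls/assigned",
    "https://github.com/pulls/mentioned",
    "https://github.com/pulls/review-requested",
]


def check_pulls_routing() -> list[str]:
    """Return a list of tab URLs that look broken (404 or empty body)."""
    failures = []
    for url in PULLS_TABS:
        resp = requests.get(url, timeout=10)
        if resp.status_code == 404 or not resp.text.strip():
            failures.append(f"{url} -> HTTP {resp.status_code}")
    return failures


if __name__ == "__main__":
    for failure in check_pulls_routing():
        print(f"ALERT: routing failure: {failure}")
```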
Mar 2, 22:04 UTC
Update - The issue on https://github.com/pulls is now fully resolved. All tabs are working again.
Mar 2, 21:04 UTC
Update - We're deploying a fix for pull request filtering. Full rollout across all regions is expected within 60 minutes.
Mar 2, 20:02 UTC
Update - We are experiencing issues with the Pull Requests dashboard that prevent users from filtering their pull requests. We have identified a mitigation and are deploying a fix. We'll post another update by 21:00 UTC.
Mar 2, 19:23 UTC
Update - We are seeing a degraded experience when attempting to filter the /pulls dashboard. We are working on a mitigation.
Mar 2, 19:11 UTC
Investigating - We are investigating reports of degraded performance for Pull Requests
Incident with Copilot agent sessions
Feb 27, 23:49 UTC
Resolved - On February 27, 2026, between 22:53 UTC and 23:46 UTC, the Copilot coding agent service experienced elevated errors and degraded functionality for agent sessions. Approximately 87% of attempts to start or interact with agent sessions encountered errors during this period.
This was due to an expired authentication credential for an internal service component, which prevented Copilot agent session operations from completing successfully.
We mitigated the incident by rotating the expired credential and deploying the updated configuration to production. Services began recovering within minutes of the fix being deployed.
We are working to improve automated credential rotation coverage across all Copilot service components, add proactive alerting for credentials approaching expiration, and validate configuration consistency to reduce our time to detection and mitigation of issues like this one in the future.
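The report does not detail the new alerting. As a hedged sketch (credential names, expiry dates, warning window, and the alert hook are all hypothetical), proactive expiry alerting generally reduces to comparing each credential's expiry timestamp against a warning window on a schedule:

```python
# Illustrative sketch only: credential names, dates, and the alert hook are
# hypothetical. The idea is to page someone while there is still time to
# rotate a credential, instead of discovering the expiry through an outage.
from datetime import datetime, timedelta, timezone

WARN_WINDOW = timedelta(days=14)  # hypothetical rotation lead time

# Stand-in for an inventory of internal service credentials and their expiries.
CREDENTIALS = {
    "copilot-agent-session-signer": datetime(2026, 3, 10, tzinfo=timezone.utc),
    "copilot-model-proxy-token": datetime(2026, 9, 1, tzinfo=timezone.utc),
}


def alert(message: str) -> None:
    """Placeholder for the real paging/alerting integration."""
    print(f"ALERT: {message}")


def check_credential_expiry(now: datetime | None = None) -> None:
    now = now or datetime.now(timezone.utc)
    for name, expires_at in CREDENTIALS.items():
        remaining = expires_at - now
        if remaining <= timedelta(0):
            alert(f"{name} has expired ({expires_at.isoformat()})")
        elif remaining <= WARN_WINDOW:
            alert(f"{name} expires in {remaining.days} days; rotate it now")


if __name__ == "__main__":
    check_credential_expiry()
```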
Feb 27, 23:45 UTC
Update - We have identified the cause of the elevated errors and are rolling out a fix to production. We are observing initial recovery in Copilot agent sessions.
Feb 27, 23:35 UTC
Update - We are investigating networking issues with some requests to our models.
Feb 27, 23:18 UTC
Update - We are investigating a spike in errors in Copilot agent sessions.
Feb 27, 23:18 UTC
Investigating - We are investigating reports of degraded performance for Copilot
Showing 10 of 25 resolved incidents