
GitHub, a critical service for millions of organizations, suffered 49 minutes or more of Git downtime on January 13th, following a faulty configuration change, highlighting that dependency on this cloud service is not without risk.
GitHub’s official status report stated Git downtime of 49 minutes but some reported a longer outage. “It was down for ~2 hours … either GitHub didn’t know how to communicate, or they were not sure about the real impact,” said one user.
This was perhaps the worst GitHub outage since 14 August 2024, when all GitHub services were inaccessible for all users for a period, again caused by a faulty configuration change, in that case to GitHub.com databases.
In August the Microsoft-owned DevOps giant stated that “to prevent recurrence, we are implementing additional guardrails in our database change management process” as well as “more resilience to dependency failures.”
This time around, the issue was with the configuration of the internal load balancer, and the company made a similar promise, saying it would improve its “monitoring and deployment practices to reduce our time to detection and automated mitigation for issues like this in future.”
Git is the distributed version control system at the heart of GitHub repositories. Although it was not the whole of GitHub that failed, Git is a critical dependency for many other services, which rely on it to retrieve the latest and correct version of the code in a repository.
One mitigating factor is that distributed version control systems like Git let developers keep working from the copy of the repository on their own machines, so a short outage is manageable. The problem remains, though, for CI/CD (continuous integration/continuous deployment) systems, whether GitHub Actions or other pipelines set up to pull from GitHub repositories, where an outage blocks deployment.
Some developers misinterpreted what was happening, causing them extra work. For example, one developer received the message “permission denied (publickey). Fatal: could not read from remote repository,” and went on to check their SSH keys. Another “spent the last hour troubleshooting this deleting/re-adding keys, restarting my servers, generating new keys.” The lesson, perhaps, is always to check GitHub status before further troubleshooting.
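That status check can be scripted so it runs before anyone starts regenerating keys. The following is a minimal sketch rather than an official GitHub tool. It assumes that GitHub's status page serves the standard Statuspage JSON at the URL shown, and that each component carries a `name` and a `status` field; the function names are hypothetical.

```python
import json
import urllib.request

# Assumption: GitHub's status page exposes the usual Statuspage
# components feed at this address. Verify before relying on it.
STATUS_URL = "https://www.githubstatus.com/api/v2/components.json"


def degraded_components(payload):
    """Return 'name: status' for every component that is not operational."""
    return [
        f"{component['name']}: {component['status']}"
        for component in payload.get("components", [])
        if component.get("status") != "operational"
    ]


def fetch_status(url=STATUS_URL, timeout=5.0):
    """Download and decode the status feed (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def main():
    """Print any degraded components, or a hint to look at local setup."""
    problems = degraded_components(fetch_status())
    if problems:
        print("GitHub reports problems; SSH keys are probably fine:")
        for line in problems:
            print("  " + line)
    else:
        print("GitHub reports all components operational; check local setup.")
```

Calling `main()` before touching SSH configuration gives a quick first signal. As the disputed outage duration suggests, though, the status page can lag behind what users experience, so an all-clear result is not conclusive.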
GitHub has over 100 million developers and even a short outage is consequential. Is self-hosting less risky? “We self-host GitHub using GitHub Enterprise Server … I would say we’ve easily exceeded GitHub.com’s uptime in the last 12 months,” said a comment on the latest incident. That said, self-hosting has its own risks, and the unparalleled scale of GitHub.com gives it global resources that self-hosting cannot match.