On Monday, GitHub.com went down. Although the company took quite a while to fix the issue completely, the service was fully offline (“Major Service Outage,” as opposed to “Partial Service Outage”) for a total of 19 minutes and 59 seconds. Today, Tuesday, GitHub is down again, and at the time of writing the outage has already lasted more than half an hour.
A quick check on downforeveryoneorjustme.com shows that GitHub is indeed offline around the world: “It’s not just you! http://github.com looks down from here.” Another site, isitdownrightnow.com, says the same: “Github.com is DOWN for everyone. It is not just you. The server is not responding…”
Over on status.github.com, we can see that today’s problem started early this morning (this log is being updated, so refresh if you want the latest):
07:27 AM PST: Processing through a queue backlog. We’ll update when we’re all caught up.
07:42 AM PST: All caught up.
08:19 AM PST: Connectivity problems. Investigating
08:25 AM PST: Investigating DB problems
08:35 AM PST: We’ve taken a bad DB down and are working to return the DB cluster to a normal state now.
08:58 AM PST: DB cluster is slowly recovering.
09:10 AM PST: Performance is still impacted such that the majority of requests are hitting unicorns. Caches are slowly warming.
09:27 AM PST: Unicorns are decreasing in frequency.
09:38 AM PST: Performance is returning to normal.
09:50 AM PST: Performance is largely back to normal. We’ll be following up with more information on this outage and our plans to resolve the issues on the blog soon.
01:56 PM PST: Issue, Repository, and User search indices may be returning incorrect results after today’s DB maintenance. We’re rebuilding the search index now.
07:09 PM PST: Search indexes are still missing some results. We’re working on backfilling through the night.
Yesterday’s log sheds some light on how long today’s issue might last:
07:05 AM PST: Investigating database problems.
07:13 AM PST: We failed over to one of our secondary DBs. Services are recovering, but slow.
07:19 AM PST: Back down. Investigating.
07:22 AM PST: Back on the primary DB and recovering slowly.
07:54 AM PST: Service is recovering. Performance will continue to be degraded for a short while.
09:29 AM PST: All systems go.
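As a rough yardstick, the time from yesterday's first report to the all-clear can be worked out from the log timestamps. The sketch below is only an illustration: the names `LOG`, `parse`, and `elapsed_minutes` are made up for this example, the messages are abridged, and it assumes all entries fall on the same day.

```python
from datetime import datetime

# Timestamps copied from yesterday's status log above (published as PST).
# Messages are abridged; all entries are assumed to be on the same day.
LOG = [
    ("07:05 AM", "Investigating database problems."),
    ("07:13 AM", "Failed over to a secondary DB."),
    ("07:19 AM", "Back down. Investigating."),
    ("07:22 AM", "Back on the primary DB."),
    ("07:54 AM", "Service is recovering."),
    ("09:29 AM", "All systems go."),
]


def parse(stamp):
    """Parse a 12-hour clock time such as '07:05 AM'."""
    return datetime.strptime(stamp, "%I:%M %p")


def elapsed_minutes(log):
    """Whole minutes between the first and last entries of the log."""
    first = parse(log[0][0])
    last = parse(log[-1][0])
    return int((last - first).total_seconds() // 60)


minutes = elapsed_minutes(LOG)
print(f"{minutes // 60}h {minutes % 60}m from first report to all clear")
```

By that measure, Monday's incident ran 2 hours and 24 minutes from the first report to “All systems go,” far longer than the stretch during which the site was fully offline.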
The downtime has also been confirmed by GitHub on Twitter:
We’re down for emergency db maintenance. Investigating now. Updates at status.github.com
— GitHub (@github) September 11, 2012
Update at 9:20 AM PST: The service is starting to come back now. That puts the total downtime at 49 minutes and 50 seconds. GitHub’s status page, however, still says “Major Service Outage.” I will update this post when that changes.
Update at 9:35 AM PST: We’re at “Partial service outage” now. Things are looking up.
Update at 7:00 PM PST: We’re at “Battle station fully operational.”
Update on September 14: GitHub has published a follow-up post, “GitHub availability this week.”
Image credit: stock.xchng