This article was published on December 7, 2010

Tumblr Explains Its 24 Hours of Downtime

Story by

Adam Mills

Adam is a technology blogger based in San Francisco, California who loves his iPhone 3GS and Motorola Droid 2 equally. You can follow him on Twitter or reach him by email at [email protected]

Rumors abounded as to why microblogging site Tumblr went down for about 24 hours, and most of them centered on the folks over at 4chan. Turns out, at least according to Tumblr, that it wasn’t 4chan after all.

It was an issue during a planned maintenance period.

Tumblr, in an official blog post from CEO David Karp (that’s right, they can finally post on their own blog again), states that during planned maintenance that wasn’t supposed to interrupt service, a problem took down a critical database cluster, which in turn brought the entire network down. After that, their engineers worked around the clock to bring it back online, an effort that took an entire day.

That’s it.

That’s the entire explanation, and it is not going to sit well with a user base, some of whom still can’t access their blogs, Timeline or Dashboard.

For a site that not only sits among the top 50 U.S. websites in terms of traffic but also recently ventured into the world of e-commerce, 24 hours of downtime is horrific, to say the least. Tumblr even admits that it wasn’t prepared for the amount of traffic that hits the site on a monthly basis, stating that:

Frankly, keeping up with growth has presented more work than our small team was prepared for — with traffic now climbing more than 500M pageviews each month. But we are determined and focused on bringing our infrastructure well ahead of capacity as quickly as possible. We’ve nearly quadrupled our engineering team this month alone, and continue to distribute and enhance our architecture to be more resilient to failures like today’s.

The impact of the outage remains to be seen. Many users, here and on other forums like Twitter, said they planned on leaving the site for good because of this. And if they weren’t threatening, they were complaining. And if they weren’t complaining, they were laughing about how ridiculous it was that a site as big and as popular as Tumblr could be down for 24 hours.

Some users even said that the downtime didn’t bother them, not because they were patient but because it happened so frequently, something that Karp even acknowledges in his post:

While you might feel like you’ve gotten used to seeing errors on Tumblr recently, know that this is absolutely unacceptable to our team, and unacceptable for a platform determined to be the best place in the world for your creative expression.

There are many lessons to be learned here, but if one stands above the others, it’s this:

If Tumblr plans on expanding, plans on becoming a major player, plans on keeping a large user base, then things like the past 24 hours can never happen again.