It’s been almost two weeks since Amazon’s EC2 and RDS services took a fall, bringing an enormous number of websites and web apps to their knees.
One of the biggest criticisms of the whole affair was Amazon’s deafening silence. While startups and businesses were left to apologise to their customers for Amazon’s failure, Amazon itself said nothing.
Today, two weeks later, Amazon has issued an apology and a post mortem, found here. What caused the issue? In summary, it appears the whole thing stems from operator error: one (human) mistake caused the initial outage, which before long began affecting other facets of Amazon’s service. Amazon plans to increase automation in an effort to prevent human error going forward.
In its apology, Amazon says the following:
“…we want to apologize. We know how critical our services are to our customers’ businesses and we will do everything we can to learn from this event and use it to drive improvement across our services. As with any significant operational issue, we will spend many hours over the coming days and weeks improving our understanding of the details of the various parts of this event and determining how to make changes to improve our services and processes.”
Amazon has also issued a 10-day credit to customers in the affected zone.
The biggest question remains: what impact will this have on Amazon’s web services business, and will we see an immediate distrust in the cloud? I like to think not, but this, combined with the hacking of Sony’s PlayStation Network, is unquestionably going to leave many customers (both business and consumer) hesitant about the future of a cloud-based world.