Let’s be frank: when AWS goes down, the entire technology community cries in unison. That’s the power of the cloud systems that applications are built upon: take down the platform that supports the services you love and cherish, and the moment you can’t snap an Instagram, life goes Luddite.
This sort of failure, what Compuware calls a ‘systemic Internet outage’ (“when failed Cloud or third-party web services slow or knock out other websites in a ‘ripple effect’”), is not simply an annoyance; it is a fact of digital life. However, as always, more information is power.
Enter Compuware’s new tool, Outage Analyzer, which tracks systemic outages around the world, judging them based on how certain it is of their existence, and what likely caused the failure. Here’s a shot taken from the service last night:
That outage, as you can see, concerns part of Microsoft’s Azure platform. The outage appears to be in the early stages – note how the last update and outage start time are in sync – which is matched by a very low certainty rating. Other outages you will find are marked at 100%.
How does it work? Tapping into Compuware’s data stream, Outage Analyzer employs a ‘proprietary anomaly detection engine’ to uncover, pinpoint, and find the cause of disruptions around the world.
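Compuware hasn’t published how its engine works, so the following is only a toy illustration of what anomaly detection means in general, not the company’s method. It flags a response-time sample that sits far above the average of the few samples before it. The function name, window size, threshold, and sample data are all made up for the sketch.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples far above the recent baseline.

    A sample is flagged when it is more than `threshold` standard
    deviations above the mean of the `window` samples before it.
    (Hypothetical helper; the defaults are arbitrary.)
    """
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: treat any increase as anomalous.
            if samples[i] > mu:
                anomalies.append(i)
            continue
        if (samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up response times in milliseconds, with one obvious spike.
response_times_ms = [120, 118, 125, 121, 119, 122, 950, 117, 123]
print(find_anomalies(response_times_ms))  # prints [6]
```

A real service would presumably correlate many such signals across sites and providers before calling something a systemic outage; this sketch only looks at a single series.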
One of the more interesting pieces of Outage Analyzer is the ability to rewind and re-watch outages. That feature may be useful for developers looking for a way to track early signs of trouble, perhaps allowing them to spin up redundant capacity.
But even if you don’t code, it’s fun to scroll through the world and see what is down, and how long folks have been suffering.
Top Image Credit: Intel Free Press