There’s an element of fear that organizations have to overcome when moving their Web services off-premise and into the public cloud, especially for companies like Netflix that handle such a large chunk of the world’s traffic. After all, switching to a third-party infrastructure means relinquishing full control over your stack.
To counter the fog of war that the cloud brings, a growing number of companies have turned to Boundary for insight into their network performance. Boundary makes one of the few real-time monitoring tools that restore deep visibility into off-site cloud operations such as those running on AWS.
Back in May, Boundary announced a major upgrade to its service that adds a wealth of new metrics, notification options and collaborative messaging features, all intended to help clients keep their websites' uptime as high as possible.
In a statement, Boundary CEO Gary Read made some fairly bombastic claims about how his company stacks up against the competition now:
With today’s announcement we have effectively made all previous monitoring tools redundant. They will soon be consigned to the history books of technology. Legacy IT monitoring tools are the slide rules to the Boundary digital calculator, totally defunct and obsolete. With users demanding highly responsive applications, attempting to depend on prehistoric once-a-minute or worse sampling periods for monitoring will cost you money, customers and probably your job.
We’ve been curious about how Boundary’s technology works, so we spoke with the firm and its partners to learn more about the role it plays in helping you binge-watch that extra episode of House of Cards on launch day.
“In a nutshell, the streaming business is the perfect place for leveraging cloud infrastructure because of the flexible scalable nature,” Scott Fingerhut, Boundary’s VP of Worldwide Marketing, said in an interview. “It’s a live supply chain. The model before was you had to predict the highs and buy everything for the highs. If you were hosting a site, you needed to have the max.”
Fingerhut added that, while cloud providers like AWS have “completely upended” the traditional hosting model by allowing organizations to spin up hundreds of thousands of servers in minutes, doing so comes with a cost.
“In outsourcing, you’re losing visibility. You’re infrastructure blind when you go out to those places,” he said.
Boundary steps in by providing real-time metrics on individual node performance. Fingerhut compared the product to having a high-definition recording of a traffic pileup that you can rewind to find the first car that caused the accident.
“One of the core pieces we have is extremely lightweight meters that are put on either the physical or virtual machines,” he said, adding that it’s just a few lines of C code.
The meters then collect information about what’s moving in or out of a service or node. Companies can then use the data to build a topology of the flow of their network.
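To make the idea of a flow topology concrete, here is a minimal Python sketch. It is not Boundary's implementation: the record format (source, destination, bytes), the node names and the `build_topology` helper are all assumptions made for illustration. It simply sums observed traffic per directed link to produce a weighted map of which nodes talk to which.

```python
from collections import defaultdict

# Hypothetical flow records: (source node, destination node, bytes observed).
# Real meters would report far richer data; this shape is an assumption.
flows = [
    ("web-1", "api-1", 1200),
    ("web-1", "api-1", 800),
    ("api-1", "db-1", 500),
    ("web-2", "api-1", 300),
]

def build_topology(records):
    """Sum bytes per directed edge to get a weighted adjacency map."""
    topology = defaultdict(lambda: defaultdict(int))
    for src, dst, nbytes in records:
        topology[src][dst] += nbytes
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {src: dict(dsts) for src, dsts in topology.items()}

topology = build_topology(flows)
for src, dsts in sorted(topology.items()):
    for dst, total in sorted(dsts.items()):
        print(f"{src} -> {dst}: {total} bytes")
```

A map like this is what lets an operator see at a glance which links carry the most traffic and where a slowdown is likely to spread.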
Boundary processes over a trillion metrics for its customers per day on average, with 500 billion of those coming just from AWS. The service provides feedback in seconds, compared to traditional diagnostics that take as long as an hour to notify when a problem occurs.
For instance, AWS had a couple of outages on the East Coast a few months back. One of Boundary’s clients saw the degradation occurring in real time and moved its computing power to a different server without skipping a beat.
Another benefit for Boundary customers is using the service to optimize their networks by tracking down any wasted computing power. One cloud storage company used Boundary to discover that pirates were using its service to illegally stream media in Turkey and Greece. By blocking the activity, the firm saved over $60,000 in hardware costs, not to mention management time.
Boundary helped give Netflix the confidence it needed to switch to AWS instead of building its own infrastructure. Netflix declined to comment for this article, citing company policy, but we did speak to Ariel Tseitlin, Netflix’s former head of cloud solutions and now an investor at Scale Venture Partners. To be clear, Tseitlin no longer represents Netflix, but he is able to share publicly available insights from his time at the company.
“Netflix itself used to be a monolithic application back when it was still running in the data center,” Tseitlin said. “What you’re seeing now is the disintermediation of that. Now it’s split into the service-oriented architecture.”
The company has now evolved into an API-driven service where every piece of functionality is accessed via either an internal or external API.
“That’s the norm for how applications are built and scaled horizontally. It lets you have these levers and elasticity for every piece of your architecture,” he added.
However, the main challenge of service-oriented architecture is that it adds management complexity, as there’s no single entity to monitor.
“Now you’re dealing with dozens, often hundreds of services that have to come together in a seamless way in order to deliver the user experience,” Tseitlin said, adding that even 99.9-percent uptime for individual components compounds to only about 90 percent overall availability when a request depends on a hundred different services.
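The compounding effect is easy to check with a few lines of Python. This is a back-of-the-envelope sketch, assuming that every service must be up for a request to succeed and that failures are independent; neither assumption comes from Netflix, and `combined_availability` is a name invented here.

```python
def combined_availability(per_service, count):
    """Availability of a chain in which all `count` services must be up,
    assuming each fails independently of the others."""
    return per_service ** count

for uptime in (0.999, 0.99):
    total = combined_availability(uptime, 100)
    print(f"{uptime:.1%} per service x 100 services -> {total:.1%} overall")
```

Under those assumptions, a hundred services at 99.9 percent each work out to roughly 90.5 percent overall, and at 99 percent each the figure drops to about 36.6 percent.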
“That’s where tools like Boundary can really differentiate themselves,” he continued. “They give you visibility into all the different services that are running on your architecture… It lets you come up with an architecture that is transparent to your end users.”
Tseitlin cited an example of Netflix’s movie recommendations. While the personalization is a significant benefit to the service, the last thing Netflix wants is for the whole service to go down if there’s a problem with recommendations. So if the personalization engine goes down, Netflix can just serve a generic list of top-ranked movies or a cache of earlier recommendations. Like many other networks, the system had to be architected to handle individual component failure.
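A minimal sketch of that kind of fallback might look like the following. It is an illustration only, not Netflix's code: `fetch_personalized`, the `cache` dictionary, the `TOP_RANKED` list and the titles in it are hypothetical stand-ins.

```python
def fetch_personalized(user_id):
    """Hypothetical stand-in for a call to a personalization service.
    Here it always fails, to simulate an outage."""
    raise ConnectionError("recommendation service unavailable")

TOP_RANKED = ["Title A", "Title B", "Title C"]   # hypothetical generic list
cache = {"user-42": ["Title X", "Title Y"]}      # hypothetical earlier results

def recommendations(user_id, fetch=fetch_personalized):
    """Return personalized picks, degrading gracefully on failure."""
    try:
        result = fetch(user_id)
        cache[user_id] = result  # remember the latest good answer
        return result
    except ConnectionError:
        # Prefer the user's cached picks; otherwise fall back to the generic list.
        return cache.get(user_id, TOP_RANKED)

print(recommendations("user-42"))  # served from the cache
print(recommendations("user-7"))   # served from the generic list
```

The point of the design is that a failure in one component changes what the user sees, but not whether the page loads at all.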
“Boundary makes AWS more valuable because it takes away one of the major concerns of moving into that environment because you lose that visibility by moving into that infrastructure layer,” Tseitlin said.
While Netflix had more control when it managed its own data centers, Tseitlin admitted that the company experienced more outages while running its network in-house than it did after moving into the public cloud.
Scripps Networks, the parent company of popular TV properties like the Food Network, HGTV and the Travel Channel, also uses Boundary to monitor all of its AWS implementations. Allen Shacklock, Scripps Networks’ lead cloud architect, noted that the extra visibility came in handy for monitoring throughput of all the user-generated content at Food.com during last year’s holiday cooking rush.
Scripps began switching its brand websites over to AWS more than a year ago, with plans to also move its streaming traffic and content over throughout 2014. The goal is to move completely off-premise with the help of AWS.
“A lot of what we’re looking for is reliability and availability. Scripps isn’t technically a technology company. We face more issues with reliability and availability within our own data center,” Shacklock said. “Our development community really enjoys the innovation and agility that AWS provides. Of course, management loves that we don’t have the IT overhead as much for support.”
Shacklock admitted that Scripps would still have gone forward with the switch to AWS even without Boundary, but it would have adversely affected IT and management’s comfort levels.
“Boundary is the only tool out there that provides the network information we need that’s not a hardware solution where we’d have to install it,” Shacklock said. “It wouldn’t have stopped us from moving to AWS, but it definitely helped us sell the move.”
As users have come to increasingly rely on cloud services for their email, entertainment and productivity, uptime has become that much more important. At the same time, handing operations off-site to companies like Amazon has cut down on in-house visibility. While guaranteed 100% uptime may not be possible, Boundary at least gives some of our favorite websites and services the insight they need to react quickly when things go wrong.
Image credits: iStockphoto