As we quickly approach the start of GLUE Conference 2013, I wanted to take a minute to sit down with one of GLUE’s keynote speakers, Adrian Cockcroft of Netflix. Since GLUE is all about the ties that bind when it comes to the Internet (cloud, APIs, backends, full-stack development, etc.), it made sense to talk to one of Netflix’s Cloud Architects about how the company has changed as it has grown.
Cockcroft spoke at last year’s GLUE as well, where he filled a half-day tutorial class on cloud architecture to the point of having to turn folks away. But, according to Cockcroft, what he was really speaking about was the abstract architecture of Netflix in some detail, as the company hadn’t finished implementing all of the code. This year, his keynote and tutorial will focus on the practical details about the code and how to use it.
For Netflix, the best way to get others involved is by taking the open-source approach. Since Netflix itself was using open-source code from Cassandra, and contributing fixes to it, the company was already comfortable within the open-source community. Toward the end of 2011, the company began contributing standalone projects such as Curator, a client library for ZooKeeper.
As time has moved on, the benefits of living within open-source have continued to push Netflix in its use of the platforms. For Cockcroft, those benefits can be narrowed down into four key areas:
- Validation of Approach: The code is extended, and made better, when other people work on it.
- Contribution Ecosystem: The community finds bugs, and it builds what Netflix didn’t think of itself.
- Quality Control: Open source is, simply, better-quality code.
- The People: Applicants to the company already understand its architecture, making Netflix a better place to work.
We talked a bit as well about Netflix’s reliance on AWS. For Cockcroft, despite the occasional problems, it’s still a better option than hosting in the company’s own datacenters; in fact, he says the problems are the same either way. The advantage, of course, is that provisioning takes minutes instead of weeks. Beyond that, however, there’s the fact that it’s now easier for Netflix to do one-off projects.
Netflix’s redundancy strategy is to deploy three copies of everything, in different availability zones and buildings. While we have taken a look at the Chaos Monkey, Cockcroft tells me that there’s a bigger threat in the company’s internal tool dubbed the Chaos Gorilla. While the Monkey may take out one or two services, the Gorilla will actually take down an entire zone.
“Every time we do that, we find something that is not quite right; a weakness or a bug. You can test to pass or you can test to fail. We make sure that we push our systems into failure so that we know how they react.”
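To make the difference between the two tools concrete, here is a minimal sketch in Python of the idea Cockcroft describes. It is not Netflix’s code: the `Fleet` and `Instance` classes, the zone names and the service names are all hypothetical, and the real tools act on live AWS infrastructure rather than an in-memory list. The sketch only assumes that every service runs one copy in each of three zones.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Instance:
    """One running copy of a service in a single zone (hypothetical model)."""
    service: str
    zone: str
    healthy: bool = True


@dataclass
class Fleet:
    """A toy in-memory stand-in for a set of deployed instances."""
    instances: list = field(default_factory=list)

    def deploy(self, services, zones):
        # Deploy one copy of every service into every zone.
        for service in services:
            for zone in zones:
                self.instances.append(Instance(service, zone))

    def is_available(self, service):
        # A service is up while at least one healthy copy remains.
        return any(i.healthy for i in self.instances if i.service == service)

    def chaos_monkey(self, rng):
        # Monkey-style failure: take out a single healthy instance at random.
        victim = rng.choice([i for i in self.instances if i.healthy])
        victim.healthy = False
        return victim

    def chaos_gorilla(self, zone):
        # Gorilla-style failure: take out every instance in an entire zone.
        for instance in self.instances:
            if instance.zone == zone:
                instance.healthy = False


if __name__ == "__main__":
    fleet = Fleet()
    fleet.deploy(["api", "playback"], ["zone-a", "zone-b", "zone-c"])

    fleet.chaos_monkey(random.Random(0))
    fleet.chaos_gorilla("zone-a")

    for service in ("api", "playback"):
        print(service, "available:", fleet.is_available(service))
```

With three copies per service, losing one instance or one whole zone still leaves at least one healthy copy of everything, which is the property that “testing to fail” is meant to verify.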
So what’s next for Netflix? We’re told that the company will continue to push itself toward failure mitigation on a regional level. The goal is to have a 50/50 split of Netflix services, from coast to coast in the US, so that when the next Hurricane Sandy hits, Netflix customers won’t feel an impact on their services. The company is also awarding prizes for contributing to its open-source platform. There are 10 prizes at $10,000 each, and all of the code is readily available on the Netflix GitHub site. So far over 700 people have forked the project, and the winners will be chosen in December.
Cockcroft is only one of 13 keynote speakers at GLUE, and that’s not to mention the breakouts, tracks and workshops that will be available, included within the cost of your ticket. If you’re one of the people who build the Internet, then there’s positively no conference that will provide more information than GLUE. Come, join us in Broomfield as we talk about what folks will call bleeding edge two years from now. Transport yourself into the future, and let’s build something amazing.
Use code TNW321 for 10% off of your registration.