In the second part of TNW’s enterprise cloud computing series, we look at whether the Cloud is ready for mission critical apps.
What does “mission critical” mean?
A mission critical application is one that, if it fails, will affect the day-to-day running of the business or damage its long-term viability.
Depending on the nature of your business, reliable and quick access to some apps might be essential for the day-to-day operations, while others contain mission-critical data but a period of downtime could be tolerated.
For example, if a shop’s EPOS system was down for a day it could directly affect turnover, whereas if the accounting software was unavailable for the same period it might merely be frustrating. However, the data which is stored in the accounting systems is mission critical, so if it was lost it would have a serious impact.
So, it’s helpful to think of aspects of an app as being mission critical, rather than the entire app itself. For Netflix, for example, it’s vital that the overall application has very high uptime, but it might not be as catastrophic if some films or features were temporarily unavailable. For an accounting app like Freshbooks or Kashflow, uptime is important but the integrity of the data is sacrosanct.
For any application you are considering hosting in the Cloud, you should consider whether any aspects of it are mission critical and, if so, what the impact of hosting it in a public or private cloud would be.
Internal apps vs. public facing apps
Internal apps and public-facing apps can both have mission critical aspects, but the impact can be quite different, and as ever will depend on the nature of the company’s core business.
It would be frustrating for a showroom salesperson if an internal sales system for a high-value item, like a car or dishwasher, was slow. However, it’s unlikely the customer would cancel the sale over a slight delay as they’ve already invested significant amounts of time and going elsewhere would take far longer.
On the other hand, someone booking a test-drive online through a car dealer’s website only has to press the browser’s back button to go to the competition’s website, so even a delay of a few seconds could have a significant effect on enquiries.
And if an EPOS for a sandwich shop was noticeably slow then it could lead to longer queues at peak times, which in turn could start to impact sales and reputation.
If staff are expected to use internal software, such as CRM or reservations systems, then any speed or reliability issues can lead to lower productivity, which can affect the company’s core mission, as well as staff job satisfaction, and lead to higher staffing requirements.
With internal applications, the speed and reliability of the company’s network and connection to the outside world is vital, according to Andrew Taylor of Sage One UK.
“If the app is a resource for internal staff, the existing connectivity available to the Internet may need to be upgraded to accommodate the increased usage,” says Taylor, “If 90% of your workforce accesses a cloud-based CRM solution continually throughout the day, higher latency and lower available bandwidth could be enough to reduce the effectiveness of the move. If the primary internet connection fails; does the backup connection have sufficient capacity to offer the same level of service?”
Rich applications might be sending AJAX requests over secure or insecure connections, which can be slowed down by aggressive firewall rules or by networks which have not been optimized for this kind of traffic. It’s therefore important to test the raw speed of protocols like HTTPS, and to measure the connection speeds and latency which users are experiencing on cloud-based apps.
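As a rough illustration of that kind of measurement, the Python sketch below times repeated requests and reports the median and worst case. The function names are made up for this example, and it measures whole-request time from one machine, not a full picture of what every user experiences.

```python
import statistics
import time
import urllib.request
from typing import Callable


def time_calls(action: Callable[[], object], samples: int = 10) -> list[float]:
    """Run `action` repeatedly and return each call's duration in milliseconds."""
    durations = []
    for _ in range(samples):
        start = time.perf_counter()
        action()
        durations.append((time.perf_counter() - start) * 1000)
    return durations


def summarize(durations_ms: list[float]) -> dict[str, float]:
    """Median shows the typical case; the maximum shows the worst request seen."""
    return {
        "median_ms": statistics.median(durations_ms),
        "max_ms": max(durations_ms),
    }


def fetch(url: str, timeout: float = 5.0) -> bytes:
    """Fetch a URL (HTTP or HTTPS) and read the whole response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

Running `summarize(time_calls(lambda: fetch("https://crm.example.com/health")))` from a desk inside the office network, where the URL is a placeholder for an endpoint your own app serves, gives a figure to compare before and after a firewall or connectivity change.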
Eggs in one basket vs. spreading the risk
Traditionally, corporate IT departments would aim to build an infrastructure out of reliable component parts, and to have enough redundancy so that if one part fails, be it a server or connection, there’s enough leftover capacity for the system to stay up while the technicians identify and resolve the issue.
In theory, a cloud provider should have far more redundancy and excess capacity than most corporates. However, no cloud is 100% reliable, so IT Directors should start thinking in terms of having redundancy of suppliers, rather than redundancy of servers, connections or VMs.
Regardless of whether you use a smaller supplier or a respected solution like IBM Smartcloud or AWS, any corporate who puts all their eggs in one basket needs to be aware they’re taking a risk, as they are almost certain to have downtime, and there’s a risk of data loss.
Half of IT decision makers want no more than 30 minutes of downtime a year, according to Dave LeClair, of Stratus Technologies.
“They’re actually not getting anything close to that in the Cloud,” says LeClair, “They’re actually getting two 9s availability today. What they’re asking for is four 9s or five 9s”.
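To put those “9s” in context, the arithmetic can be sketched in a few lines of Python. This assumes a 365-day year and treats availability as a simple percentage of total time; real SLAs often define it more narrowly.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes of downtime per year allowed by a given availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def availability_for_downtime(minutes_per_year: float) -> float:
    """Availability percentage implied by a yearly downtime budget."""
    return 100 * (1 - minutes_per_year / MINUTES_PER_YEAR)


for label, percent in [("two 9s", 99.0), ("four 9s", 99.99), ("five 9s", 99.999)]:
    minutes = downtime_minutes_per_year(percent)
    print(f"{label} ({percent}%): {minutes:.1f} minutes/year")
```

On these assumptions, two 9s allows roughly 87.6 hours of downtime a year, four 9s about 52.6 minutes and five 9s about 5.3 minutes, so a 30-minute budget falls between four and five 9s.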
So why would an enterprise IT director possibly risk putting mission-critical apps on the Cloud?
One answer? Lower upfront costs can mean you spread the risk across suppliers. Buying extra servers and installing additional cabling requires a significant upfront investment if you’re hosting on bare metal, but a cloud setup can be pay-as-you-go. Your app can be running on AWS, but if there’s an issue and Amazon’s servers are down you can have it preconfigured on Rackspace or IBM Smartcloud, ready to spin up at a moment’s notice.
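The decision behind that kind of failover can be sketched very simply. The Python below is a hypothetical illustration, not any supplier’s API: `Deployment` and `choose_deployment` are invented names, and the health check is whatever test you trust, such as an HTTP request to a status page.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Deployment:
    """One copy of the app, preconfigured on a particular supplier."""
    supplier: str
    is_healthy: Callable[[], bool]  # e.g. an HTTP check against a status URL


def choose_deployment(deployments: list[Deployment]) -> Optional[Deployment]:
    """Return the first healthy deployment, in priority order, or None."""
    for deployment in deployments:
        try:
            if deployment.is_healthy():
                return deployment
        except Exception:
            # A health check that raises is treated the same as a failed check.
            continue
    return None
```

In practice the hard part is not this loop but keeping the standby copy’s code and data current enough to be worth switching to.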
Likewise, you can have data periodically backed up across multiple suppliers, so that if there’s a catastrophic issue with your main cloud supplier then your data exists in multiple places. Cloud abstraction layers like RightScale and Scalr make the task of distributing infrastructure across multiple suppliers easier, while there are also options at the database level of the stack such as Riak, a NoSQL database technology that can automatically distribute data across multiple nodes on multiple suppliers.
Make sure you understand whether your cloud hosting provider is built on top of another company’s stack, though, as many entry-level cloud hosts build their offering on top of another company’s infrastructure. For example, there would be no point in having Heroku as a fallback for EngineYard if Amazon was having difficulties, as both companies’ systems run on top of AWS.
And it won’t come as any surprise to a good IT director that data should always be backed up locally, even if it’s held in multiple places in the Cloud, and if the data’s sensitive then you need to consider encryption and other security measures.
Response times are as important as downtime, according to tech journalist and IT training consultant Les Pounder.
“How quickly your supplier reacts to your downtime is critical for your business,” says Pounder, “Severe incidents require your supplier to work directly with your team to identify, test and resolve your incident.”
So it’s important to make sure your SLA covers this hands-on level of support, potentially with penalties for failing to meet targets.
“The same questions that have been asked of any traditional vendor will always be relevant but having a lawyer assess a contract is standard practice for both models,” says Sage UK’s Andrew Taylor, “The problem will often lie when a clause in a contract becomes an issue; when you’re buying a ‘prescribed’ service there may be no flexibility from the vendor to change the contract for a single customer. Traditional enterprise suppliers will be well versed in contract negotiation but that may be an area where a new ‘cloud-only’ vendor lacks experience.”
“Data security is vitally important too,” adds Pounder, “It forms the basis of your trust with customers, they should know that their data is secure, and their trust helps your business to grow, via image, branding and PR.”
Another area to consider is whether to connect from your own network to your cloud servers over the public internet or using an Ethernet connection. If you are connecting over the Internet, there’s a risk of downtime and of intrusion, a likelihood of inconsistent performance, and nobody you can call to ask why.
While direct connections are not yet universally available, Amazon have introduced both direct fibre connections and hardware-to-software VPNs after customers demanded the option, and this approach is likely to become more common over the next few years.
By installing an Ethernet circuit between your site and your cloud, you’re removing a lot of the instability. You get higher performance, lower latency, higher security and an SLA.
Of course, any good IT manager knows you should aim to avoid single points of failure, so for truly mission-critical cloud services, it’s worth considering a backup circuit. Telcos and cable companies in the US (and some other countries) don’t share transport infrastructure, which means that if you install a primary circuit from a cable company and a backup from a telco, then you’re likely to have a rock-solid setup. And of course, there’s always the public internet to fall back on in the unlikely event both providers are experiencing issues at the same time.
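The benefit of a second circuit can be estimated with some simple arithmetic. The Python sketch below assumes the circuits fail independently of each other, which is the optimistic case that separate transport infrastructure is meant to approximate. The availability figures used with it are illustrative, not measurements.

```python
def combined_availability(*availabilities: float) -> float:
    """Availability of redundant paths, as a fraction, if failures are independent."""
    all_down = 1.0
    for availability in availabilities:
        # The connection is only lost when every path is down at the same time.
        all_down *= 1.0 - availability
    return 1.0 - all_down
```

For example, two circuits that are each up 99% of the time would, under that independence assumption, give a combined availability of about 99.99%. Any shared dependency, such as a common duct into the building, would make the real figure lower.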
Legacy apps vs. greenfield projects
So far, so good for mission-critical cloud applications. If the network speed is optimized, redundancy is considered, and you’ve got a good backup strategy, it’s perfectly possible to host a mission-critical application in either a public or private cloud.
But there’s one case where regardless of how good the underlying cloud infrastructure is, moving to the Cloud might not be an option: legacy applications.
“The Cloud is more than ready for mission-critical applications,” says tech journalist Les Pounder, “But the majority of the applications are not ready for the Cloud.
“The Cloud offers organisations the flexibility of scaling and elasticity, all depending on the needs at the time, but it is most likely that legacy applications written in house by organisations are not designed to take advantage of this benefit.”
Greenfield projects can be written with the Cloud in mind, which means they can mitigate some of the disadvantages of enterprise cloud computing and take advantage of the positives. Modern apps are more likely to use APIs and Service Oriented Architecture, which lends itself to being hosted in the Cloud.
However, moving an application which was potentially written years or even decades ago and has run on the same hardware for years might be more problematic. Depending on the quality of the software (and frankly the developers) you might find that the application is tightly-coupled to the exact setup it was developed on. Things to look out for include:
- IP addresses and other configuration options that are hard-coded into the application,
- Servers which have been manually configured without being documented and tested, rather than automated with a configuration management tool like Chef or Puppet,
- Data which is pushed onto the server rather than being pulled via an abstracted API,
- Systems running outdated libraries or application servers, as upgrading would break the application,
- Poor deployment workflows, including manual steps and an inability to roll back to previous setups.
In short, if the developers have ever said that the application “works on our machine” when deploying, then you might have issues deploying to the Cloud.
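Taking hard-coded addresses as one example from that list, a common remedy is to read them from the environment at start-up, so the same code runs unchanged on an internal server or a cloud host. The Python below is a minimal sketch: the variable names `APP_DB_HOST` and `APP_DB_PORT` and the default port are hypothetical choices for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_host: str
    database_port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read connection details from the environment instead of the source code."""
    try:
        host = env["APP_DB_HOST"]  # hypothetical variable name
    except KeyError:
        # Fail loudly at start-up rather than falling back to a baked-in address.
        raise RuntimeError("APP_DB_HOST must be set for this environment") from None
    port = int(env.get("APP_DB_PORT", "5432"))  # hypothetical name and default
    return Settings(database_host=host, database_port=port)
```

Passing the environment in as a parameter also makes the behaviour easy to test without touching the real machine’s configuration.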
Likewise, security is a big consideration when deploying any application to the Cloud and this is especially true for legacy apps, which may have been written with an expectation that a strong corporate network is going to handle security so they don’t need to worry as much about it. Such “Skittles security” apps (hard on the outside, soft on the inside) might need hardening up before a cloud deployment, or may not be suitable at all.
“The realization is that your cloud applications are hosted outside of your corporate network, be this via a public or private cloud,” says Pounder, “Are your applications designed to work in this way, if not can they be altered to do this? What about firewalls, VPN and bespoke network requirements? These areas need to be investigated before making the move to the Cloud, and will require work from your teams to develop and test solutions.”
Of course, not all legacy software will be difficult or dangerous to deploy on the Cloud. If the system has been written in a modular way using practices like TDD, SOA, continuous integration and using automated configuration management, as well as a good understanding of security, then there’s every chance moving from an internal server to the Cloud will be possible and economic.
The only real way to find out whether a legacy application is suitable for deployment to the Cloud is to trial a migration. If you find yourself hitting lots of hurdles early on it may rule out a cloud deployment.
You’re in good company
There’s definitely a lot to consider if you’re thinking of moving a mission-critical app to the Cloud, but luckily you’re in good company. 1 in 3 mission critical apps are already on the Cloud, and this is set to rise to half by 2015.
“The Cloud marketplace is expanding rapidly, most vendors now have a cloud offering of some description,” says Sage UK’s Andrew Taylor, “The Cloud is already hosting mission critical applications for thousands of companies and their customers.”
“The IT industry has always been a rapid mover, I think 2014 will see a further increase in confidence in cloud and even greater uptake. Enterprises usually work on a 3 to 5 year cycle, so as those cycles continue to come to an end we’ll see more and more hybrid implementations where customers choose to test the water.”
In the end, the question isn’t whether the Cloud is ready for mission critical apps (it is), it’s whether it’s ready for your mission critical apps. And that will depend on your IT department, developers and the quality of your legacy software.
Don’t miss part one: 5 reasons enterprises are frightened of the Cloud