Last month, a Friday night storm caused an Amazon Web Services (AWS) outage, knocking out Netflix, Pinterest and Instagram services for many users in the eastern United States. For Netflix in particular, Friday evening is a peak demand time, so customers could not have been too happy about the outage.
Meanwhile, Roundup is one of many genome mapping applications that predict the evolutionary relationships among genes, organisms and biological functions. The algorithm Roundup uses is compute-intensive, so the Harvard researchers behind it had to use a combination of Simple Storage Service (S3), Elastic Compute Cloud (EC2) and Elastic MapReduce (EMR), all provided by AWS. To optimize the application, the Roundup team reduced disk I/O, made use of in-memory caching and calculated the optimal number of instances it needed. In doing so, Roundup reduced its bill by 40 percent and made sure it could accommodate future computational growth, all without affecting performance.
An application may run properly without optimizations such as those the Roundup team made, but performing them can improve related aspects of the application, including availability, resistance to disaster and, most importantly, the cost of using the public cloud. Here are five specific optimizations you can apply to your application when moving it to the public cloud.
1. Refactor Code to Address Cloud Service Providers' Billing Patterns.
AWS charges not only for the compute, storage and network bandwidth you use; it also charges every time you access your storage for a read or a write. This is unlike running your own servers, where, once you have spent the money on hardware, you don't incur additional costs every time you do a read or write operation. As a result, you may want to gather up reads and writes in your application and bunch them into single operations wherever possible.
The overall effect of this cloud optimization technique depends on the pricing model of the public cloud service provider (CSP) you sign up with. Whichever CSP you choose, however, refactoring can also be an opportunity to improve application performance.
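As a rough illustration of bunching writes, the sketch below buffers records in memory and sends them to storage in batches, so that many small writes become a handful of billed requests. The `BatchingWriter` class and the `put_object` callable are hypothetical stand-ins, not part of any AWS SDK; in a real application `put_object` would wrap whatever storage call your CSP bills per request.

```python
class BatchingWriter:
    """Buffers records and sends them to storage as one request per batch."""

    def __init__(self, put_object, max_records=100):
        # put_object is a hypothetical callable; each call stands for
        # one billed storage request.
        self.put_object = put_object
        self.max_records = max_records
        self.buffer = []
        self.requests = 0

    def write(self, record):
        """Queue a record and flush automatically when the batch is full."""
        self.buffer.append(record)
        if len(self.buffer) >= self.max_records:
            self.flush()

    def flush(self):
        """Send all buffered records as a single storage request."""
        if not self.buffer:
            return
        self.put_object("\n".join(self.buffer))
        self.requests += 1
        self.buffer = []


# Stand-in for real storage: each appended item represents one request.
stored = []
writer = BatchingWriter(stored.append, max_records=100)
for i in range(250):
    writer.write(f"event-{i}")
writer.flush()  # push the final partial batch

print(f"{writer.requests} storage requests instead of 250")
```

The trade-off is that buffered records are lost if the process dies before a flush, so the batch size is something to tune against how much unsaved data your application can tolerate.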
2. Optimize Chosen Default Cloud Instances.
When setting up instances with EC2, you can choose among various levels of compute, memory and storage. In addition, EC2 offers Spot Instances, which draw on excess capacity and are offered at lower prices than regular instances, with the trade-off that AWS can reclaim them when it needs the capacity back.
It pays to spend some time experimenting with your application to determine the optimal level of compute, memory and storage that you need. This will help you avoid overspending on capacity or configuration, and it will help you figure out whether you should consider Spot Instances (or the equivalent offering from another CSP).
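Once you have measured your application's peak needs, picking an instance size can be treated as a simple search. The sketch below chooses the cheapest entry from a catalog that still covers the measured peak plus some headroom. The catalog names, sizes and hourly prices are invented for illustration and do not correspond to any real EC2 instance types or prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gb: float
    hourly_usd: float


# Hypothetical catalog: names, sizes and prices are made up for this sketch.
CATALOG = [
    InstanceType("small", 1, 2.0, 0.05),
    InstanceType("medium", 2, 4.0, 0.10),
    InstanceType("large", 4, 16.0, 0.20),
    InstanceType("xlarge", 8, 32.0, 0.40),
]


def cheapest_fit(catalog, peak_vcpus, peak_memory_gb, headroom=0.2):
    """Return the cheapest instance type covering the measured peak plus headroom.

    Returns None when nothing in the catalog is large enough.
    """
    need_cpu = peak_vcpus * (1 + headroom)
    need_mem = peak_memory_gb * (1 + headroom)
    candidates = [
        t for t in catalog
        if t.vcpus >= need_cpu and t.memory_gb >= need_mem
    ]
    return min(candidates, key=lambda t: t.hourly_usd) if candidates else None


# Peak usage observed while experimenting with the application (assumed values).
choice = cheapest_fit(CATALOG, peak_vcpus=1.5, peak_memory_gb=3.0)
print(choice.name if choice else "no single instance type fits")
```

The `headroom` parameter is the knob to experiment with: a smaller value saves money but leaves less margin for unexpected load.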
3. Balance Service Levels Needed with Default Cloud Instances.
Every application has its own service level profile, determined by its general purpose and function. Your customer-facing e-commerce site has a different service level than, say, your internal employee portal. Evaluating the costs of public cloud instances against the service levels needed for various applications may help you optimize your public cloud costs.
Think back to the June 29 Netflix outage. Given the storage- and bandwidth-intensive nature of the video streaming service, pressing into action another of Amazon's data centers elsewhere in the country may not have been feasible. However, less intensive and more mission-critical services can be set up to be served out of alternative data centers if necessary, making them far more resilient to such outages.
4. Fine-tune Auto Scaling Rules.
Applications that automatically scale the number of server instances, both up and down, offer a great opportunity for optimization. For example, you may have one auto scaling rule that spawns a new instance once CPU utilization reaches 80 percent on all current instances and another that removes an instance once average CPU utilization falls to 40 percent.
How do you know that 80 percent and 40 percent are the right numbers? Why not 85 percent and 35 percent? With the higher scale-up threshold, you would spawn fewer instances and lower your costs, although the lower scale-down threshold would also keep existing instances running longer.
In addition, applications have varying compute, storage and bandwidth needs. Your rules, then, may need to be based on a complex combination of these three factors and not just on CPU utilization. You may want to experiment with combinations that look logical for your public cloud applications and the service levels they need. You can then optimize these percentages over a period of time.
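One way to compare candidate thresholds before changing live rules is to replay a recorded load trace against each pair of percentages. The sketch below is a deliberately simplified model, not how any CSP's auto scaling service actually works: it considers only CPU, adds or removes one instance per hour, and uses an invented load trace in which demand is measured in units of one fully busy instance.

```python
def simulate(load, scale_out, scale_in, start=2, minimum=1):
    """Replay an hourly load trace and return the total instance-hours used.

    load      -- demand per hour, in units of one fully busy instance
    scale_out -- average utilization at or above which one instance is added
    scale_in  -- average utilization at or below which one instance is removed
    """
    instances = start
    instance_hours = 0
    for demand in load:
        utilization = demand / instances
        if utilization >= scale_out:
            instances += 1
        elif utilization <= scale_in and instances > minimum:
            instances -= 1
        instance_hours += instances
    return instance_hours


# Invented hourly trace: a ramp up, a plateau, then a decline.
LOAD = [1.0, 1.65, 1.65, 1.65, 1.0, 0.6, 0.6]

for scale_out, scale_in in [(0.80, 0.40), (0.85, 0.35)]:
    hours = simulate(LOAD, scale_out, scale_in)
    print(f"{scale_out:.0%}/{scale_in:.0%}: {hours} instance-hours")
```

On this particular trace the 85/35 pair uses fewer instance-hours, but it also leaves the existing instances running hotter during the plateau. A different trace can reverse the result, which is why it is worth replaying your own usage data rather than relying on a rule of thumb.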
5. Database Row Optimization.
Applications such as Netflix have a localized nature, meaning that, most of the time, customers access only the data that pertains to them. Netflix uses AWS Regions and Availability Zones to host servers that serve customers who live near those data centers.
This is possible thanks to database sharding, which lets you partition the rows in your database and store the different partitions in databases that reside in various data centers. This also applies to applications such as credit card processing, since sharding can be applied to localized patterns of use such as looking up one card owner's transactions or transactions with one merchant.
You don't need to store all database rows in all database instances. If you can partition your database rows and store them in database shards in different instances, you can take advantage of the locality of usage patterns. This can reduce the number of server instances you need and, hence, the cost of your public cloud service.
When you move your application to the public cloud, it may work very well as it is, without any changes. However, if you pay attention to how your CSP charges you and put it in the context of your application's pattern of compute, memory, storage and network bandwidth usage, you can often reduce your public cloud charges. Optimizing the application itself with some refactoring may improve its performance and lengthen its life, while experimenting with and fine-tuning your own default instances and auto scaling rules may help you further lower CSP costs.
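The routing idea behind sharding can be sketched in a few lines. In the example below, each customer's rows live only in the shard for that customer's home region, so a lookup touches one shard rather than every database instance. The `ShardedStore` class, the region names and the in-memory dictionaries are all hypothetical; a real deployment would replace each dictionary with a database instance in the corresponding data center.

```python
from collections import defaultdict


class ShardedStore:
    """Keeps each customer's rows in the shard for that customer's region."""

    def __init__(self, region_of):
        # region_of is a hypothetical callable mapping customer_id -> region.
        self.region_of = region_of
        # One dictionary per region stands in for one database instance.
        self.shards = defaultdict(dict)

    def insert(self, customer_id, row):
        shard = self.shards[self.region_of(customer_id)]
        shard.setdefault(customer_id, []).append(row)

    def rows_for(self, customer_id):
        """Look up rows by consulting only the customer's home shard."""
        shard = self.shards[self.region_of(customer_id)]
        return shard.get(customer_id, [])


# Assumed customer-to-region assignments for the example.
HOME_REGION = {"alice": "us-east", "bob": "us-west"}

store = ShardedStore(HOME_REGION.__getitem__)
store.insert("alice", {"title": "Movie A"})
store.insert("alice", {"title": "Movie B"})
store.insert("bob", {"title": "Movie C"})

print(store.rows_for("alice"))
print(sorted(store.shards))
```

Routing by region suits workloads like the ones described above, where a customer's requests mostly originate near one data center. Queries that span many customers would have to visit several shards, so the partitioning key should follow your dominant access pattern.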
Nari Kannan is CEO of appsparq, a Louisville, Kentucky-based cloud and mobile applications consulting company. He has more than 20 years of IT experience, starting as a senior software engineer at Digital and subsequently serving as vice president of engineering or CTO of six Silicon Valley startups. Connect with Kannan via email.