Commitment discounts are complex and require large investments of time and money. Automation can unlock those savings with minimal effort or risk.
For as long as the public cloud has existed, users have been trying to figure out new ways to optimize their spend and prevent their cloud costs from spiraling out of control. Optimizing areas like your storage or database usage can certainly help, but the greatest opportunity to lower your public cloud costs lies in optimizing your compute spend. Compute typically accounts for 50-80% of an overall cloud bill, which makes it the most valuable target when you set out to reduce your public cloud costs.
The good news is that public cloud providers like Amazon Web Services (AWS) and Google Cloud Platform (GCP) offer significant compute discounts for users who are willing and able to commit to a set amount of usage over a 1- or 3-year term (naturally, the longer the commitment, the greater the discount you can receive).
The bad news is that taking advantage of these commitment discounts involves two primary challenges:
- Accurately forecasting your compute needs for the next one or three years
- Managing your usage for the duration of the commitment to ensure that you are hitting your targets
These hurdles can be particularly difficult for digital-native businesses that are still young and scaling up their cloud-based systems and applications. They often lack the resources or the runway to take on the risk and management burden of a commitment portfolio, and they have little idea what their needs will be in three months, let alone three years. The same challenges can create headaches for larger companies whose infrastructure and commitment needs are distributed across multiple teams, which makes it hard to manage the cost of all of them.
Forecasting your compute needs
Depending on how your organization operates, a single team might be responsible for public cloud infrastructure forecasting, or the responsibility may be shared across several different teams based on their own specific projects and needs.
However it’s structured, committing to a set level of compute usage over a long period of time carries significant risk: If you overprovision your commitment, you risk wasting money on unused compute instances; if you underprovision, you risk paying premium prices for on-demand instances.
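To make that trade-off concrete, here is a minimal Python sketch of the arithmetic. It assumes a simplified billing model in which committed units are paid for at a discounted rate whether or not they are used, and any usage above the commitment is billed at the on-demand rate. The function name, the 25% discount, and the unit price are illustrative assumptions, not actual AWS or GCP rates.

```python
def hourly_cost(usage: float, committed: float,
                on_demand_rate: float = 1.0, discount: float = 0.25) -> float:
    """Cost of one hour under a simplified commitment model (all figures hypothetical)."""
    # Committed units are billed at the discounted rate even when they sit idle.
    committed_cost = committed * on_demand_rate * (1 - discount)
    # Usage beyond the commitment falls back to the full on-demand rate.
    overflow = max(usage - committed, 0.0)
    return committed_cost + overflow * on_demand_rate


usage = 100  # units of compute actually consumed in the hour
for committed in (0, 50, 100, 150):
    print(committed, hourly_cost(usage, committed))
```

With these assumed numbers, a commitment that matches usage costs 75 per hour versus 100 on demand, an undersized commitment of 50 units costs 87.5, and an oversized commitment of 150 units costs 112.5, which is more than running everything on demand.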
You can also get larger discounts from the cloud providers by being more precise with your compute commitments, such as identifying specific machine types or regions that you are willing to commit to. However, it’s also very important to keep flexibility in mind when purchasing commitments. Public cloud environments innovate at a rapid pace, so as your software or business model changes, you may need to reconfigure your environment. If that happens and you haven’t purchased a commitment that offers that type of flexibility, you could be forced to eat that cost.
Before we look at the types of commitments you can purchase and the varying levels of flexibility they offer, let’s first consider what should be top of mind when forecasting your compute needs:
- Internal scope
Who will be leveraging this commitment? Do you have multiple teams within your DevOps organization that will be sharing it, or is it better to purchase commitments for each individual team?
- Commitment length
How far in the future are you willing to commit? If you are confident in your usage needs and have consistent specifications, you can consider signing a three-year commitment for a baseline of whatever you forecast (e.g. 50% of your projection) to maximize your savings. You can then fill the remainder with a combination of 1-year commitments and/or on-demand instances.
- Services
Do you need just simple infrastructure-as-a-service compute (i.e. EC2 or GCE)? Will you be using containers? Serverless? Kubernetes? If so, can these all be covered by the same commitment, or should they be spread out across separate commitments? Keep in mind that spreading them out can lead to increased flexibility but could also add to the management burden.
- Machine types
What machine types and sizes will your teams require to build your digital offering? And is there a chance that these needs will change over the course of the next one or three years?
- Regions
Determining the regions where you need machines spun up is probably a fairly simple exercise in the immediate term, but this can become a burden as your business or user base expands into new markets. You need to determine whether or not your regions might change. If they do, will your commitment provide that flexibility, or will you need to purchase additional ones to account for that growth?
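The layering approach described under "Commitment length" above (a 3-year baseline, topped up with 1-year commitments and on-demand instances) can be sketched with some simple arithmetic. The split of 50/30/20 units and the discount rates below are hypothetical placeholders chosen for easy math; real rates depend on the provider, term, and commitment type.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    units: float     # compute units assigned to this pricing layer
    discount: float  # fraction off the on-demand rate (hypothetical)


def blended_cost(layers: list[Layer], on_demand_rate: float = 1.0) -> float:
    """Total hourly cost when forecast usage is split across pricing layers."""
    return sum(layer.units * on_demand_rate * (1 - layer.discount)
               for layer in layers)


forecast = 100.0  # forecast hourly usage, in compute units
layers = [
    Layer("3-year commitment", 50, discount=0.50),  # baseline: 50% of forecast
    Layer("1-year commitment", 30, discount=0.25),
    Layer("on-demand", 20, discount=0.0),           # flexible remainder
]
total = blended_cost(layers)
print(f"hourly cost: {total}, all on-demand: {forecast}")
```

Under these assumptions the blended cost is 67.5 per hour against 100 on demand, an effective discount of roughly 32.5%, while 20% of the forecast remains free of any commitment if usage falls short.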
Once you have answered these questions, you need to determine which type of commitment(s) to buy. Both AWS and GCP have multiple types of compute commitment plans that offer varying levels of flexibility. They can generally be broken down into two groups:
- Resource-based commitments require a certain amount of usage based on specifications that you determine in advance. On AWS, these take the form of Standard or Convertible Reserved Instances (RIs). On GCP, they’re called Committed Use Discounts (CUDs).
- Spend-based commitments allow you to commit to a certain level of spend regardless of resource specifications. This added flexibility means smaller discounts. These are classified as Savings Plans (SPs) on AWS, and FlexCUDs on GCP.
As the chart above shows, your options vary with the public cloud(s) that you’re building on and the level of flexibility that each commitment type provides, which leaves you with a bevy of choices to weigh when deciding which is most suitable for your needs.
For AWS users, adding to the complexity is the possibility of reselling Standard RIs on AWS’s Reserved Instance Marketplace to recoup the cost of any unused ones that you might have as a result of overprovisioning. However, there’s no guarantee that you’ll be able to find a buyer for those specific RIs, and even if you do, the process of selling them adds another layer of complexity and time investment for whoever has that responsibility on your team.
Managing and tracking your commitments
Now that you’ve forecasted your compute needs and purchased some commitments based on those projections, your task is far from over. The management of your commitment portfolio is just as important as your pre-purchase forecasting, if not more so. That’s because no matter how well you forecast and provision, your environment will almost always require some changes, or new needs will crop up that require the purchasing of additional commitments for your portfolio.
With that in mind, let’s dive into some of the things that you need to be aware of as you track your commitments throughout their lifecycles:
- Balance between commitments and on-demand workloads
We touched on this briefly in the forecasting section above, but as your compute needs grow, you need to determine how much can be covered by 1- or 3-year commitments, and how much can be left to on-demand purchases.
- Regional expansion
Is your business or service offering growing? Are you trying to reach users in new markets? If so, unless you’re already covered by a spend-based commitment (i.e. a Compute SP on AWS or FlexCUD on GCP), then you’ll likely need to purchase new commitments to cover those regions.
- Ongoing tracking and monitoring of usage
It’s important to know whether or not you’re on track to hit your usage targets throughout the lifecycle of a commitment. This can be difficult to determine if your usage is inconsistent throughout the term, either due to fluctuations in your user base or seasonality that’s inherent to your business model. Either way, you’ll want to know if you’re going to exceed your commitment and pay on-demand rates for the excess (or buy new commitments to cover it), or whether you’re going to be left with unused commitment when the term is complete.
- Renewals
As you track your commitments, you’ll need to determine what to do with that commitment when it runs out, and whether you want to buy a new one, reconfigure it, or let it expire completely. This naturally becomes a bigger challenge as your commitment portfolio grows and you have staggered renewal and expiration dates throughout the year.
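Two numbers are commonly used for this kind of tracking: utilization (how much of the commitment you actually used) and coverage (how much of your usage the commitment absorbed). The sketch below computes both for a flat hourly commitment. The function name and the sample usage figures are hypothetical, and the model ignores provider-specific details such as how discounts are applied across accounts or instance families.

```python
def commitment_metrics(hourly_usage, committed_per_hour):
    """Return (utilization, coverage) for a flat hourly commitment."""
    # Usage in each hour that the commitment actually absorbs.
    covered = sum(min(u, committed_per_hour) for u in hourly_usage)
    total_committed = committed_per_hour * len(hourly_usage)
    total_usage = sum(hourly_usage)
    # Guard against empty input or a zero commitment.
    utilization = covered / total_committed if total_committed else 0.0
    coverage = covered / total_usage if total_usage else 0.0
    return utilization, coverage


usage = [80, 120, 100, 60]  # hypothetical hourly usage, in compute units
utilization, coverage = commitment_metrics(usage, committed_per_hour=100)
print(f"utilization={utilization:.0%} coverage={coverage:.0%}")
# prints: utilization=85% coverage=94%
```

Low utilization suggests that you have overcommitted and are paying for idle capacity, while low coverage suggests that a large share of your usage is still billed at on-demand rates and might justify an additional commitment.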
Where’s the easy button?
If your head is spinning at this point with all of these factors to consider, you’re certainly not alone. Managing a commitment portfolio can be such a burden and comes with such pronounced risks that many companies don’t even bother with them, choosing instead to rely exclusively on on-demand workloads despite the elevated cost involved.
However, you can automate your compute commitments and remove both the risk and management overhead in the process. DoiT Flexsave™ was built for this exact purpose. Using machine learning, Flexsave analyzes your ongoing compute spend to identify any AWS workloads that are not already covered by existing discounts (i.e. SPs, RIs, Spot, or Enterprise Discount Programs) and then automatically applies the equivalent of a 1-year Savings Plan to those on-demand workloads.
This method has generated millions of dollars in savings across hundreds of Flexsave customers over the past few years, including the ecommerce platform Phenix, which has saved over 25% on its on-demand compute workloads since enabling Flexsave. “Without Flexsave, we probably wouldn’t be able to use commitment-based discounts at all; now we can get the savings benefits with almost no effort whatsoever,” says DevOps leader Kyâne Pichou. “We just turned it on and were able to forget about it, which allows us to focus on building out the other functionalities within the Phenix platform.”
These 1-year discount rates can dramatically lower your on-demand compute spend, and because Flexsave works alongside your existing commitments, you don’t have to worry about losing out on discounts that you’re already receiving. As such, many DoiT customers have continued to purchase 3-year commitments for part of their compute needs to maximize their savings, and then let Flexsave cover the rest with the 1-year equivalent.
As with DoiT’s other products and services, Flexsave also costs nothing to use and can be set up quickly and easily with no code changes or downtime in your environment.
To learn more about Flexsave or other DoiT-recommended cloud cost optimization strategies, get in touch with an expert today.