11 Ways to Optimize Cloud Spend
(Opinions and guidance are wholly my own.)
Both Microsoft and AWS publish pillars of “good” cloud architecture in their respective Well-Architected Frameworks. These frameworks include key tenets like reliability, security, and operational excellence. However, after working with dozens of companies across Azure and AWS, I can confidently say that one pillar stands above the rest in the minds of users—cost optimization. When asked about organizational priorities, decision makers might mention their RTO/RPO or compliance requirements they must meet, but invariably they give cost as their top priority.
In the last year, this trend has continued to accelerate as cloud users look to rein in spending. In this post, I'll briefly explore some ways to reduce spend without sacrificing performance or resilience. For some of the more complex recommendations, I may expand them into standalone posts giving detailed guidance.
If your company is interested in cloud cost optimization, I would love to hear from you! Please send questions, feedback, or comments to hi@willfox.dev
Move Unused Data to Cool or Archive Storage
Companies are collecting data at an unprecedented rate, but often only a fraction of this data is used for analysis or business intelligence. Vast quantities of data for logging, compliance, backup, and more are collected but rarely accessed.
Consider moving this data to the Cool or Archive storage tiers which offer significantly lower costs per GB. Note that these tiers have higher data retrieval costs, so be sure that your data is infrequently accessed before moving it. On the other hand, if you have data in a Cool tier which is frequently accessed, it may be more cost effective to move to a Warm or Hot tier where you pay more for storage, but less for access and retrieval. By optimizing your data tiering, you can significantly reduce cloud storage costs without deleting valuable data.
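To make the storage-versus-retrieval trade-off concrete, here is a minimal sketch that picks the cheapest tier for a given access pattern. The `Tier` class, the function names, and every price are placeholders I made up for illustration, not real list prices. The model also ignores things like minimum retention periods and retrieval latency, which matter for archive tiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    storage_per_gb: float    # $ per GB stored per month (illustrative)
    retrieval_per_gb: float  # $ per GB read back (illustrative)


# Placeholder prices, not real list prices for any provider.
TIERS = [
    Tier("hot", 0.020, 0.00),
    Tier("cool", 0.010, 0.01),
    Tier("archive", 0.002, 0.05),
]


def monthly_cost(tier: Tier, stored_gb: float, read_gb: float) -> float:
    """Storage charge plus retrieval charge for one month."""
    return tier.storage_per_gb * stored_gb + tier.retrieval_per_gb * read_gb


def cheapest_tier(stored_gb: float, read_gb: float, tiers=TIERS) -> Tier:
    """Return the tier with the lowest total monthly cost."""
    return min(tiers, key=lambda t: monthly_cost(t, stored_gb, read_gb))


if __name__ == "__main__":
    # Same 1,000 GB stored, three different monthly read volumes.
    for read_gb in (0, 500, 2000):
        best = cheapest_tier(stored_gb=1000, read_gb=read_gb)
        print(f"read {read_gb} GB/month -> {best.name}")
```

With these made-up numbers, data that is never read lands in archive, moderately read data lands in cool, and heavily read data belongs in hot. Plug in your provider's actual prices and your measured read volumes before acting on the result.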
Right-Size Virtual Machines and Databases
It can be challenging to predict resource requirements for an application or database, and it is often safer to over-provision than under-provision. For this reason, cloud environments often contain more compute than needed to keep an application up and running.
Consider “right-sizing” these services by looking at historical usage data to determine the necessary resources for the environment. Companies can often reduce compute and licensing costs by simply scaling down to smaller instance sizes without sacrificing performance or availability.
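One simple way to turn historical usage into a size recommendation is to look at a high percentile of CPU utilization and add some headroom. The sketch below assumes you have already exported CPU samples (as percentages) from your monitoring tool. The function names, the 95th percentile, the 30% headroom, and the list of available sizes are all my own assumptions, so tune them to your risk tolerance. A real review should also look at memory, disk, and network.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a list of numbers (pct is 0-100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend_vcpus(cpu_percent_samples, current_vcpus,
                    sizes=(2, 4, 8, 16, 32), headroom=1.3):
    """Smallest size that covers p95 CPU demand plus headroom.

    cpu_percent_samples: CPU utilization readings (0-100) on the current size.
    sizes: hypothetical vCPU counts available in the instance family.
    """
    p95 = percentile(cpu_percent_samples, 95)
    needed_vcpus = current_vcpus * p95 / 100 * headroom
    for size in sorted(sizes):
        if size >= needed_vcpus:
            return size
    return max(sizes)  # demand exceeds the largest listed size


if __name__ == "__main__":
    # A 16-vCPU machine that sits near 10% CPU almost all of the time.
    samples = [10] * 95 + [40] * 5
    print(recommend_vcpus(samples, current_vcpus=16))
```

The same function can also flag under-provisioned machines: a box that runs near 90% CPU will come back with a larger size than it has today.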
Auto-Scale Resources
Many companies want to right-size their environments but are worried about handling spikes in traffic or cyclical usage patterns.
In these cases, consider setting up auto-scaling or serverless resources which can adapt to match the load on your applications. This can be done for VMs with “auto-scaling groups.” Some resources like Amazon’s Aurora and Azure’s SQL Database also offer “serverless” auto-scaling compute. Note that the serverless options can be tempting, but be aware that they come with a cost premium compared to standard, pre-provisioned compute.
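A rough way to reason about the serverless premium is a break-even utilization. The sketch below uses a deliberately simplified model that is my own assumption: provisioned compute bills for peak capacity every hour, while serverless bills only for the fraction of that capacity actually used, at a higher unit rate. Real pricing has minimum capacities, scaling delays, and storage charges that this ignores.

```python
def break_even_utilization(provisioned_rate, serverless_rate):
    """Average utilization (0-1) at which both options cost the same.

    provisioned_rate: $/hour for capacity sized to peak load.
    serverless_rate:  $/hour for the same capacity when fully used.
    Both rates are placeholders you would fill in from a price sheet.
    """
    return provisioned_rate / serverless_rate


def cheaper_option(avg_utilization, provisioned_rate, serverless_rate):
    """Compare average hourly cost under the simplified model."""
    provisioned_cost = provisioned_rate
    serverless_cost = serverless_rate * avg_utilization
    return "serverless" if serverless_cost < provisioned_cost else "provisioned"


if __name__ == "__main__":
    # Hypothetical: serverless costs twice as much per unit of capacity.
    print(break_even_utilization(1.0, 2.0))
    print(cheaper_option(0.3, 1.0, 2.0))
    print(cheaper_option(0.8, 1.0, 2.0))
```

Under this model, a 2x premium means serverless only pays off for workloads that average below half of their peak capacity. Spiky or mostly idle workloads tend to clear that bar, while steady ones usually do not.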
Turn Off or Downsize Environments During Off-Hours
Few companies require all of their resources to be running all of the time. More commonly, they have employees and/or customers concentrated in one part of the world, or certain applications that are used monthly or quarterly.
Consider downsizing or deleting these environments when they aren’t in use. As an example, suppose your company has employees based only in North America. You could save on cost by downsizing development and test environments outside of US working hours, as few people will be interacting with them at those times. Or suppose you have an application that handles customer invoicing at the end of each month. These resources should not be at full capacity for the entirety of the month. They should be scaled down until they are needed.
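The off-hours idea boils down to a schedule check plus some arithmetic. Here is a minimal sketch, assuming a weekday window of 07:00 to 19:00 in the team's local time. The window, the function names, and the choice to treat the environment as fully off outside it are my assumptions. In practice you would wire a check like this into whatever scheduler starts and stops your environments.

```python
from datetime import datetime

HOURS_PER_WEEK = 7 * 24


def should_run(now: datetime, start_hour: int = 7, end_hour: int = 19) -> bool:
    """True if a dev/test environment should be up at this local time."""
    is_weekday = now.weekday() < 5  # Monday=0 ... Friday=4
    return is_weekday and start_hour <= now.hour < end_hour


def schedule_savings_fraction(days_on: int = 5, hours_per_day: int = 12) -> float:
    """Share of always-on compute hours avoided by the schedule."""
    return 1 - (days_on * hours_per_day) / HOURS_PER_WEEK


if __name__ == "__main__":
    print(should_run(datetime(2024, 1, 8, 10)))  # a Monday morning
    print(should_run(datetime(2024, 1, 6, 10)))  # a Saturday morning
    print(f"{schedule_savings_fraction():.0%} of compute hours avoided")
```

A five-day, twelve-hour schedule runs 60 of the 168 hours in a week, so it avoids roughly 64% of compute hours. Storage and other always-on charges still apply, so the saving on the total bill will be smaller.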
Upgrade Compute to Latest Versions
Chip-makers like Intel and cloud providers like Amazon and Microsoft are continuously working to improve their offerings. These providers are regularly launching new generations of their compute resources for VMs, databases and more, featuring higher performance hardware often at similar cost to a previous generation.
Consider moving any applications running on older hardware to the latest generation. Newer versions offer significant price-performance improvements, which may help save on cost or give you the option to downsize, running the same workloads with fewer resources.
Adopt Arm Processors for Compatible Workloads
The Arm-based processor represents the most significant advancement in chip design in recent years. Since its debut, it has seen growing adoption due to its price-performance advantage over comparable x86 chips. Both AWS and Azure offer Arm-based options (AWS with its own Graviton chips, Azure with VMs built on Ampere processors) for a variety of services including VMs, databases, data analytics, and data science.
Consider moving your workloads onto machines with Arm-based processors to take advantage of the price-performance gains. You may be able to downsize resources to save on cost. Do note that not all workloads are compatible with these processors, most notably Windows-based ones (though this is likely changing), so be sure to test for compatibility before moving.
Deploy to Lower-Cost Regions
Cloud services often vary in cost depending on the region you deploy them to. Providers set prices based on their costs to run a given region, as well as the demand for services in that region. The result can be cost differences of as much as 2x from one region to another. (For example, the Brazil regions are notoriously expensive, while most US regions are among the cheapest.)
Consider deploying resources into a lower-cost region. There are a variety of considerations when choosing a region including latency and performance, compliance standards, data transfer charges, and more. However, deploying suitable workloads to cheaper regions can result in significant cost savings.
Delete Unused Resources
Cloud providers make it simple to spin up new resources whenever they are needed. In practice, these resources are not always deleted at the end of their lifecycle, leading to unnecessary costs. Some companies pay for VMs which have little to no utilization. Others are paying for disk storage, despite the fact that the machine associated with that disk has long been deleted.
Consider doing an audit of your cloud environments to discover resources with little to no utilization. These might include VMs which were created for testing, but not removed. It could be queues which have no messages passing through them. It might be storage which is rarely or never used. Once you have identified these resources, check with resource owners to ensure they are safe to delete. Removing unused resources can result in a cleaner environment and cost savings.
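An audit like this can start as a simple filter over an exported inventory. The sketch below is purely illustrative: the `Resource` fields, the 5% utilization threshold, and the sample names are hypothetical, and a real inventory would come from your provider's billing or resource export. It flags two things, idle VMs and queues, and disks whose parent VM no longer exists.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    name: str
    kind: str                          # "vm", "disk", or "queue"
    avg_utilization: float             # 0-1 over the lookback window
    attached_to: Optional[str] = None  # for disks: name of the parent VM


def find_cleanup_candidates(resources, idle_threshold=0.05):
    """Return resources that look idle or orphaned.

    These are candidates only; confirm with owners before deleting.
    """
    live_vms = {r.name for r in resources if r.kind == "vm"}
    candidates = []
    for r in resources:
        orphaned_disk = r.kind == "disk" and r.attached_to not in live_vms
        idle = r.kind != "disk" and r.avg_utilization < idle_threshold
        if orphaned_disk or idle:
            candidates.append(r)
    return candidates


if __name__ == "__main__":
    inventory = [
        Resource("web-1", "vm", 0.45),
        Resource("test-old", "vm", 0.01),
        Resource("disk-a", "disk", 0.0, attached_to="web-1"),
        Resource("disk-b", "disk", 0.0, attached_to="deleted-vm"),
        Resource("orders-queue", "queue", 0.0),
    ]
    for r in find_cleanup_candidates(inventory):
        print(r.kind, r.name)
```

The output is a review list rather than a delete list, which matches the step of checking with resource owners first.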
Modernize to Cloud-Native Services
Commercially licensed software can be very expensive. For many companies running SQL Server or Oracle databases, for example, the software license often costs more than the actual resources required to run the database (compute, storage, and networking). Meanwhile, the Open Source community has made huge strides with technologies like Linux, Kubernetes, PostgreSQL, MySQL, and more. Open Source Software is now widely adopted in the enterprise space, and is considered safe ground for most companies.
Consider creating a plan to modernize any applications which are reliant on commercial software, in favor of Open Source alternatives. Moving an app from Windows Server to Linux or migrating data from a SQL Server instance to PostgreSQL can result in significant cost savings without affecting application performance, availability, or security.
Downgrade Commercial Software
Many companies rely on commercial software but are not interested in moving to Open Source alternatives—perhaps due to existing investments in licenses, app design, or personnel.
If modernization is not in the short-term roadmap, consider downgrading to a lower tier of the commercial software for relevant workloads. For example, Microsoft offers several “Editions” of SQL Server software. For dev and test workloads, consider using SQL Server Developer Edition, which is free to run. For workloads which don’t require the full capability of SQL Server Enterprise, consider downgrading to the Standard Edition to achieve the same outcome. Save on license costs by using the right license for the job, rather than defaulting to an expensive “enterprise” tier.
Reduce Data Egress Charges
For companies running in Hybrid or Multicloud configurations, data transfer charges can grow quickly. Major cloud providers generally allow data to enter their cloud environment for free, but charge a per GB fee for data leaving the cloud.
If your company regularly moves data between cloud environments or from cloud to on-prem, consider using services like AWS’s Direct Connect, Azure’s ExpressRoute, and Megaport’s Cloud Router to connect these environments privately. When joined in this manner, the per GB data transfer fees can be reduced by as much as 80%.
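Private connectivity usually comes with a fixed monthly charge for the port or circuit, so it is worth checking where the break-even sits for your traffic. The sketch below is a back-of-the-envelope model with numbers I made up: a $0.09 per GB internet egress rate, a $0.02 per GB private rate, and a $300 monthly port fee. None of these are quotes from any provider, so substitute your own.

```python
def internet_egress_cost(gb, rate_per_gb=0.09):
    """Monthly cost of sending data out over the public internet."""
    return gb * rate_per_gb


def private_link_cost(gb, rate_per_gb=0.02, port_fee=300.0):
    """Monthly cost over a private connection: fixed fee plus reduced rate."""
    return port_fee + gb * rate_per_gb


def break_even_gb(internet_rate=0.09, private_rate=0.02, port_fee=300.0):
    """Monthly volume above which the private connection is cheaper."""
    return port_fee / (internet_rate - private_rate)


if __name__ == "__main__":
    for gb in (1_000, 10_000):
        print(gb, internet_egress_cost(gb), private_link_cost(gb))
    print(f"break-even at about {break_even_gb():,.0f} GB/month")
```

With these placeholder rates, the private link only wins once monthly egress passes a few terabytes, which is why the recommendation is aimed at companies moving data regularly rather than occasionally.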