The cloud conversations happening in boardrooms right now are almost entirely about growth: AI workloads, global expansion, faster release cycles. The conversations that should be happening are about the $182 billion that disappears every year before any of that growth delivers value.
This is not speculation. The data from Flexera, Gartner, FinOps Foundation, and HashiCorp for 2025 and 2026 tells a consistent story. Cloud spend is scaling faster than cloud discipline. And the gap between the two is widening.
The $723 Billion Reality Check
Gartner projects global public cloud end-user spending at $723.4 billion in 2025 — a 21.5% increase year-over-year. Forrester puts the 2026 figure at $1.03 trillion. For context, total worldwide IT spending in 2026 is forecast at $6.15 trillion, meaning cloud will account for roughly one in every six dollars spent on technology globally.
The provider landscape has stabilized at the top. AWS holds 30% market share, Azure 20%, and Google Cloud 13% as of Q2 2025. Together, they control 63% of global cloud infrastructure revenue. But the growth rates tell the more interesting story: Azure grew 39% year-over-year in Q2 2025, Google Cloud 32%, and AWS 17.5%. The gap between AWS and the challengers is narrowing.

Figure: AWS, Azure, and Google Cloud market share and growth rates driving $723B global cloud spend
AI is the primary growth accelerator. 72% of organizations now use generative AI cloud services, up from 47% just one year ago. AI and analytics workloads represent 18% of all cloud infrastructure spending. GenAI-specific cloud services expanded 140-180% in Q2 2025 alone. The cloud is getting bigger faster than it has in years, and most of that acceleration runs through GPU-heavy infrastructure concentrated in the data centers of three providers.
The Waste Layer Nobody Talks About
Here is the number that should be in every quarterly business review: 27% of cloud spend is wasted.
Flexera has measured this figure for three consecutive years. Applied to the infrastructure-heavy portion of 2025's $723 billion in spending, a 27% waste rate works out to roughly $182 billion. That is not an inefficiency. That is a structural problem, and it compounds as organizations scale because visibility does not grow at the same rate as infrastructure.

Figure: $182B in annual cloud waste broken down by category; idle compute leads at 35%
Idle compute is the largest category, and the numbers behind it are striking. Datadog’s 2024 analysis found that 65% of EC2 instances average below 20% CPU utilization over a 30-day window. These instances are running, billing, and doing almost nothing. The same pattern appears in containers: Kubernetes clusters average 10% CPU utilization and 20% memory utilization. That means over 80% of container spend goes to reserved capacity that sits empty.
91% of enterprises report wasted cloud spending. 75% say it is getting worse, not better. Only 23% of organizations consider themselves highly efficient at cloud management. These are not outliers struggling with unusual scale. They are the median.
The mechanism is straightforward: teams provision for peak load, but most systems run at average load most of the time. No one owns the cleanup. The billing keeps running.
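The pattern described above can be expressed as a simple utilization filter. This is an illustrative sketch, not a real cloud API: the instance records, field names, and the 20% cutoff (borrowed from the Datadog analysis cited earlier) are all assumptions.

```python
# Hypothetical sketch: flag instances whose 30-day average CPU sits below
# a waste threshold. Instance data and field names are illustrative.

def flag_idle(instances, cpu_threshold=20.0):
    """Return instances averaging below `cpu_threshold` % CPU."""
    return [i for i in instances if i["avg_cpu_30d"] < cpu_threshold]

fleet = [
    {"id": "web-1",   "avg_cpu_30d": 4.2},
    {"id": "web-2",   "avg_cpu_30d": 55.0},
    {"id": "batch-1", "avg_cpu_30d": 12.8},
]

idle = flag_idle(fleet)
# Two of three instances fall below the 20% line here, echoing the
# ~65% figure from the Datadog analysis.
print([i["id"] for i in idle])  # ['web-1', 'batch-1']
```

In practice the per-instance averages would come from a monitoring backend; the filtering logic itself is this small, which is part of the point: detection is not the hard part, ownership of the cleanup is.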
Non-Production: The Quietest Budget Leak
Development, test, and staging environments represent approximately 27% of total cloud infrastructure costs. That is a significant portion of the bill for infrastructure that does not serve a single user in production.
The problem is not that these environments exist. It is when they run.
A typical engineering team uses its development environment roughly 6-8 hours per day, five days per week. That is about 40 hours of actual usage per week. But the environment runs 168 hours per week. It bills for every one of those hours.
| Environment Type | Hours Used/Week | Hours Billed/Week | Idle Hours | Idle % |
|---|---|---|---|---|
| Dev (engineer local) | 40 | 168 | 128 | 76% |
| QA / Test | 30 | 168 | 138 | 82% |
| Staging | 20 | 168 | 148 | 88% |
| Demo / Sandbox | 10 | 168 | 158 | 94% |
The idle percentage across non-production ranges from 76% to 94%. Weekends alone represent 48 idle hours per week. For a staging environment running a few hundred dollars per day, that is $600-$800 in weekend spend with zero users and zero workloads.
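The idle percentages in the table above are pure arithmetic: hours in a week minus hours actually used. A minimal sketch, using the same usage assumptions as the table:

```python
# Arithmetic behind the idle-hours table: weekly idle hours and idle share
# for an always-on environment used only during working windows.

HOURS_PER_WEEK = 168  # 24 * 7

def idle_profile(hours_used):
    """Return (idle hours per week, idle share as a whole percent)."""
    idle = HOURS_PER_WEEK - hours_used
    return idle, round(100 * idle / HOURS_PER_WEEK)

for env, used in [("dev", 40), ("qa", 30), ("staging", 20), ("demo", 10)]:
    idle_hours, idle_pct = idle_profile(used)
    print(f"{env}: {idle_hours} idle hours/week ({idle_pct}%)")
```

Running this reproduces the table's right-hand columns: 128 idle hours (76%) for dev down to 158 idle hours (94%) for demo environments.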
This category is also the most tractable. Non-production environments have predictable usage windows, clear team ownership, and no production risk when their hours are constrained. The path to recovering this spend does not require a multi-quarter architecture project. It requires knowing when environments run and stopping them when they do not need to.
> NielsenIQ achieved 60-80% savings on non-production Kubernetes clusters by applying this principle systematically. The infrastructure did not change. The billing window did.
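The scheduling principle is small enough to sketch. This is a minimal, hypothetical schedule check, assuming a weekday business-hours window; the window boundaries are assumptions, and a real implementation would handle timezones and drive the cloud provider's start/stop APIs.

```python
# Minimal sketch of an environment schedule check. The business-hours
# window is an assumption; a real scheduler would call provider APIs.
from datetime import datetime

BUSINESS_START, BUSINESS_END = 8, 18  # local hours, illustrative

def should_run(now: datetime) -> bool:
    """True only during weekday working hours."""
    is_weekday = now.weekday() < 5               # Mon=0 .. Fri=4
    in_window = BUSINESS_START <= now.hour < BUSINESS_END
    return is_weekday and in_window

print(should_run(datetime(2025, 6, 2, 10)))  # Monday 10:00 -> True
print(should_run(datetime(2025, 6, 7, 10)))  # Saturday    -> False
```

A check like this, run on a timer against each non-production environment, is the entire control loop: the savings come from the billing window shrinking from 168 hours to 50.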
AI Is Accelerating Both the Spend and the Problem
The AI wave driving cloud growth has a utilization problem nobody is talking about yet.
GPU utilization in cloud environments averages 23%. That means 77% of the most expensive compute category in modern infrastructure sits idle. A single NVIDIA H100 instance on AWS costs between $25,000 and $98,000 per year, depending on commitment. Running one at 23% utilization means paying for a full year to get roughly 12 weeks of actual compute.
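The same arithmetic yields an effective hourly rate: annual price divided by the hours actually used. A back-of-envelope sketch, using the $98,000 on-demand end of the range cited above and the 23% average utilization:

```python
# Effective cost of an underutilized GPU: annual price divided by the
# hours of compute actually consumed. Figures match the text above.

HOURS_PER_YEAR = 24 * 365  # 8760

def effective_hourly_cost(annual_cost, utilization):
    """Cost per hour of *useful* compute at a given utilization rate."""
    used_hours = HOURS_PER_YEAR * utilization
    return annual_cost / used_hours

nominal = 98_000 / HOURS_PER_YEAR                 # billed rate per hour
effective = effective_hourly_cost(98_000, 0.23)   # rate per hour actually used
print(round(nominal, 2), round(effective, 2))     # 11.19 48.64
```

At 23% utilization, every useful GPU-hour costs over four times its billed rate, which is why utilization, not list price, is the number to negotiate with first.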
The spend is growing faster than the visibility. 72% of organizations use GenAI cloud services, but only 63% of FinOps practitioners track AI spend, and even that figure represents a doubling from the prior year's 31%. More than a third of the organizations spending on AI cloud services have no systematic view of what those services cost.

Figure: 72% of orgs adopt GenAI but 37% have no AI cost visibility — blind GPU scaling compounds the problem
The implication for 2026 is a cost reckoning. Organizations that scaled AI workloads aggressively in 2024 and 2025 without building corresponding cost visibility are heading into budget conversations where the AI line item is large, opaque, and hard to defend. The fix is not reducing AI investment. It is applying the same utilization logic to GPU infrastructure that good engineering teams already apply to application servers.
FinOps: Growing Fast, Maturing Slowly
The organizational response to cloud waste exists. FinOps is real, growing, and increasingly board-level. 59% of organizations are expanding their FinOps teams in 2025. Cost efficiency became a stated priority for 87% of organizations this year, a 22-point increase from 2024. Deloitte estimates that FinOps tools and practices will save enterprises $21 billion in 2025 alone.
But the maturity data tells a different story.
| FinOps Maturity Stage | Definition | % of Organizations |
|---|---|---|
| Crawl | Basic tagging, some visibility, reactive spend reviews | ~34% |
| Walk | Consistent tracking, some automation, cross-team visibility | ~51% |
| Run | Proactive optimization, policy automation, full attribution | ~14% |
Only 14.2% of FinOps practitioners operate at “Run” maturity. More than half are at “Walk,” which means they have dashboards and they look at them, but the actual optimization work is inconsistent and manual. The gap between “we have a FinOps team” and “FinOps is changing our infrastructure spend” is wider than most organizations admit.
The barriers are structural, not technical. Multi-cloud visibility is one: 92% of enterprises use multiple cloud providers, but only 39% track unified spend accurately across those clouds. Accountability is another: 70% of organizations are unsure where their cloud budget actually goes at the team or workload level.
What the Data Says Actually Works
The data on optimization outcomes is consistent enough to draw real conclusions.
| Strategy | Typical Savings | Timeline | Notes |
|---|---|---|---|
| Non-production environment scheduling | 20-40% of infra costs | Under 2 weeks | No migration, no arch changes |
| Reserved instances / savings plans | 37-72% vs. on-demand | Immediate (after commitment) | Requires utilization forecasting |
| Rightsizing idle compute | 5-15% of total bill | 2-4 weeks | 65% of EC2 instances are candidates |
| Structured FinOps program | 25-30% of monthly spend | 3-6 months | Requires cross-team ownership |
| Kubernetes workload optimization | 40-80% on non-prod clusters | 4-8 weeks | Higher savings on non-production |
| Centralized governance and policy | 33% reduction in inefficiencies | 6-12 months | Most effective at enterprise scale |
WPP saved $2 million in the first three months of FinOps deployment, reaching 30% annual reduction in cloud spend. COMPLY recovered $460,000 in eight months. These are not exceptional outcomes; they are what happens when organizations apply systematic visibility to infrastructure that was previously provisioned without it.
The common thread across every successful optimization case is sequence: visibility comes before action. Organizations that try to optimize without first understanding where spending actually sits end up cutting the wrong things or creating new problems. The teams that achieve 30-40% reductions do it by spending the first few weeks mapping spend to workloads and owners, then working from the highest-waste categories down.
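The first step of that sequence, mapping spend to owners, is conceptually a group-by over billing line items. A hypothetical sketch: the records, tag names, and team labels below are illustrative, not any provider's billing format.

```python
# Sketch of the "visibility before action" step: aggregate billing line
# items by owner tag and rank by spend, so optimization starts with the
# largest buckets. Records and tag names are illustrative.
from collections import defaultdict

def spend_by_owner(line_items):
    """Return (owner, total cost) pairs, largest spend first."""
    totals = defaultdict(float)
    for item in line_items:
        totals[item.get("owner", "untagged")] += item["cost"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

bill = [
    {"owner": "platform", "cost": 12_400.0},
    {"owner": "data",     "cost": 9_800.0},
    {"cost": 7_300.0},                      # untagged spend surfaces too
    {"owner": "platform", "cost": 3_100.0},
]

print(spend_by_owner(bill))
# [('platform', 15500.0), ('data', 9800.0), ('untagged', 7300.0)]
```

Note the untagged bucket: surfacing unattributed spend explicitly is what turns the 70%-unsure accountability problem into a ranked worklist.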
Cloud spend in 2026 is heading toward $1 trillion. The infrastructure itself is not the problem; the problem is that $270 billion of that trillion will likely disappear before it delivers any value. The organizations that will look back on 2026 as the year they got cloud right are the ones who treat that $270 billion not as an inevitable cost of doing business, but as the clearest optimization target on their balance sheet.