The Token Treadmill

Manny Henri, May 4, 2026

The AI boom is driving unprecedented spending on cloud infrastructure—but the revenue, margins, and physical resources needed to sustain it are lagging far behind. Beneath the hype, a fragile system of token-driven economics, debt-funded expansion, and circular financing is emerging, raising questions about how long the momentum can last.

Or... How the Cloud AI Industry Is Sprinting Toward a Wall

Introduction: A Boom Built on Borrowed Time

In the first quarter of 2026, three of the world’s largest cloud providers (Microsoft, Alphabet, and Amazon) collectively spent $112 billion on AI infrastructure. By the close of fiscal year 2026, the four U.S. hyperscalers (those three plus Meta) are projected to spend a combined $725 billion in capital expenditures, a 77% increase over 2025’s already-record $410 billion. Add in Oracle’s $50 billion target, and the figure climbs north of three-quarters of a trillion dollars in a single year, spent on infrastructure to serve a technology whose end-customer revenue, by every credible measure, is not even close to justifying it.

The headlines tell a story of unstoppable momentum. Beneath them, a quieter and more disquieting story is forming: a token-driven cloud AI industry whose unit economics are deteriorating, whose financing has turned visibly circular, whose physical inputs — chips, memory, electricity, water — are running into hard ceilings, and whose anchor customers are still bleeding cash at staggering rates.

This is the story of an industry sprinting toward a wall.

Part 1: The Spending is Historically Unprecedented — and Increasingly Debt-Funded

Begin with the raw numbers, because the scale of capital deployment is genuinely without modern parallel.

CreditSights projects capex for the top five hyperscalers will rise from roughly $256 billion in 2024 (already up 63% year-over-year) to about $443 billion in 2025, and then to roughly $602 billion in 2026. Roughly 75% of that 2026 spend — about $450 billion — is directly tied to AI infrastructure: GPUs, data centers, networking, and power.
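The growth rates implied by these CreditSights figures can be checked with a few lines of arithmetic. The sketch below is only an illustration: the dollar figures are the ones quoted above, while the function and variable names are invented for this example.

```python
# Capex figures (in $ billions) as quoted from CreditSights in the text.
CAPEX_BY_YEAR = {2024: 256, 2025: 443, 2026: 602}
AI_SHARE_2026 = 0.75  # share of 2026 spend tied to AI infrastructure, per the text


def yoy_growth(series):
    """Return year-over-year growth (as a fraction) for each year after the first."""
    years = sorted(series)
    return {
        year: series[year] / series[prev] - 1
        for prev, year in zip(years, years[1:])
    }


growth = yoy_growth(CAPEX_BY_YEAR)
ai_capex_2026 = CAPEX_BY_YEAR[2026] * AI_SHARE_2026

for year, rate in growth.items():
    print(f"{year}: {rate:.0%} growth over the prior year")
print(f"AI-related 2026 capex: about ${ai_capex_2026:.1f} billion")
```

On these inputs, growth works out to about 73% for 2025 and about 36% for 2026, so the projected growth rate slows even as the absolute dollar figure keeps climbing.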

By individual company:

• Amazon has guided to roughly $200 billion in 2026 capex, with the company itself disclosing in an SEC filing that it may seek to raise both equity and debt to fund the buildout. Morgan Stanley analysts now project Amazon will swing to negative free cash flow of nearly $17 billion in 2026, while Bank of America puts the deficit closer to $28 billion.

• Alphabet raised its 2026 capex guidance to $180–$190 billion, more than double its 2024 level. Pivotal Research projects Alphabet’s free cash flow will collapse roughly 90%, from $73.3 billion in 2025 to about $8.2 billion in 2026.

• Microsoft is tracking toward $190 billion in calendar-year 2026 capex, well above the consensus analyst estimate of $152 billion. Barclays sees Microsoft’s free cash flow falling 28% in 2026.

• Meta has guided to $125–$145 billion in 2026 capex. The company’s stock fell roughly 6% on its forecast, with management citing higher component pricing and additional data center costs.

Capital intensity (capex as a share of revenue) has reached 45% to 57% across these companies, a level with no historical analogue among mature, profitable software businesses.

The financing model has also quietly shifted. For two decades, hyperscalers funded growth almost entirely from operating cash flow. That has broken down. Bank of America forecasts hyperscaler debt issuance will reach $175 billion in 2026, more than six times the $28 billion annual average of the prior five years. Alphabet sold a rare 100-year "century bond" (the first by a tech company since Motorola in 1997) as part of a $32 billion debt offering. Amazon raised roughly $54 billion in March 2026 alone. Big tech companies issued more than $100 billion of bonds in the first months of 2026 to fund AI capex, and investors demanded record levels of protection through credit default swaps.

This is no longer a self-financing industry. It is an industry leveraging up.

Part 2: The Revenue Doesn’t (Yet) Match the Bill

Spending of this magnitude would be unremarkable if the revenue were comparable. It is not.

The most-cited figure inside the industry is sobering: AI-related cloud services generated approximately $25 billion in revenue in 2025, less than a tenth of what hyperscalers spent on infrastructure that year. The gap between AI infrastructure investment and AI infrastructure revenue is, by an order of magnitude, the largest such gap in the history of enterprise computing.

Consultant data corroborates the disconnect. By multiple industry tallies, only about 25% of enterprise AI initiatives have delivered their expected ROI, and fewer than 20% have been scaled across entire enterprises. A National Bureau of Economic Research study published in February 2026 found that despite 90% of firms reporting no measurable impact of AI on workplace productivity, executives nonetheless projected AI to increase productivity by 1.4% — a textbook productivity-paradox setup.

Even the most-favored AI companies are still deeply unprofitable:

• OpenAI is generating roughly $2 billion per month in revenue ($24 billion annualized as of April 2026) but is burning approximately $17 billion in cash in 2026. Internal documents project a $14 billion loss for 2026. The company has committed over $1 trillion to infrastructure over the next several years and does not project positive free cash flow until 2029. Cumulative projected losses through 2028 exceed $44 billion.

• Anthropic has reached roughly $30 billion in annualized revenue (up from about $1 billion at the start of 2025), but is also operating at substantial loss as it pours capital into compute capacity.

• CoreWeave, the publicly traded AI cloud "neocloud," grew revenue 110% year-over-year in Q4 2025 — and still operated at a loss for the full year, with technology and infrastructure costs consuming a major share of revenue.

As technology strategist Jac Arbour, CEO of J.M. Arbour Wealth Management, put it: "The biggest untested assumption in the 2026 AI narrative is that today’s valuations are justified by fundamentals that have yet to materialise."

OpenAI CEO Sam Altman himself, in a now-frequently-quoted interview with The Verge, acknowledged: "Are we in a phase where investors as a whole are overexcited about AI? My opinion is yes." Google CEO Sundar Pichai conceded to the BBC that "there are elements of irrationality" in the AI market right now.

When the executives running the boom are publicly invoking irrationality, that is data.

Part 3: The Token Economy Has a Margin Problem

The deepest structural issue is rarely addressed in hyperscaler earnings calls but is well known to any CFO running an AI-native product: token economics break the SaaS playbook.

Traditional SaaS enjoyed near-zero marginal cost. Once a seat was provisioned, the next user cost essentially nothing to serve. Gross margins of 80% were the norm and the foundation of every public-software valuation multiple of the past two decades.

AI changes that. Every prompt and every response consumes tokens that map directly to compute the provider must pay for. According to a16z’s published benchmarks, AI SaaS companies are typically operating at gross margins of 50–60%, well below the 60–80% standard for traditional SaaS, and the gap has widened as reasoning-heavy workloads have proliferated.

Two trends compound the problem:

Per-token prices are falling, but token consumption is rising faster. Epoch AI data shows GPT-4-class performance that cost about $20 per million tokens in late 2022 now runs around $0.40, a 50x drop, and the decline is often quoted as roughly 10x per year. But total AI spend at companies in production has gone up, not down, over the same period. Reasoning models like OpenAI’s o-series, GPT-5, and Claude’s extended thinking modes generate thousands of internal "thinking" tokens before producing a final answer, so each request consumes far more tokens than a simple chat completion did. As one industry analysis put it, the situation resembles building more fuel-efficient engines and then using the efficiency gains to build monster trucks.
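The interaction between falling prices and rising consumption is plain multiplication: spend equals price per token times tokens consumed. The sketch below uses the $20 and $0.40 prices quoted above; the monthly token volumes are hypothetical values chosen only to show the mechanism, not measurements from any real deployment.

```python
def monthly_spend(price_per_million_tokens, tokens_per_month):
    """Spend in dollars for a given per-million-token price and monthly volume."""
    return price_per_million_tokens * tokens_per_month / 1_000_000


# Prices are the ones quoted in the text ($ per million tokens).
old_price, new_price = 20.00, 0.40

# Hypothetical volumes (assumptions for illustration only).
old_volume = 50_000_000        # tokens/month, simple chat-style usage
new_volume = 5_000_000_000     # tokens/month, reasoning-heavy usage

price_drop = old_price / new_price        # how many times cheaper each token became
volume_growth = new_volume / old_volume   # how many times more tokens are consumed

old_bill = monthly_spend(old_price, old_volume)
new_bill = monthly_spend(new_price, new_volume)

print(f"Price per token fell {price_drop:.0f}x; volume grew {volume_growth:.0f}x")
print(f"Monthly bill: ${old_bill:,.0f} -> ${new_bill:,.0f}")
```

With these assumed volumes, each token is 50 times cheaper, yet the bill doubles from $1,000 to $2,000 because consumption grew 100 times. Any case where volume growth exceeds the price drop produces the same result.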

"Inference whales" are eating margins alive. Business Insider has reported on platforms discovering single users consuming tens of thousands of dollars worth of compute under flat-rate plans. One developer reportedly consumed over $35,000 in compute under a $200/month plan. Cursor users have exhausted credit allotments within days. Replit was forced to introduce "effort-based pricing" to contain runaway usage. Anthropic’s own "Max Unlimited" plan saw some users consume 10 billion tokens in a single month — equivalent to processing 12,500 copies of War and Peace.

The implication: AI products are not simply lower-margin SaaS. They are a different category of business entirely, one in which the marginal cost of the next request is meaningfully nonzero, and one where heavy users can structurally destroy the unit economics if pricing isn’t perfectly calibrated.
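A rough sketch of how one heavy user can sink a flat-rate plan follows. The $200 plan price and the $35,000 compute figure come from the reported example above; treating that $35,000 as a single month of usage, the 99 typical users, and their $80 monthly compute cost are all assumptions made for illustration.

```python
def gross_margin(revenue, cost_of_compute):
    """Gross margin as a fraction of revenue; negative when costs exceed revenue."""
    return (revenue - cost_of_compute) / revenue


PLAN_PRICE = 200.0  # $/month flat-rate plan, as in the reported example

# Hypothetical user mix: 99 typical users plus one heavy user.
# The $80 typical compute cost is an assumption chosen to give a 60% margin.
# The $35,000 heavy-user cost is the reported figure, assumed here to be one month.
typical_users, typical_cost = 99, 80.0
whale_cost = 35_000.0

revenue = PLAN_PRICE * (typical_users + 1)
cost = typical_users * typical_cost + whale_cost

print(f"Typical user margin: {gross_margin(PLAN_PRICE, typical_cost):.0%}")
print(f"Heavy user margin:   {gross_margin(PLAN_PRICE, whale_cost):.0%}")
print(f"Blended margin:      {gross_margin(revenue, cost):.0%}")
```

Under these assumptions, 99 users at a healthy 60% margin are not enough to offset one heavy user: the blended margin for the cohort lands at roughly negative 115%.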

For the cloud providers underneath, the picture is worse. Microsoft, AWS, and Google sell GPU capacity built on chips they bought from Nvidia, which earns margins approaching 80% on those sales. The gap between what hyperscalers pay for GPUs and what they can charge customers (net of power, cooling, networking, real estate, and depreciation on rapidly obsolescing hardware) is the actual economic engine of this industry. And there is little public evidence that net AI margins at the hyperscaler level are durable, especially as Nvidia keeps capturing the lion’s share of value upstream.

Part 4: The Resource Walls Are Real

Even setting aside the financial questions, the AI cloud industry is running into physical limits that capital alone cannot solve.

Power

Gartner has projected that power shortages will restrict 40% of AI data centers by 2027. The pipeline of new data centers under construction in the United States, if all currently planned facilities are completed, would push U.S. data center power consumption from less than 15 GW today to many multiples of that — after two decades in which total U.S. power demand grew at well under 1% per year.

Microsoft has disclosed an $80 billion backlog of Azure orders that cannot be fulfilled due to power constraints. The hyperscalers themselves now report that their markets are supply-constrained, not demand-constrained. That sounds bullish until you realize what it actually means: the limiting factor in this industry is no longer customer demand, but the ability to physically energize infrastructure.

The political consequences are compounding. Rob Gramlich, president of power consulting firm Grid Strategies, told CNBC: "I don’t think we’ve seen the end of the political repercussions. And with a lot more elections in 2026 than 2025, we’ll see a lot of implications. Every politician is going to be saying that they have the answer to affordability and their opponents’ policies would raise rates." Utility bill increases tied to data center buildouts are already a live political issue in Virginia, Arizona, Ohio, and beyond.

GPUs and HBM Memory

Nvidia’s H100 1-year rental contract pricing has risen roughly 40% from a low of $1.70 per GPU per hour in October 2025 to $2.35 by March 2026 — for the previous-generation chip, in defiance of the normal pattern in which older silicon depreciates as new chips ship. Blackwell GPU delivery schedules now extend into mid-2026 and, for many enterprise buyers, into Q1 2027.

The deeper bottleneck is high-bandwidth memory. HBM is produced by only three suppliers — SK Hynix, Samsung, and Micron — using yield-limited specialized processes. TSMC’s CoWoS advanced packaging capacity, required to bond HBM dies onto GPU substrates, is fully allocated through at least mid-2027. Total HBM demand has grown 5x between 2023 and 2026. New fabs take 18 to 24 months to come online. The shortage is structural, not transient.

Water and Land

Data centers in arid regions are facing escalating disputes over water for cooling, with active permitting battles in Arizona, Texas, and Spain. Land for hyperscale campuses near accessible transmission has become the third scarce resource, after chips and electrons.

Capital can build a data center. It cannot generate a megawatt that the grid does not have, manufacture HBM that the fabs cannot yet produce, or conjure water from a depleted aquifer.

Part 5: The Circular Financing Web

Perhaps the most uncomfortable feature of this cycle is the increasingly visible circularity of the money flows.

The pattern: chipmakers and cloud providers invest equity in AI startups. Those startups immediately spend the capital buying chips and cloud capacity from the same investors. The investors book the spending as revenue, which inflates their stock prices, which gives them more equity to deploy into the next round of AI startup funding. Some of the marquee deals:

• Nvidia → OpenAI: Up to $100 billion announced in September 2025, tied to deploying at least 10 gigawatts of Nvidia systems. OpenAI uses the capital to buy Nvidia chips.

• Oracle → OpenAI: A $300 billion, five-year cloud infrastructure deal — to be powered, of course, by Nvidia GPUs Oracle is buying.

• AMD → OpenAI: Roughly $90 billion in commitments, with AMD granting OpenAI warrants to buy AMD stock at nominal prices; the warrants vest as OpenAI buys and deploys AMD chips.

• Nvidia → CoreWeave: Nvidia holds roughly 7% of CoreWeave and signed a $6.3 billion agreement in September 2025 to buy CoreWeave’s unsold data center capacity through 2032.

• Amazon and Google → Anthropic: A combined ~$15 billion in commitments; Anthropic spends heavily on AWS and Google Cloud compute.

• Nvidia + xAI: Nvidia was part of a consortium that bought Aligned Data Centers for $40 billion in 2025, and it struck a $20 billion lease-to-own deal with xAI for chips, partially funded through a special-purpose vehicle in which Nvidia itself has an investment.

By one tally, more than $800 billion in such circular financing arrangements are now stacked up across the AI ecosystem.
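The mechanics of one pass through such a loop can be sketched in a few lines. This is a deliberately simplified model, not a description of any specific deal above: the `one_round` function, its `spend_share` parameter, and the dollar amounts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LoopResult:
    vendor_booked_revenue: float   # what the vendor reports as sales
    vendor_net_cash: float         # cash in from sales minus cash out as investment
    external_demand: float         # spending funded by anyone other than the vendor


def one_round(investment, startup_outside_revenue, spend_share=1.0):
    """One pass of the loop: the vendor invests, the startup buys from the vendor.

    spend_share is the fraction of the startup's available money that goes
    back to the vendor as chip or cloud purchases (an assumed parameter).
    """
    available = investment + startup_outside_revenue
    purchases = available * spend_share
    return LoopResult(
        vendor_booked_revenue=purchases,
        vendor_net_cash=purchases - investment,
        external_demand=startup_outside_revenue * spend_share,
    )


# Hypothetical numbers in $ billions, chosen only for illustration.
result = one_round(investment=10.0, startup_outside_revenue=2.0)
print(result)
```

In this toy case the vendor books 12 in revenue, but only 2 of it is funded by third-party customers, and the vendor's net cash position improves by just 2. The distance between booked revenue and externally funded demand is what the skeptics quoted below are pointing at.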

The defenders of these arrangements — and there are credible ones — argue this is vendor financing, not the round-tripping of the dot-com era. Vendor financing is legitimate: chipmakers help finance customers because chips are scarce and lock-in is valuable. Janus Henderson has called the web a "virtuous circle" matching long-term supply with long-term demand.

The skeptics see something more familiar. Paulo Carvao, a senior fellow at the Harvard Kennedy School, drew the parallel directly: "In the late 1990s, circular deals were often centered on advertising and cross-selling between startups, where companies bought each other’s services to inflate perceived growth." Michael Burry, the investor who famously shorted the 2008 housing market, now publicly shorts Nvidia and Palantir, writing on X: "True end demand is ridiculously small. Almost all customers are funded by their dealers." He later asked the obvious question: who audits OpenAI?

Nobel laureate Daron Acemoglu of MIT was blunter: "The danger is that these kinds of deals eventually reveal a house of cards."

The fragility is real. In early February 2026, a Wall Street Journal report that Nvidia’s $100 billion OpenAI investment had "stalled" — and that Nvidia executives had privately raised concerns about OpenAI’s "lack of financial discipline" — caused a brief but sharp sell-off across the entire connected ecosystem, including Oracle. Both companies issued public denials within hours, but the episode demonstrated something important: the market understands these companies are now linked in ways that mean a single broken negotiation can cascade.

Part 6: The Funding Treadmill Cannot Slow Down

Look at the fundraising calendar of the past nine months and the magnitude of what is being asked of capital markets becomes clear.

• OpenAI closed a record-shattering $122 billion round at an $852 billion post-money valuation in March 2026, anchored by SoftBank ($30 billion in three tranches), Nvidia ($30 billion), and Amazon (up to $50 billion, mostly conditional). The company extended participation to retail investors through bank channels for the first time, raising $3 billion from individuals.

• Anthropic closed a $30 billion Series G at a $380 billion post-money valuation in February 2026 and was, by April, weighing preemptive offers for another $40–$50 billion round at a valuation between $850 billion and $900 billion. That second raise, if completed anywhere in that range, would mean Anthropic’s valuation more than doubled in about three months.

• xAI secured $6 billion in Q1 2026 despite having lost all 11 of its co-founders.

• CoreWeave, Nebius, Nscale, Lambda, and other neoclouds have raised billions more, much of it directly funded by Nvidia.

Total Q1 2026 AI funding exceeded $180 billion — more than all of 2024 combined.

These numbers are staggering, but the more important framing is what they mean operationally: these companies cannot stop raising. OpenAI’s projected $14 billion loss in 2026 alone, against ~$1 trillion in committed infrastructure obligations, means the company must raise tens of billions of dollars roughly every twelve to eighteen months for the foreseeable future. Anthropic’s burn is proportionally similar. The trajectory mathematically requires either uninterrupted access to capital markets at ever-higher valuations, or step-function leaps in revenue that — even at the current breathtaking growth rates — are not yet sufficient to fund the buildout from operations.

Sam Altman told Fortune, days after closing the largest private funding round in history, that a twelve-month delay in AI progress would make him bankrupt. Dario Amodei has used similar language. These are not throwaway lines. They are accurate descriptions of a business model in which a single bad quarter — a model that doesn’t ship, a key customer that defects, a regulator that intervenes, a capital market that closes — could break the entire chain.

Part 7: How This Likely Ends — Three Scenarios

Predicting bubbles is a fool’s errand. Predicting vulnerabilities is more tractable. Three scenarios are now actively debated in serious investor circles:

Scenario 1: Soft Landing

AI revenue growth continues to compound at 2–3x annually. Enterprise deployments scale. Inference costs continue their ~10x-per-year decline. Hyperscaler capex eventually plateaus and free cash flow recovers in 2028–2029. Some neoclouds and second-tier model labs fail or get acquired, but the largest players grow into their valuations. Investors who held through volatility are richly rewarded.

This is the bull case, and it is non-trivially probable. Federal Reserve Chair Jerome Powell, JPMorgan, and Jefferies analyst Brent Thill — who memorably told the Financial Times that "the bear thesis is garbage" — have all argued AI does not meet the classic bubble criteria because the revenue is real, the customers are paying, and the productivity gains are coming.

Scenario 2: Bumpy Correction

Enterprise AI ROI continues to disappoint in roughly 75% of deployments. Token consumption growth slows as customers implement governance. Hyperscaler returns on AI capex come in materially below cost of capital for two or three years. Stock multiples compress 30–50%. Several neoclouds fail; one or two model labs are forced into distressed sales. Capital markets remain open but at lower valuations, and the industry consolidates around four or five genuine winners. This is the modal outcome predicted by sober analysts and is closest to the dot-com analog.

Scenario 3: Loop Breaks

Enterprise ROI disappointment, an OpenAI cash crisis, a regulatory shock (antitrust, AI safety, export controls), or a credit market freeze causes capital to stop flowing into the sector. Circular revenue evaporates. Hyperscalers write down tens of billions in stranded data center assets — much of which would, like dot-com era fiber, sit underutilized for years. Nvidia’s growth narrative breaks. The connected web of equity investments unwinds rapidly. Sovereign and pension fund exposure, much of it accumulated through SoftBank, GIC, MGX, and similar vehicles in 2025–2026, takes meaningful losses.

The asymmetry to note: Scenarios 2 and 3 do not require AI to be useless, fraudulent, or a bad technology. They require only that the pace of monetization fail to keep up with the pace of capital deployment. Given that capital deployment is currently growing at ~70% per year and enterprise AI revenue is growing at perhaps half that, the gap is widening, not closing.
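To see what that assumption implies, the two growth rates can be compounded forward. The sketch below takes the roughly 70% and roughly 35% rates from the paragraph above and starts from the 2025 figures quoted earlier ($410 billion of hyperscaler capex and $25 billion of AI cloud revenue). Extending either rate several years out is purely illustrative, not a forecast.

```python
def project(start, annual_growth, years):
    """Compound a starting value forward, returning one value per year."""
    return [start * (1 + annual_growth) ** n for n in range(years + 1)]


# Starting values in $ billions; growth rates as fractions per year.
capex = project(410.0, 0.70, years=3)
revenue = project(25.0, 0.35, years=3)

for n, (c, r) in enumerate(zip(capex, revenue)):
    print(
        f"year {n}: capex {c:,.0f}  revenue {r:,.0f}  "
        f"gap {c - r:,.0f}  coverage {r / c:.1%}"
    )
```

Under these assumptions, revenue covers about 6% of capex in the starting year and a smaller share in every year after, while the absolute gap grows each year. Revenue growth has to exceed capex growth before the coverage ratio can start improving.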

Conclusion: The Wall is Closer Than It Looks

The token-driven cloud AI industry is not, by most reasonable measures, a fraud. The technology is real. The customers are real. The growth is real. Anthropic’s leap from $1 billion to $30 billion in run-rate revenue inside fifteen months is, in absolute terms, the fastest scaling event in enterprise software history.

But "real" and "rationally priced" are different claims. Anthropic at a possible $900 billion valuation prices the company at roughly 30 times current annualized revenue, for a business with negative free cash flow, structural margin pressure from token economics, and a near-total dependence on continued, exponential capital availability. OpenAI at $852 billion and roughly $25 billion in run-rate is comparable. Hyperscalers are levering up their balance sheets, mortgaging future cash flows, and entering circular contracts with their largest customers — the exact pattern that emerged in fiber optics in 1999 and in subprime housing in 2006.
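The multiples in this paragraph follow directly from the figures already cited. A minimal check, with helper names invented for this example:

```python
def revenue_multiple(valuation, annualized_revenue):
    """Valuation divided by annualized (run-rate) revenue."""
    return valuation / annualized_revenue


# All figures in $ billions, taken from the article.
anthropic = revenue_multiple(900, 30)
openai = revenue_multiple(852, 25)
anthropic_step_up = 900 / 380   # possible new valuation vs the February 2026 Series G

print(f"Anthropic: {anthropic:.0f}x run-rate revenue")
print(f"OpenAI:    {openai:.0f}x run-rate revenue")
print(f"Anthropic valuation step-up: {anthropic_step_up:.2f}x")
```

That gives 30x for Anthropic, about 34x for OpenAI, and a step-up of roughly 2.4x on Anthropic's February valuation at the top of the reported range.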

What history teaches is that bubbles typically burst not because the underlying technology fails but because the financing structures supporting the buildout become unstable faster than the technology can monetize. Fiber optic networks in 2001 were not bad infrastructure; they were prematurely capitalized infrastructure. Most of those cables eventually got used. But the equity that funded them was wiped out, the bond markets froze, and the rebuild took the better part of a decade.

The current AI buildout faces the same risk in vivid form. Capital deployment at ~$725 billion per year. End-customer AI revenue at perhaps $50–100 billion. Anchor customers losing tens of billions per year and committed to buying trillions of dollars of compute. Vendor financing replacing customer financing across the stack. Power, memory, and packaging constraints that no amount of capital can quickly resolve. And valuations that price in not just the next two years of growth but the next decade of it, with no margin of safety.

The wall is not the technology. The wall is the math.

When investors eventually demand that the math close — that revenue catch up to capex, that gross margins stabilize, that the circular flows resolve into genuine, third-party demand — the AI cloud industry will discover, as fiber optics discovered, that the ground beneath its feet was never quite as solid as the headlines suggested.

The only real question is whether that discovery happens in 2026, 2027, or 2028. The walls themselves are already in view.

Research current as of May 2026. Figures cited from public earnings releases; analyst, research, and press reports (CreditSights, Morgan Stanley, Bank of America, Pivotal Research, Barclays, Mizuho, Citi, Bloomberg, CNBC, Financial Times, Wall Street Journal, Reuters, Sacra, SaaStr, Epoch AI, a16z, Forrester, Gartner); and primary statements from company executives. Michael Burry quote sourced from his post on X, November 2025. Daron Acemoglu (MIT Institute Professor, 2024 Nobel laureate in Economic Sciences) quote sourced from NPR interview with Bobby Allyn, "Here's why concerns about an AI bubble are bigger than ever," November 23, 2025.