There's a conversation happening in CFO offices right now that wasn't happening two years ago. It goes something like this: "Wait, we spent how much on AI last quarter?"
The sticker shock is real. And it's only going to get worse.
When organizations first adopted cloud AI tools, the per-token pricing model felt inconsequential. A fraction of a cent per query. Pocket change. But that math only works at pilot scale. The moment AI becomes genuinely embedded in your workflows — the moment it stops being a novelty and starts being infrastructure — the bill transforms into something that looks a lot like a second AWS invoice. Except unlike AWS, you get no Reserved Instances, no Savings Plans, and no hardware to show for it.
Cloud AI Has a Fundamental Pricing Problem
Every major cloud AI provider — OpenAI, Google, Anthropic, Microsoft — bills by the token. A token is roughly three-quarters of a word. Every word your team reads, writes, or generates through AI is being counted, metered, and charged. The more valuable AI becomes to your organization, the more you pay. Growth in AI adoption translates directly into growth in your bill.
This is not accidental. It is the business model. Cloud vendors have every incentive to maximize your token consumption — more features, more integrations, more nudges to use AI for more tasks — because each of those generates more revenue for them. Your success is their upside.
The result is a cost structure that is fundamentally unpredictable. A team that starts experimenting with AI-assisted document review might generate ten times more tokens six months later once they've embedded it into their daily workflow. The finance team cannot budget for it accurately, because nobody knows how much the team will use it until they do. The vendor's incentive is for you to use more. Your incentive is to control costs. Those two things are in direct tension.
"Cloud AI's business model is built on metering your consumption. The more valuable AI becomes to your organization, the more you pay. Your success is their upside."
Gridlight Works the Other Way Around
Gridlight does not meter tokens. It meters performance capacity, measured in TFLOPs (trillions of floating-point operations per second), the compute power of the hardware running the models. You set your allocation once, based on the hardware you have or plan to deploy, and that number determines your annual cost. Full stop.
Within that allocation, tokens are unlimited. Your team can run one query a day or ten thousand. They can generate short answers or long reports. They can experiment freely, build new workflows, onboard new employees to AI-assisted tools — none of it changes what you pay. The cost is fixed the moment you know your hardware.
This matters in two ways that compound each other:
Predictability: Your hardware specs are known quantities. A server with a given GPU delivers a known number of TFLOPs. That translates directly to a fixed annual cost that your finance team can put in a budget and never revisit. There are no overages, no surprise invoices, no line items that grew because a department started using AI more than expected.
Unlimited consumption: Because cost doesn't scale with usage, there is no internal pressure to ration AI access. Organizations on cloud AI quietly self-limit — teams hesitate to run large queries, managers wonder whether the AI spend is justified, and the tool gets used less than it should. On Gridlight, that friction disappears entirely. The cost is already paid. Use it.
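As a rough sketch of the capacity-based model described above, the snippet below computes an annual cost from a TFLOPs allocation alone. The rate of $22.80 per TFLOP per year is not a published price; it is inferred from the tier figures quoted later in this article ($2,280 for 100 TFLOPs), and the function name is hypothetical.

```python
# Assumption: $22.80 per TFLOP per year, inferred from the article's tier
# prices (for example $2,280 / 100 TFLOPs). Stored in cents so the
# arithmetic stays exact.
CENTS_PER_TFLOP_YEAR = 2280


def capacity_annual_cost_usd(tflops: int, tokens_used: int = 0) -> float:
    """Annual cost under capacity-based pricing.

    tokens_used is accepted only to make the point explicit: it never
    enters the calculation.
    """
    return tflops * CENTS_PER_TFLOP_YEAR / 100


if __name__ == "__main__":
    # Same allocation, wildly different usage, same cost.
    for tokens in (1_000, 1_000_000_000):
        print(tokens, capacity_annual_cost_usd(100, tokens_used=tokens))
```

The only input that matters is the allocation, so the figure a finance team budgets for does not change when usage does.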
"When cost doesn't scale with usage, there's no reason to ration AI access. The cost is already paid. Use it as much as you want."
A direct cost comparison between cloud AI and Gridlight is not perfectly apples-to-apples: on cloud AI, the only way to cap spending is to cap usage, while Gridlight's cost is capped regardless of usage. The comparison is still worth making, because it shows what organizations are paying for a constrained, metered experience versus what Gridlight charges for an unlimited one:
| Tier | Cloud AI / Year (metered tokens, variable) *1-2 | Gridlight / Year (unlimited tokens, fixed) | Annual Savings |
|---|---|---|---|
| Team (100 TFLOPs) | ~$9,000 (scales with every query) | $2,280 (tokens: unlimited) | ~$6,700/yr (75% less) |
| Mid-size (500 TFLOPs) | ~$36,000 (scales with every query) | $11,400 (tokens: unlimited) | ~$24,600/yr (68% less) |
| Enterprise (2,000 TFLOPs) | ~$90,000 (scales with every query) | $45,600 (tokens: unlimited) | ~$44,400/yr (49% less) |
* Assumptions behind the cloud figures:
1. 3M tokens per employee per month. This comes from general industry estimates for "active" AI usage and can vary widely by use case: a developer using AI for code generation burns far more, while an executive using it for email summaries burns far less.
2. $5 per 1M tokens, blended. This assumes a 70/30 input/output split at GPT-4o enterprise pricing (~$2.50 input / $10 output per 1M tokens), which blends to $4.75, rounded here to $5. OpenAI, Google, and Anthropic all charge different rates, and enterprise contracts often include negotiated discounts.
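The arithmetic behind these two assumptions can be sketched in a few lines of Python. The function names are illustrative. The headcounts the script prints are only what the table's cloud figures imply under these assumptions; the article does not state them.

```python
# Figures taken from the assumptions above.
TOKENS_PER_EMPLOYEE_PER_MONTH = 3_000_000
INPUT_USD_PER_M = 2.50    # approximate GPT-4o input price per 1M tokens
OUTPUT_USD_PER_M = 10.00  # approximate GPT-4o output price per 1M tokens
INPUT_SHARE = 0.70        # 70/30 input/output split
ROUNDED_USD_PER_M = 5.00  # blended rate after rounding up


def blended_rate(input_rate: float, output_rate: float, input_share: float) -> float:
    """Weighted average price per 1M tokens for a given input/output mix."""
    return input_share * input_rate + (1 - input_share) * output_rate


def annual_cost_per_employee(tokens_per_month: int, usd_per_million: float) -> float:
    """Yearly metered spend for one employee at a flat blended rate."""
    return tokens_per_month * 12 / 1_000_000 * usd_per_million


def implied_headcount(annual_spend: float, per_employee: float) -> float:
    """Headcount implied by a total spend (a derived number, not a source figure)."""
    return annual_spend / per_employee


if __name__ == "__main__":
    rate = blended_rate(INPUT_USD_PER_M, OUTPUT_USD_PER_M, INPUT_SHARE)
    per_employee = annual_cost_per_employee(
        TOKENS_PER_EMPLOYEE_PER_MONTH, ROUNDED_USD_PER_M
    )
    print(f"blended rate before rounding: ${rate:.2f} per 1M tokens")
    print(f"annual cost per employee: ${per_employee:,.0f}")
    for tier, spend in (("Team", 9_000), ("Mid-size", 36_000), ("Enterprise", 90_000)):
        print(f"{tier}: ~{implied_headcount(spend, per_employee):.0f} employees implied")
```

Changing either assumption moves the cloud column proportionally: doubling the tokens per employee doubles the estimate.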
The cloud figures above reflect estimated annual spend at representative usage levels, and they are meant to be conservative. They assume moderate, not heavy, AI adoption. They do not account for price increases, past or future. And critically, they represent a floor, not a ceiling: the number moves upward every time your team uses AI more.
The Gridlight figures are ceilings that don't move. Ever.
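For readers who want to check the savings column, here is a minimal sketch that recomputes it from the two price columns in the table. The cloud inputs are the article's estimates at a fixed usage level, so the results inherit every assumption behind them.

```python
# (cloud estimate, Gridlight price) per tier, in USD per year, as quoted in the table.
TIERS = {
    "Team": (9_000, 2_280),
    "Mid-size": (36_000, 11_400),
    "Enterprise": (90_000, 45_600),
}


def savings(cloud_usd: int, gridlight_usd: int) -> tuple[int, float]:
    """Return (absolute savings in USD, savings as a percentage of the cloud figure)."""
    diff = cloud_usd - gridlight_usd
    return diff, 100 * diff / cloud_usd


if __name__ == "__main__":
    for tier, (cloud, gridlight) in TIERS.items():
        diff, pct = savings(cloud, gridlight)
        print(f"{tier}: ${diff:,}/yr ({pct:.0f}% less)")
```

The Team tier works out to $6,720, which the table rounds to roughly $6,700.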
The Cost Is Only Part of the Problem
There is a subtler problem with per-token cloud AI that goes beyond the invoice. The vendor knows your usage patterns better than you do. They can see which organizations are most dependent on their platform, which features drive the deepest lock-in, and exactly where price increases will be absorbed rather than trigger churn. Per-token pricing with a captive enterprise customer base is one of the most favorable pricing structures in software. You are on the wrong side of that equation.
When a provider deprecates a model — which they do — your workflows break until you migrate. When they change their data retention policies — which they have — your legal team scrambles. When their servers go down — which they do — your AI-dependent processes stop. You have no leverage. You are a tenant, not an owner.
The Trend Is Not Your Friend
AI model pricing has not followed the trajectory that general cloud compute pricing did. GPU scarcity, enormous training costs, and investor return expectations are keeping prices elevated even as the underlying technology matures. The organizations that locked into cloud AI as their primary infrastructure are discovering that they have built a dependency with no natural exit ramp.
The organizations that deploy on-premises AI today are buying a different future: a fixed, predictable cost, unlimited token usage across their entire team, data that never leaves their building, and a relationship with AI infrastructure that looks like ownership rather than a subscription they can never cancel.
"The question isn't whether to make the switch. It's whether you wait until the next invoice to start thinking about it."
Gridlight: one fixed cost, unlimited tokens, hardware you own. No per-token billing. No surprise invoices. No data leaving your building.
