The Broken Economics of AI

Why the current economics of artificial intelligence cannot hold and what enterprise leaders should do before the real price kicks in.

May 20, 2026

In my discussions with leaders across Asia, the Middle East and Europe, I hear many versions of the same question: where can I, and where should I, use AI across my business?

Most leaders talk about “use cases” for AI. Few conversations get into the business case and true expected ROI. Worse, there is no deliberate alignment with the overall business strategy and its success metrics, including people’s incentives.

That often results in fragmented initiatives across the business that fail to deliver meaningful impact at organisation scale.

The article below goes deep on AI economics. It is a primer on the mechanics behind AI pricing today, what will change, and why.

The implications for enterprise leaders are crucial. As AI models are wired into automated workflows, business processes are re-engineered around AI agents, and decision-making is augmented with intelligence, those models become a new dependency risk. And tokens become a cost line on par with payroll and IT spend.

What makes up a token price today?

Why will it change, and by how much?

How should enterprise leaders factor this in now?

The most expensive technology cycle in modern history is being priced as if it were free.

Every credible reading of the numbers (inference costs, infrastructure capex, energy build-out, regulatory exposure) points to the same conclusion: today’s AI economics are subsidised, distorted, and structurally unsustainable. Enterprises that have built their AI strategy on the assumption that current pricing reflects current cost are building on sand.

This is not a prediction. It’s already visible in the market.

OpenAI spent $8.67 billion on inference in the first nine months of 2025, roughly double its revenue for the same period. Anthropic, on internal estimates reported by The Information, posted gross margins of -94% to -109% on inference in 2024. GitHub Copilot reportedly lost more than $20 per user per month at a $10/month price point. Sam Altman has publicly acknowledged that OpenAI loses money even on its $200/month ChatGPT Pro tier.

These are the financials of a market in a capital-funded land grab for market share. The precedent is hard to ignore: ride-hailing platforms have done this before. But more on that later.

In December 2025, Anthropic CEO Dario Amodei told the DealBook Summit: “There are some players who are YOLO, and I’m very concerned.” In April 2026, GitHub announced that all Copilot plans would transition to usage-based billing on June 1, 2026, citing the simple fact that the existing request-capped model “is no longer sustainable.” Cursor moved to credit pools in mid-2025 and had to issue refunds. Anthropic and OpenAI followed.

The era of the all-you-can-eat AI subscription is closing. The question is not whether enterprises will pay more: they will.

The question is who will get caught by surprise and how it will impact their organisations and teams.

This piece maps the four forces that make the current economics impossible to sustain, and translates them into concrete implications for enterprise AI strategy.

Force One: The Capex Base Is Built on Constrained Resources

The AI infrastructure build-out is being priced as if energy, minerals, and grid capacity were elastic. They are not.

I have written about this in detail with Xavier Greco at ENSSO in The AI-Energy Paradox. The headline numbers bear repeating: data centres are forecast to consume over 1,000 TWh of electricity by 2026, roughly double their 2022 level. Gartner projects that 40% of existing AI data centres will hit power capacity limits by 2027. Microsoft and OpenAI’s Stargate supercomputer alone is sized at 5 GW — the output of a large power plant.

That demand is now colliding with a tightening energy backdrop. Since the February 2026 outbreak of the US-Israeli war with Iran, Brent crude has spent most of Q2 above $100 per barrel, peaking near $120 amid disrupted flows through the Strait of Hormuz. The IEA has called the closure the “largest supply disruption in the history of the global oil market.” Whatever the eventual political resolution, energy is no longer cheap, and the data-centre fuel mix (natural gas peakers, diesel backup, expanded grid) is exposed.

The minerals story is worse, and more durable. China processes approximately 90% of the world’s rare earths and around 91% of the elements required for permanent magnets. After Beijing’s October 2025 export-control escalation, European rare earth prices reached up to six times those inside China. A twelve-month suspension expires on 10 November 2026, and Bloomberg Intelligence projects a 36% global NdPr shortfall by 2030 even in the optimistic scenario. The materials that go into GPUs, motors, and grid infrastructure run through a single jurisdiction.

Meanwhile, the dominant counterweight in clean generation (solar manufacturing, battery storage, transmission equipment) is also concentrated in China. The build-out is dependent on the same actor it is meant to compete against.

This is the dependency landscape I mapped across 25 national AI strategies in The Dependency Economy of AI. NVIDIA holds 94% of the discrete GPU market. TSMC produces over 90% of leading-edge chips. ASML controls nearly 100% of EUV lithography. Layered onto that: water consumption, energy, rare earths.

The point is this: the cost base for AI is exposed to oil prices, to one country’s mineral processing decisions, and to grid-capacity ceilings that did not exist when current AI pricing was set. Any model assuming a smooth glide path of falling unit costs is ignoring the inputs.

Force Two: The Pricing Model Is Wrong for the Product

The current dominant AI pricing model (flat-rate subscription with capped requests) was inherited from SaaS. It does not fit a token-metered, inference-bound product.

Five problems compound:

1. Usage is wildly variable. Two users on the same Copilot plan can produce a ten- or hundredfold difference in token consumption. There is no “average user” to price against in the way SaaS feature economics assumed.

2. Models burn more tokens, faster than prices fall. Reasoning models, agentic loops, and longer context windows have systematically increased token-per-task counts. A multi-agent flow doing twelve tool-augmented turns at ~80k tokens per turn easily crosses 1–2 million tokens per task, as I detailed in Agentic AI: Busting the Myth of Autonomous Intelligence. Unit token costs are falling, but task-level token consumption is rising faster.

3. Users do not control consumption. Token spend happens inside the model, invisible to the user, often invisible to the buyer. This is the inverse of every other utility relationship in enterprise software, where the buyer can at least see the meter.

4. More tokens does not mean better output. This is the part the industry does not talk much about. A May 2025 paper from Microsoft Research and Salesforce, LLMs Get Lost in Multi-Turn Conversation, tested every major closed- and open-weight model across six tasks. The finding: performance drops by an average of 39% in multi-turn conversations versus single-turn prompts. The authors are precise about the mechanism: “when LLMs take a wrong turn in a conversation, they get lost and do not recover.” The implication is brutal: for the same task, longer conversations cost more and deliver worse results. A token-metered model in which performance degrades with consumption is a pricing design mistake customers will notice immediately, and resist.

5. The subsidies are enormous. Industry analysts estimate API pricing may need to rise 3–10x to reach sustainable economics. Some estimates put OpenAI’s loss at roughly $1.35 for every dollar earned. Whatever the exact number, the subsidy is not marketing spend; it is structural cost being absorbed by venture capital, hyperscaler balance sheets, and circular financing arrangements between NVIDIA, OpenAI, and Oracle that I examined in Who Is Really Winning the AI Leadership Race?.
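The arithmetic behind point 2 can be sketched in a few lines. This is a minimal illustration, not a pricing model: the turn and token counts for the agentic task come from the example above, while the single-prompt task size and both per-million-token prices are hypothetical placeholders chosen only to show the direction of the effect.

```python
# Illustrative only: all prices and the single-prompt task size are
# hypothetical placeholders, not vendor quotes.

def tokens_per_task(turns: int, tokens_per_turn: int) -> int:
    """Total tokens processed by one task that makes `turns` model calls."""
    return turns * tokens_per_turn


def cost_per_task(tokens: int, price_per_million: float) -> float:
    """Cost of one task at a blended price per million tokens."""
    return tokens / 1_000_000 * price_per_million


# A single-prompt task at an assumed older, higher unit price.
simple_tokens = tokens_per_task(turns=1, tokens_per_turn=4_000)
simple_cost = cost_per_task(simple_tokens, price_per_million=10.0)

# The agentic task from point 2: twelve turns at ~80k tokens per turn,
# at an assumed unit price five times lower.
agentic_tokens = tokens_per_task(turns=12, tokens_per_turn=80_000)
agentic_cost = cost_per_task(agentic_tokens, price_per_million=2.0)

print(f"simple task:  {simple_tokens:>9,} tokens, ${simple_cost:.2f}")
print(f"agentic task: {agentic_tokens:>9,} tokens, ${agentic_cost:.2f}")
```

Under these assumptions the unit price falls fivefold, yet the cost per task rises from $0.04 to $1.92. Twelve turns at 80k tokens is already about 0.96 million tokens before any retries or sub-agent calls, which is how real flows cross the one-million mark.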

The market is now self-correcting. GitHub’s June 2026 transition to usage-based billing is the most visible signal. Every paid plan will include a fixed monthly AI Credit allotment matched to its price; overages will be charged at published API rates. The blog post is unusually direct: “GitHub has absorbed much of the escalating inference cost behind that usage, but the current premium request model is no longer sustainable.”

IDC has forecast that Global 1,000 companies will underestimate their AI infrastructure costs by 30% through 2027. Token-metered billing will widen that gap, not close it.

We Have Seen This Movie Before: Ride-Hailing

It is worth noting that the ride-hailing industry ran the same playbook a decade ago.

Uber spent its first decade losing money at scale to build market share, accumulating more than $33 billion in losses before reporting its first annual profit in February 2024. Below-cost rides were a venture-financed customer acquisition subsidy designed to kill competition. By the time it raised prices, there was no one left to undercut it.

Then the subsidies ended. Between 2018 and 2021, average Uber prices rose 92%. Uber’s take rate climbed from 20% to roughly 35–40% by 2023. In February 2024, Uber announced its first-ever annual profit and a $7 billion stock buyback.

The Southeast Asia version is starker. After Uber sold its SEA operations to Grab in 2018 for a 27.5% equity stake, Grab consolidated to roughly 70% of the regional mobility market. Fare complaints started immediately. INSEAD’s post-deal analysis named SoftBank (significant stakes in both Uber and Grab) as the real winner. The same pattern played out in China when Didi absorbed Uber in 2016: fares rose roughly 20%.

The parallel to AI is obvious. ChatGPT Pro at $200/month and Copilot at $10/month are the equivalent of a $5 Uber ride from your office to the airport in 2014: priced below cost, designed to lock users in before competitors establish themselves. GitHub’s June 2026 move to usage-based billing is the AI equivalent of Uber’s post-IPO fare increases. It will not be the last.

Note on pricing structure: Uber and Grab do not charge a flat fee. They charge a demand-adjusted price per kilometre, which links price to cost. Even the precedent industry that ran the same playbook never tried to price a variable-cost service at a flat rate. Flat-rate AI subscriptions are a structural mistake the ride-hailing platforms knew to avoid from day one. AI giants are now catching up.

Two patterns are worth importing into AI strategy. Post-consolidation pricing power is not bounded by cost: it is bounded by what locked-in customers will tolerate before switching. Uber’s 92% price increase in three years was not driven by 92% input cost inflation. It was driven by reduced competition. And the price hike will not arrive as a single announcement. It will arrive as usage-based billing replacing flat rates, tier restructuring, deprecation of cheaper models, new “premium” model gates, surcharges on long-context use, and tighter rate limits on the lower tiers.

Force Three: Two AI Strategies, Two Different Economics

The two AI giants are playing different games, and the difference is worth understanding. As I argued in Who Is Really Winning the AI Leadership Race?, the US and Chinese AI ecosystems are not running the same race.

The US model is capital-intensive, performance-maximalist, closed-weight, and locked behind paid APIs. Its theory of victory is frontier capability and platform lock-in. Its economics rely on three subsidies: cheap capital (now tightening), circular financing across NVIDIA, OpenAI, and Oracle (now under scrutiny), and below-cost inference pricing (now ending).

The Chinese model is efficiency-first, open-weight, federated, and oriented to adoption. DeepSeek and Qwen demonstrate 30–50% compute reduction through architectural innovation (FP8 training, DualPipe parallelism, Multi-Head Latent Attention) while maintaining performance. China has more than 100 approved LLMs. The strategic objective is not to win the benchmark race but to deploy at scale into manufacturing, logistics, and public infrastructure, where actual economic productivity gets measured.

Two implications follow:

1. Open-weight, efficient models will exert downward price pressure on US closed APIs. Enterprises with the engineering capability to run Qwen, DeepSeek, Mistral or Llama variants on their own infrastructure already have a hedge.

2. The competitive ceiling for US labs is set in part by what cheap, open, distributed alternatives can do “well enough.” When the gap narrows, premium API pricing has to justify itself on workflow integration and reliability, not raw benchmark performance.

This is also what makes the multi-turn collapse paper geopolitically interesting. If reliability degrades across all frontier models in real-world conversational use, the gap between “frontier” and “good enough” narrows further. The argument for paying frontier prices weakens.

Force Four: The Regulatory Bill Is Not Yet on the Invoice

Most current AI business cases do not price regulatory exposure, because the regulatory environment is still forming. That is precisely why it is dangerous.

The shape of what is coming is already visible. The EU AI Act began phased implementation in August 2025. China’s generative AI regulations require government review before deployment. The US has shifted to a deregulation-plus-export-controls posture under the Trump II administration. India, Brazil, the UAE, Singapore, Japan, and the UK each have distinct national AI frameworks, and they do not converge.

There is no single architecture blueprint. There is no single governance standard. There is no mutual recognition framework on AI conformity assessment. Enterprises operating across jurisdictions will face overlapping, sometimes contradictory obligations on model approval, data residency, transparency reporting, energy disclosure, and content provenance.

This fragmentation has a cost. Every jurisdiction-specific compliance regime requires architecture rework: regional model variants, segmented data flows, jurisdiction-aware routing, audit trails. The Microsoft–Nayara Energy episode of July 2025, where Microsoft unilaterally suspended Outlook and Teams access for an Indian refinery handling 8% of national capacity, citing its interpretation of EU sanctions, showed how a single jurisdictional decision at the vendor level can shut down operational continuity. That risk is now extending from cloud productivity into AI inference.

For monitoring the regulatory trajectory in real time, the TEKHORA platform tracks AI governance events and posture shifts across 70+ jurisdictions, a discipline that should sit alongside cyber and supply-chain risk monitoring at board level (full disclosure: Tekhora was created by my company).

The cost of regulatory fragmentation will not be absorbed by vendors. It will land on enterprises, in the form of architectural rework, compliance overhead, and slower deployment cycles. It will also reprice ROI on existing AI initiatives that were not designed with jurisdictional optionality in mind.

What This Means for Enterprise AI Strategy

The AI economics that underpin most current enterprise AI strategies will not hold for the duration of those strategies’ planning horizons.

Boards that have approved AI initiatives on the basis of today’s pricing should run the model again at 3x and 5x. And those numbers should be in the next quarterly review.
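As a rough sketch of what running the model again can look like, the snippet below recomputes ROI with the inference line multiplied by 1x, 3x and 5x. Every figure is a hypothetical placeholder, and the function name `stress_test` and its parameters are illustrative; a real business case would carry many more cost and benefit lines.

```python
def stress_test(annual_benefit: float,
                inference_cost: float,
                other_cost: float,
                multipliers: tuple = (1, 3, 5)) -> dict:
    """Return the ROI of an AI initiative at each inference price multiplier.

    ROI is computed as (benefit - total cost) / total cost, where only the
    inference line is scaled by the multiplier.
    """
    results = {}
    for m in multipliers:
        total_cost = inference_cost * m + other_cost
        results[m] = (annual_benefit - total_cost) / total_cost
    return results


# Hypothetical initiative: $2.0M annual benefit, $0.3M inference spend at
# today's prices, $0.5M of other run costs (people, integration, licences).
roi = stress_test(annual_benefit=2_000_000,
                  inference_cost=300_000,
                  other_cost=500_000)

for multiplier, value in roi.items():
    print(f"{multiplier}x inference price: ROI {value:.0%}")
```

With these placeholder inputs, ROI falls from 150% at today's prices to about 43% at 3x and to break-even at 5x. The useful output is not the numbers themselves but the multiplier at which a given initiative stops paying for itself.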

Repricing is not the only risk. Enterprises that have already moved too fast on AI deployment, particularly those that re-engineered operating models on the assumption that current AI capability would hold, have learnt it the hard way.

Klarna is the most visible example. Between 2022 and 2024, the Swedish fintech eliminated approximately 700 customer service positions and replaced them with an OpenAI-built assistant, with the CEO publicly declaring that “AI can already do all of the jobs that we, as humans, do.” By May 2025, the company reversed course, with CEO Sebastian Siemiatkowski admitting: “As cost unfortunately seems to have been a too predominant evaluation factor when organizing this, what you end up having is lower quality.” Customer satisfaction had dropped. Operational issues had accumulated. The company began rehiring, though notably as freelance gig workers, not full-time employees, repeating the ride-hailing labour pattern in a different industry.

Klarna is not an outlier. MIT Media Lab’s State of AI in Business 2025 report, based on 150 leadership interviews, 350 employee surveys, and 300 public AI deployments, found that 95% of enterprise AI pilots delivered zero measurable P&L impact despite $30–40 billion in enterprise investment since 2023. Only 5% reached production with measurable value. The gap, the MIT researchers found, comes down to organisational learning and workflow integration, not model quality. Gartner has separately predicted that by 2027, half of companies that cut customer service staff because of AI will need to rehire.

The lesson is not that AI deployment fails. The lesson is that hasty operating model redesign, particularly when it involves irreversible decisions like layoffs or process re-engineering, might run ahead of what current AI can actually deliver. The companies racing to declare themselves “agentic-first” or “AI-native” without working through their absorption capacity are setting themselves up to either reverse publicly (Klarna), absorb quality degradation (most), or rebuild under maximum competitive pressure once the economics reprice.

Seven concrete implications:

1. Build a model-agnostic architecture. Every critical workflow needs a hot-swappable fallback across at least three classes of models: frontier closed, open-weight large, and small/specialised. It is the only architecture that survives provider price changes, deprecation timelines, regulatory restrictions, or jurisdictional cutoffs. I detailed the full pattern, including Model Bills of Materials and Vendor Resilience Audits, in The Dependency Economy of AI.

2. Right-size models to tasks. A multi-billion-parameter frontier model is the wrong tool for most enterprise use cases, and the most expensive one. Finding the Right AI Model for the Job sets out the matching logic. Routing logic that pushes simple work to small, cheap, often open models and reserves frontier capacity for genuinely hard tasks typically cuts inference cost by an order of magnitude with no measurable quality loss.

3. Build internal capability. Reduce dependencies. Stop outsourcing your AI brain. This is the cognitive autonomy point. If your “intelligent” workflow runs entirely on a third-party model whose architecture, training data, governance logic, and version timeline you cannot see, you have lost your autonomy. Enterprises need internal engineers capable of model evaluation, fine-tuning, and infrastructure independence, plus a layer of external flex capacity on top. The objective is not full insourcing. It is the ability to switch when conditions change.

4. Protect against cognitive atrophy at the workforce level. A February 2025 study by Microsoft Research and Carnegie Mellon surveyed 319 knowledge workers and found that higher confidence in generative AI was associated with reduced critical thinking effort; the higher the trust in the tool, the less independent reasoning the user applied. The MIT Media Lab’s Your Brain on ChatGPT study used EEG to monitor 54 participants across essay-writing sessions: the LLM group showed the weakest neural connectivity, underperformed at neural, linguistic, and behavioural levels, and accumulated what the authors term “cognitive debt”: a measurable degradation of independent cognitive engagement that persisted even after the tool was removed. The pattern is visible in the field: software engineers are publicly reporting that they have forgotten how to write code they once knew fluently, with one telling 404 Media: “It’s making me dumber for sure. It’s like when we got cellphones and stopped remembering phone numbers, but it’s grown to me mentally outsourcing ‘thinking’ in general.” When teams outsource judgement, drafting, and architectural reasoning to AI, the underlying capabilities erode, and the literature now suggests they may not return easily. The institutions that come out of this cycle strongest will be the ones that used AI to augment thinking, not replace it. This is a leadership choice, not a technology choice.

5. Sequence operating model redesign carefully and reversibly. AI capability changes faster than organisational structures can be rebuilt. Operating model redesigns that depend on a specific level of AI capability (e.g. replacing entire customer service functions, restructuring sales operations around agentic workflows, eliminating layers of middle management on the assumption that AI will absorb the coordination work) assume that capability is stable. It is not. Klarna’s reversal is the publicly visible version of a much broader pattern. Pilot, measure, augment, then redesign, in that order. Avoid headcount decisions that cannot be undone within a quarter. Treat agentic and AI-first operating model claims as hypotheses to test against P&L.

6. Price regulatory exposure into every AI business case now. Add a regulatory contingency line to every AI investment thesis. Map jurisdictional exposure. Stress-test against the EU AI Act, China’s generative AI rules, US export controls, and at least one emerging-market framework relevant to your operations. Build modular architectures that can swap models across regions. The 10 November 2026 expiry of China’s rare-earth export-control suspension and the AI Act enforcement deadline for high-risk use cases are among the dates that should be on enterprise risk registers.

7. Plan for the re-pricing moment. API prices are unlikely to step up smoothly. They will move in discontinuous jumps tied to specific events: a vendor’s earnings call, a regulatory deadline, a geopolitical incident, an investor revolt. As with Uber and Grab, the increase will come as a sequence of changes rather than a single announcement. Enterprises that have done the work in advance will reprice their AI stack with continuity. Those that have not will be doing it under maximum competitive and operational pressure.
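Points 1 and 2 above, a hot-swappable fallback across model classes and routing by task difficulty, can be combined in one small sketch. This is an illustration under stated assumptions, not a reference architecture: the tier names, the difficulty tags, the `ModelOption` class, `run_task`, and the stub clients are all hypothetical, and a real deployment would wrap actual provider SDK calls behind the same single signature.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ModelOption:
    name: str                    # hypothetical label, not a real product
    tier: str                    # "small", "open_large" or "frontier"
    call: Callable[[str], str]   # any provider client wrapped behind one signature


# Preferred tier first, then fallbacks, for each (assumed) difficulty tag.
TIER_ORDER = {
    "simple": ["small", "open_large", "frontier"],
    "hard": ["frontier", "open_large", "small"],
}


def run_task(prompt: str, difficulty: str,
             options: list[ModelOption]) -> tuple[str, str]:
    """Try models tier by tier; fall through to the next one if a call fails."""
    errors = []
    for tier in TIER_ORDER[difficulty]:
        for option in options:
            if option.tier != tier:
                continue
            try:
                return option.name, option.call(prompt)
            except Exception as exc:  # outage, deprecation, regional block, quota
                errors.append(f"{option.name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


# Stub clients standing in for real provider calls.
def small_stub(prompt: str) -> str:
    return f"[small] {prompt[:20]}"


def open_large_stub(prompt: str) -> str:
    return f"[open-large] {prompt[:20]}"


def frontier_unavailable(prompt: str) -> str:
    raise ConnectionError("provider suspended access in this region")


options = [
    ModelOption("frontier-a", "frontier", frontier_unavailable),
    ModelOption("open-large-b", "open_large", open_large_stub),
    ModelOption("small-c", "small", small_stub),
]

print(run_task("Classify this support ticket", "simple", options)[0])
print(run_task("Draft a multi-step migration plan", "hard", options)[0])
```

In this run the simple task goes to the small model, and the hard task falls back to the open-weight large model because the frontier stub is unavailable. The point of the design is that a price change, a deprecation, or a jurisdictional cutoff becomes a configuration change rather than a rebuild.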

The Real Asymmetry

The companies that internalise this analysis now will look, in three years, like they made the boring choice: diversifying vendors, building internal capability, paying a bit more for resilience, slowing deployment to match governance maturity. That boring choice is the asymmetric one.

The competitors who took the subsidised AI economics at face value, outsourced cognitive capability to a single API, and ran AI strategies that assumed today’s prices and today’s regulatory vacuum would persist will be reworking their AI initiatives under the worst possible conditions. They will be doing it while paying more, while still trying to ship product, and while their regulators ask harder questions.

The mathematics of AI is not yet shown on the invoice... But the real bill will arrive!

Thanks for reading,

Damien Kopp is Founder & MD of RebootUp and publisher of KoncentriK. His work on AI sovereignty, technopolitics, and enterprise resilience is read by board members, C-suite executives, and policy audiences across Asia, Europe, and the Middle East. Companion reading: The Dependency Economy of AI, Who Really Controls Your AI?, The AI-Energy Paradox, Agentic AI: Busting the Myth of Autonomous Intelligence, and Who Is Really Winning the AI Leadership Race?.

KoncentriK | Tactics for Navigating Tech & Power
