Who Is Really Winning the AI Leadership Race?
Why the structural economics matter more than the next model benchmark and how enterprise leaders should choose models in 2026.
I am regularly asked by clients which model to deploy, when, and how to keep up with the cadence of frontier-lab announcements, each one claiming to beat the last on whichever benchmark suits the press release. The question underneath is always the same: who is actually winning the AI race, and does it matter for the model decision they are about to sign off on?
Since DeepSeek-R1 broke the cost frontier in January 2025, the competition has tightened. Public accusations have followed: industrial-scale data extraction, model distillation, jailbreak campaigns at proxy-account scale.
Then, on 23 April 2026, Michael Kratsios, Director of the White House Office of Science and Technology Policy, issued a memorandum accusing foreign entities, “principally based in China”, of running industrial-scale distillation campaigns against US frontier AI models. The document is unusually direct in its accusations — and silent on the economics behind them.
So I dug in.
As of today, US AI leadership is treated, in policy and in capital markets, as settled. It is not.
US AI Economics
Three subsidies hold the US leadership position in place:
The inference subsidy.
Anthropic reached $30 billion in annualised revenue by March 2026 against approximately $64 billion raised, with a projected $14 billion loss in 2026 and positive free cash flow not expected before 2027. OpenAI hit $25 billion in annualised revenue in February 2026, with cumulative losses of $44 billion projected through 2028 per internal financial documents reported by The Information. Microsoft is moving GitHub Copilot to token-based billing because its weekly cost of running the product has nearly doubled since January 2026. Anthropic itself shifted enterprise customers to per-token billing in April 2026. Uber blew through its full-year AI budget within months.
Customers are not paying the real cost of using these models. Investors are.
The demand subsidy.
NVIDIA has committed up to $100 billion to OpenAI. OpenAI has committed $300 billion to Oracle. Oracle is committing $40 billion to NVIDIA chips. Goldman Sachs analysis indicates 75% of OpenAI’s operating costs are covered by external funding, against 17% from revenue. NVIDIA claims $1 trillion in sales visibility through 2027 while only $285 billion of GPU-bearing data centre capacity is under construction. The chips have buyers; the buildings to put them in do not yet exist.
The same dollars are circulating between three companies and counted as revenue at each stop. Real end-customer demand is a fraction of the headline number.
The talent subsidy.
Within US AI institutions, 38% of top-tier researchers are of Chinese origin against 37% American, per the MacroPolo Global AI Talent Tracker. Six of the seventeen named contributors to GPT-4o trained at Tsinghua, Peking, Shanghai Jiao Tong or USTC. China now produces 47% of the world’s top-tier AI researchers, up from 29% in 2019.
The brains behind US AI leadership are largely Chinese-trained. Washington is now restricting, through visa policy, the pipeline that supplies them.
Then comes the question of adoption by end-users, especially enterprise users.
Enterprise Adoption
In October 2025, Brian Chesky confirmed Airbnb relies “heavily” on Alibaba’s Qwen for its customer service agent, noting OpenAI models are “more rarely used in production because there are faster and cheaper models.” In March 2026, Cursor — valued at $29.3 billion — disclosed that its Composer 2 coding model is built on Moonshot’s Kimi K2.5. Hugging Face CEO Clément Delangue’s response: Chinese open source is now “the most significant force shaping the global AI tech stack.” Alibaba alone counts more than 170,000 derivative models built on Qwen.
By going open source, Chinese frontier labs have won a strong enterprise customer base, because enterprise customers (1) dislike vendor lock-in, (2) need data sovereignty and local compute for security and compliance reasons, and (3) need cost optimisation and predictability, especially at scale.
Closed US frontier models compete on capability; Chinese open-weight models compete on cost, latency, and the right to fine-tune. For the layer of the stack that actually generates revenue — the “applied layer”, where Cursor sits at $6 billion ARR and Airbnb runs millions of customer interactions — customers have chosen.
US enterprises are running production AI on Chinese open-weight models, not on US closed APIs.
Any policy response that hardens against Chinese open weights would force US enterprises back onto closed US APIs whose pricing and reliability are already the subject of enterprise complaint. It would deepen exactly the dependency it sets out to reduce.
Inside The Chinese Stack
The Capital Intensity
Today the US AI frontier is built on capital intensity. The Chinese AI stack runs on the opposite premise: efficiency under constraint. But enterprises don’t always need the best, most powerful model. They need the model that works for them and is economically viable.
DeepSeek-V4, released on 24 April 2026 — one day after the Kratsios memorandum — ships in two variants: V4-Pro at 1.6 trillion parameters (49 billion active) and V4-Flash at 284 billion (13 billion active), both with a one-million-token context window. V4-Pro was trained on 33 trillion tokens using FP4 quantisation for routed expert weights and FP8 for the rest. The architectural innovation is a hybrid attention system — Compressed Sparse Attention combined with Heavily Compressed Attention — that reduces single-token inference FLOPs to 27% of the previous generation at one million tokens, and KV cache to 10%.
The model is released under MIT License, scores 80.6% on SWE-Bench Verified (within 0.2 points of Claude Opus 4.6) and reaches a Codeforces rating of 3,206. Reuters confirmed on 4 April 2026 that the training run used Huawei Ascend 950PR chips, not NVIDIA hardware. NVIDIA CEO Jensen Huang called the outcome “horrible for the United States.”
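Why sparse activation changes the unit economics: per-token forward-pass compute scales with active parameters, not total parameters (roughly 2 FLOPs per active parameter per token is the standard approximation). A back-of-envelope sketch using the published V4-Pro figures:

```python
# Back-of-envelope: forward-pass FLOPs per token scale with ACTIVE
# parameters (~2 FLOPs per active parameter), not total parameters.
def flops_per_token(active_params: float) -> float:
    return 2 * active_params

v4_pro_total = 1.6e12   # 1.6 trillion total parameters
v4_pro_active = 49e9    # 49 billion active per token

# A hypothetical dense model of the same total size, for comparison.
dense_equiv = flops_per_token(v4_pro_total)
sparse = flops_per_token(v4_pro_active)

print(f"Active fraction: {v4_pro_active / v4_pro_total:.1%}")          # ~3.1%
print(f"FLOPs vs dense model of same size: {sparse / dense_equiv:.1%}")  # ~3.1%
```

In other words, the model carries frontier-scale capacity while paying inference compute on roughly 3% of its weights per token — before the attention-level savings the architecture adds on top.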
This pattern is not specific to DeepSeek. Qwen, Kimi, MiniMax, and Ernie are all open-weight or open-source, all distributed without seat-based pricing, all designed to run on a wider range of hardware than the closed US models. Alibaba Cloud, Huawei Cloud, and Tencent Cloud provide the inference infrastructure domestically. The Chinese stack does not carry an inference subsidy because it does not have an inference cost crisis. Better still, the success of these open models does not depend on the labs' ability to build out massive inference infrastructure and energy capacity, because the clients supply their own.
With energy costs rising under combined pressure from Middle East instability and accelerating data centre build-out, the economics of self-hosted, compute-optimised models look increasingly favourable for enterprise workloads at scale.
The funding structure is also different.
Chinese AI labs operate inside a state-coordinated capital framework — long-horizon, low-cost, with no requirement to demonstrate quarterly revenue growth to a venture syndicate. They are not attempting to monetise tokens at a price the user will pay. They are attempting to maximise adoption, derivative work, and standard-setting reach. Alibaba’s 170,000 Qwen derivatives are not a side effect but part of the strategy.
The US frontier sells access to closed models at prices subsidised by venture capital. China distributes open-weight models at zero marginal cost, funded by long-horizon state capital. These are two different businesses competing for the same enterprise share of wallet.
The talent layer compounds this asymmetry.
What changed is the supply side. The Chinese AI workforce is now sustained domestically: 51% of top Chinese AI undergraduates pursue graduate studies in China, and 31% remain in China for work after graduation, per the MacroPolo Global AI Talent Tracker. Tsinghua and Peking are now ranked third and sixth globally for AI research output. Six Chinese institutions sit in the global top 25, against two in 2019. Zizheng Pan interned at NVIDIA and chose DeepSeek. Yao Shunyu, a Tsinghua physics special-prize winner, joined Anthropic in the US — but the direction of flow is no longer one-way.
The two stacks are converging on capability and diverging on economics. The US frontier needs subsidies to operate. The Chinese frontier does not.
The dominant US framing treats Chinese model performance as the product of extraction and intellectual-property theft. The technical record is more nuanced. DeepSeek-V4’s architectural disclosures — Compressed Sparse Attention, Heavily Compressed Attention, manifold-constrained hyper-connections, FP4-quantised expert weights — are documented research contributions, published with reproducible methodology. DeepSeek’s earlier work on Multi-Head Latent Attention and DualPipe parallelism, set out in the V2 and V3 technical reports, is now cited in mainstream efficiency research. Some capability transfer through distillation is plausible. But attributing the bulk of the Chinese capability position to distillation requires ignoring a research track that has made specific, documented contributions to inference efficiency and post-training.
The foundational architecture of modern LLMs is overwhelmingly US-origin. But the recent efficiency frontier (running these architectures at lower cost) is increasingly Chinese.
The Regulatory Layer
Beyond the economics, what will shape AI adoption for enterprises is the regulatory posture of the various jurisdictions companies operate in.
And it’s not simple: countries have adopted voluntary frameworks and guidance (Singapore, India, UAE, Australia, …), sectoral regulations (US, UK, …), binding law (EU, Vietnam, China, Russia, …), and many have legislation pending (Canada, Brazil, Turkey, South Africa, Niger, …).
Last month, I launched the beta version of Tekhora AI Radar, a geointelligence platform that tracks more than two thousand regulatory events and over seven hundred expert signals across seventy jurisdictions.
The data shows that three patterns now dominate: divergence on what an AI model is allowed to do, divergence on where its data may flow, and divergence on who can be held accountable when it fails.
A model compliant in California may not be compliant in Frankfurt, in Singapore, or in Riyadh. Enterprises operating across more than one of those markets are not choosing between vendors but between regulatory regimes.
The EU AI Act, in force since August 2025, classifies obligations by risk tier and applies extraterritorially to any provider whose output reaches an EU user. China’s Generative AI Measures require pre-deployment review and content alignment for any model offered to Chinese users. The US position leans on export controls and informal guidance rather than statutory frameworks. India’s DPDP Act and forthcoming AI rules emphasise data residency. Singapore’s Model AI Governance Framework sets sectoral expectations without prescriptive rules. None of these are converging.
The regulatory layer is fragmenting faster than the technology layer is consolidating. An enterprise running production AI in three regions should now be running three different compliance stacks. Which means the model decision is no longer just about economics.
What It Means For Enterprise Leaders
The decision facing every enterprise leader running production AI is not which side to take in a US-China contest. It is which model architecture clears the unit economics of the workload it is meant to serve.
That is, in a way, a procurement question, not a political one. Treating it as political is how organisations end up locked into the wrong stack for the right-sounding reason.
Three risks now sit on the same line item as model performance.
Risk 1: token cost normalisation. Microsoft’s move on GitHub Copilot, Anthropic’s shift of enterprise customers to per-token billing, and the Goldman Sachs report on inference spend approaching 10% of engineering headcount are signals that the subsidy era is closing.
What to do: model any workload built on a closed US API at two to four times the current per-token rate, with rate limits tightening. CFOs should require AI cost forecasts to be stress-tested at API rack rate, not at subscription-blended rate.
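The stress test itself is simple arithmetic. A minimal sketch — the token volume and blended rate below are hypothetical placeholders, not quotes from any vendor:

```python
# Stress-test a closed-API cost forecast at 2x-4x today's blended
# per-token rate, per the recommendation above. All figures hypothetical.
def stressed_annual_cost(tokens_per_month: float,
                         rate_per_1k_tokens: float,
                         multiplier: float) -> float:
    """Annual spend if the vendor reprices to multiplier x today's rate."""
    return tokens_per_month * 12 * (rate_per_1k_tokens / 1000) * multiplier

monthly_tokens = 5e9   # hypothetical: 5B tokens/month across workloads
blended_rate = 0.01    # hypothetical: $0.01 per 1k tokens, subscription-blended

for m in (1, 2, 4):
    print(f"{m}x rate: ${stressed_annual_cost(monthly_tokens, blended_rate, m):,.0f}/yr")
```

The point of the exercise is the delta between the 1x and 4x lines: if the workload's business case only clears at today's subsidised rate, it is a bet on the subsidy continuing.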
Risk 2: demand-loop fragility. A meaningful share of the AI infrastructure capacity now under construction is committed to two customers (OpenAI and Anthropic) under contracts whose financing depends on continued external capital flows. If one node in the NVIDIA-OpenAI-Oracle circuit cannot finance its commitment, the chain unwinds.
What to do: treat long-term commitments to closed US APIs as exposure to vendor solvency, not just vendor pricing. This belongs on the same risk register as cloud-region failure.
Risk 3: research-base concentration. US AI capability is staffed substantially by Chinese-trained researchers operating under tightening visa conditions. The base case is gradual erosion of the recruitment advantage that built the lead.
What to do: track talent retention disclosures from your model providers. Capability roadmaps reflect the people building them.
The model decision is now a four-variable problem: capability, real token cost, vendor solvency, and roadmap continuity.
The defensive response is the one I have argued in my publication before: map and manage your dependencies. Specifically:
Three classes of model in production for any critical workflow, with hot-swappable fallbacks and documented migration paths
A Model Bill of Materials tracking upstream dependencies — foundation model, hosting region, jurisdiction, contract terms
Vendor resilience audits with explicit clauses on API continuity and version sunset
None of this is exotic. It is the standard operational practice for any tier-zero supplier dependency, applied to AI.
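A Model Bill of Materials need not be heavyweight. One entry can be a structured record tracking exactly the fields listed above — field names and values here are illustrative, not a standard:

```python
from dataclasses import dataclass

@dataclass
class ModelBOMEntry:
    """One upstream dependency in a Model Bill of Materials."""
    workload: str
    foundation_model: str
    licence: str
    hosting_region: str
    jurisdiction: str
    contract_terms: str
    fallback_model: str  # the documented migration path

# Illustrative entry for a customer-support workload.
entry = ModelBOMEntry(
    workload="customer-support agent",
    foundation_model="open-weight model, self-hosted fine-tune",
    licence="MIT",
    hosting_region="eu-central",
    jurisdiction="EU (AI Act obligations apply)",
    contract_terms="per-token, 12-month committed spend",
    fallback_model="closed API, hot-swappable via routing layer",
)
print(entry.workload, "->", entry.fallback_model)
```

One record per critical workflow, reviewed alongside the vendor resilience audit, is enough to answer the question most organisations currently cannot: what breaks, and what replaces it, if this model disappears.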
For non-US enterprises, the calculus changes by sovereignty archetype rather than by allegiance:
European firms operating under regulatory sovereignty already have governance scaffolding in place; the gap is operational — running production workloads on Mistral, on Chinese open weights, or on hybrid architectures that do not depend on US API continuity
ASEAN enterprises sit in the open-yet-local archetype, where Singapore’s SEA-LION already runs partly on Qwen
Indian enterprises, working at scale on imported foundation models, face a sharper version of the same question
For all three, the closed US API is no longer the safe default it was assumed to be.
It is not just a question of which flag is on the data centre. It is a question of which combination of model, architecture, and jurisdiction holds together for the workload at hand.
How to Decide
US AI leadership is real at the frontier. OpenAI, Anthropic, and Google operate the most capable closed models on the market. The architectural lineage — Transformer, RLHF, scaling laws, chain-of-thought — is overwhelmingly American. None of this is in dispute.
What is in dispute is whether the lead, as currently structured, is sustainable — and more importantly, whether capability at the frontier is the correct variable on which to base an enterprise model decision in 2026.
A leadership position that requires venture capital to absorb its operating costs, circular financing to inflate its demand signal, and foreign-trained talent to staff its research is not a leadership position in the conventional sense. It operates on borrowed time across three dimensions. Defending that position against external extraction does not address the internal arithmetic that makes it fragile.
The Monday-morning question for an enterprise leader is not whether to bet on the US or on China. The question is which model, at which unit cost, on which architecture, hosted in which jurisdiction, can serve a specific workload reliably for the next eighteen months — and what the fallback is if it cannot.
Choose by economics, build for optionality, map your full dependency chain, and treat any vendor — closed or open, US or Chinese — as a dependency that requires a fallback.
Leadership in AI is increasingly not about owning the best model. It is about absorbing the shocks that come with deploying any of them at scale.
The frontier is American. The economics are not yet settled. Choose the model that fits the workload, not the flag.
Thanks for reading,
Damien
Damien Kopp is Founder and Managing Director of RebootUp, an advisory practice that helps enterprises compress AI time-to-value while retaining sovereignty. He is the founder of Tekhora, an AI Operating Intelligence platform that maps regulatory and geopolitical exposure across the AI stack. He publishes KoncentriK, and is Associate Faculty at Singapore Management University. Contact: damien.kopp@rebootup.com
Citations and sources
Primary news sources
Kratsios memorandum, 23 April 2026 — https://whitehouse.gov/wp-content/uploads/2026/04/NSTM-4.pdf
Le Grand Continent commentary on the Kratsios memo — Victor Storchan, La note qui annonce une nouvelle phase dans la guerre de l’IA, 24 April 2026. (Source for “principally based in China” framing and applied-layer evidence.)
Anthropic and OpenAI economics
Anthropic revenue and burn projections — Sacra, Anthropic revenue, valuation & funding. https://sacra.com/c/anthropic/
OpenAI revenue and cumulative loss projections — Sacra, OpenAI revenue, valuation & funding. https://sacra.com/c/openai/
Anthropic shift to per-token billing, April 2026 — CNBC, AI demand is inflated, and only Anthropic is being realistic. https://www.cnbc.com/2026/04/17/ai-tokens-anthropic-openai-nvidia.html
Microsoft GitHub Copilot moves to token-based billing; Uber AI budget exhaustion — Ed Zitron, The Hater’s Guide To The SaaSpocalypse (March 2026) and Four Horsemen of the AIpocalypse (April 2026). https://www.wheresyoured.at/hatersguide-saas/
Goldman Sachs: 75% external funding for OpenAI 2026 needs — Goldman Sachs estimate (October 2025), via Wallstreetcn. https://longbridge.com/en/news/260251169
Goldman Sachs: inference spend approaching 10% of engineering headcount — The Stack, Inference budgets are breaking the bank. https://www.thestack.technology/inference-budgets-are-breaking-the-bank-what-now/
Circular financing
NVIDIA-OpenAI-Oracle deal structure — Bloomberg, AI Circular Deals: How Microsoft, OpenAI and Nvidia Keep Paying Each Other. https://www.bloomberg.com/graphics/2026-ai-circular-deals/
Sightline Climate / data centre capacity under construction — Ed Zitron analysis, Four Horsemen of the AIpocalypse. https://www.wheresyoured.at/hatersguide-saas/
Talent and research base
MacroPolo Global AI Talent Tracker 3.0 — Paulson Institute. https://archivemacropolo.org/interactive/digital-projects/the-global-ai-talent-tracker/
Paulson Institute press release on Tracker findings — March 2024. https://www.paulsoninstitute.org/press_release/study-finds-us-remains-a-magnet-for-worlds-best-and-brightest-ai-talent-but-more-global-talent-are-staying-home-instead-of-going-abroad/
GPT-4o team composition (6 of 17 trained at Chinese universities) — 36Kr, Half of the World’s AI Talents Are Chinese. https://eu.36kr.com/en/p/3340533396093446
Applied-layer evidence
Airbnb / Brian Chesky on Qwen reliance — Fortune, Chesky says OpenAI tools not ready for ChatGPT tie-up with Airbnb app, 21 October 2025. https://fortune.com/2025/10/21/brian-chesky-openai-tools-not-ready/
Cursor Composer 2 built on Kimi K2.5 — TechCrunch, Cursor admits its new coding model was built on top of Moonshot AI’s Kimi, 22 March 2026. https://techcrunch.com/2026/03/22/cursor-admits-its-new-coding-model-was-built-on-top-of-moonshot-ais-kimi/
Clément Delangue: Chinese open source “the most significant force shaping the global AI tech stack” — KuCoin News, 20 March 2026. https://www.kucoin.com/news/flash/hugging-face-ceo-chinese-open-source-is-the-largest-force-shaping-global-ai-tech-stack
170,000+ Qwen-derivative models — The Wire China, Cheap and Open Source, Chinese AI Models Are Taking Off, November 2025. https://www.thewirechina.com/2025/11/09/cheap-and-open-source-chinese-ai-models-are-taking-off/
Chinese counter-architecture
DeepSeek-V4 release and technical specifications — DeepSeek-AI, DeepSeek-V4-Pro model card. https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro
DeepSeek-V4 architecture analysis — Hugging Face Blog, DeepSeek-V4: a million-token context that agents can actually use. https://huggingface.co/blog/deepseekv4
DeepSeek-V4 trained on Huawei Ascend 950PR chips (Reuters confirmation) — Build Fast With AI, DeepSeek V4-Pro Review. https://www.buildfastwithai.com/blogs/deepseek-v4-pro-review-2026
Regulatory framework
EU AI Act — Official EU AI Act portal. https://artificialintelligenceact.eu/
Tekhora AI Radar — regulatory monitoring across 70+ jurisdictions. https://radar.tekhora.com
Digital Resilience Index — eight-pillar resilience framework, aDRI Foundation. https://thedigitalresilience.org/
KoncentriK cross-references
Damien Kopp, The Dependency Economy of AI — what 25 national AI strategies reveal about sovereignty. https://www.koncentrik.co/p/the-dependency-economy-of-ai
Damien Kopp, Understanding Extraterritorial Overreach — on CLOUD Act and jurisdictional risk in AI deployments. https://www.koncentrik.co/p/understanding-extraterritorial-overreach
Advisory and platform
RebootUp AI Sovereignty and Dependency Audit. https://rebootup.com/digital-sovereignty-audit.html
RebootUp — enterprise AI strategy and transformation advisory. https://www.rebootup.com
Tekhora — Technology Geointelligence platform. https://www.tekhora.com