The largest empirical study of AI usage ever conducted has just landed, and the findings challenge nearly everything UK businesses assume about AI adoption. OpenRouter’s analysis of 100 trillion tokens across billions of prompt-completion pairs reveals a market far more nuanced—and far more accessible—than vendor marketing suggests.

The real state of AI in 2025

OpenRouter’s December 2025 study represents an unprecedented window into how organisations actually use AI. Spanning November 2024 to November 2025, the research team analysed anonymised metadata from over 300 models and 60+ providers, serving millions of developers with more than half the usage originating outside the United States.

The headline finding will surprise many: open-source models now account for approximately one-third of total AI usage, up from negligible market share just eighteen months ago. Chinese open-source models alone surged from 1.2% in late 2024 to nearly 30% during peak weeks, averaging 13% annually.

Model CategoryMarket ShareKey Players
Proprietary Models70% averageClaude (Anthropic), GPT (OpenAI), Gemini (Google)
Chinese Open Source13% average (30% peak)DeepSeek (14.37T tokens), Qwen (5.59T tokens)
Rest-of-World Open Source13.7% averageMeta LLaMA (3.96T tokens), Mistral (2.92T tokens)

Strategic Reality: The AI market has fragmented faster than anticipated. UK businesses no longer face a binary choice between expensive proprietary solutions and unreliable alternatives—viable open-source options now handle enterprise workloads at scale.

What organisations actually do with AI

The study demolishes assumptions about AI being primarily a productivity tool. When researchers categorised usage patterns, two categories dominated:

Programming surged from 11% of usage in early 2025 to exceeding 50% in recent weeks. Anthropic’s Claude dominates this category with over 60% share, whilst Chinese open-source models now handle 39% of combined programming and technology workloads.

Roleplay accounts for 52% of all open-source token usage—spanning games, creative writing, interactive fiction, and adult content. This entertainment-oriented usage “counters assumptions about productivity-focused applications.”

Critical Context: Over half of open-source AI usage serves creative interactive dialogues rather than business automation. This matters because it reveals untapped capacity—the same models handling roleplay can power customer service, content generation, and training simulations.

Provider specialisation is striking

Each major provider has carved out distinct territory:

  • Anthropic Claude: 80%+ programming and technology, minimal roleplay
  • Google Gemini: Diverse composition spanning legal, science, technology, and general knowledge
  • OpenAI GPT: Shifted toward programming (29%) and technology (29%) by late 2025
  • DeepSeek: Dominated by roleplay and casual interaction (>60%)
  • Qwen: 40-60% programming consistently with volatile science and roleplay usage

Implementation Note: Provider specialisation means your model choice matters more than ever. A model optimised for programming will underperform in customer communication, and vice versa. Multi-model strategies are becoming essential.

The agentic shift transforms everything

Perhaps the most significant finding concerns how AI is being used, not just what for. Reasoning-optimised models now represent over 50% of all tokens processed—up from negligible usage in early Q1 2025.

The numbers tell the story:

MetricNovember 2023November 2025Change
Average prompt tokens~1,500>6,000Nearly 4x increase
Average completion tokens~150~400Nearly 3x increase
Average sequence length<2,000>5,400More than 3x increase

Programming workloads drive this growth, with requests routinely exceeding 20,000 input tokens. The share of tokens from requests with tool invocations shows a “consistent upward trend”—AI is no longer answering questions but executing multi-step tasks.

Strategic Insight: Single-turn question-answer interactions are becoming the minority use case. Organisations planning AI adoption need to design for multi-step, tool-integrated workflows from the start. Retrofitting existing chatbot deployments will prove costly.

Top reasoning models leading this shift include xAI’s Grok Code Fast 1, Google Gemini 2.5 Pro, and Gemini 2.5 Flash. Tool-calling capabilities initially concentrated among Claude Sonnet, Gemini Flash, and GPT-4o-mini, but have since broadened to include newer entrants.

Price matters less than you think

The study’s most counterintuitive finding challenges fundamental procurement assumptions. When researchers analysed cost versus usage across the entire market, they found “nearly flat” correlation between price and adoption.

The estimated elasticity: a 10% price decrease correlates to only 0.5-0.7% usage increase. In other words, demand is relatively price-inelastic for professional workloads.

Four distinct market archetypes emerged:

Premium Leaders (Anthropic Claude, ~$2/1M tokens): High cost sustained by high usage despite premium pricing. These models command loyalty through capability, not affordability.

Efficient Giants (Google Gemini Flash $0.147, DeepSeek V3 $0.394): Low price paired with massive adoption—the volume play that dominates token counts.

Premium Specialists (OpenAI GPT-4 $34/1M, GPT-5 Pro $35/1M): High cost, low usage, serving niche high-stakes workloads where accuracy justifies expense.

Long Tail (Qwen 7B $0.052, IBM Granite $0.036): Rock-bottom pricing with limited adoption despite affordability—proving cheap isn’t sufficient.

SME Advantage: Price-inelastic demand means UK businesses shouldn’t obsess over finding the cheapest model. The efficiency gains from a well-suited model far outweigh marginal cost differences. Invest in capability matching, not price shopping.

The glass slipper effect: why switching is harder than expected

OpenRouter’s researchers identified a phenomenon they call the “glass slipper effect”—early user cohorts achieving persistent workload-model fit that resists substitution despite newer alternatives.

Claude 4 Sonnet (June 2025 cohort) and Gemini 2.5 Pro (May 2025 cohort) both achieved approximately 40% user retention at Month 5. These retention rates correlate with technical breakthroughs enabling “previously impossible workloads.”

Meanwhile, GPT-4o Mini shows a single dominant foundational cohort from July 2024, with all subsequent cohorts exhibiting identical poor performance. Models like Gemini 2.0 Flash and Llama 4 Maverick failed to establish any stable foundational cohort.

Hidden Cost: Switching AI providers isn’t just about rewriting prompts. Users embed workflows around specific model capabilities. Early adoption of the right model creates compounding advantages; early adoption of the wrong model creates compounding technical debt.

DeepSeek models exhibit a curious “boomerang effect”—retention resurrects after initial churn, suggesting users return after testing alternatives and finding them wanting. This pattern validates the importance of genuine capability differentiation over marketing claims.

Geographic expansion reshapes the market

The study reveals AI adoption becoming “increasingly global and decentralised.” Regional token distribution shifted dramatically:

RegionToken ShareTrend
North America47.22%Stable
Asia28.61%Doubled from 13%
Europe21.32%Stable
Rest of World2.85%Combined

English dominates at 82.87% of tokens, but Chinese (Simplified) now accounts for 4.95%, followed by Russian (2.47%), Spanish (1.43%), and Thai (1.03%).

The top five countries by token volume are the United States (47.17%), Singapore (9.21%), Germany (7.51%), China (6.01%), and South Korea (2.88%).

Competitive Reality: Asian markets are adopting AI faster than European markets. UK businesses competing globally need awareness that Asian competitors may have more aggressive AI integration strategies, supported by rapidly maturing domestic model ecosystems.

Strategic recommendations for UK businesses

For organisations beginning their AI journey

The study confirms that model selection matters more than cost. Begin with a focused pilot using a model suited to your primary use case:

  • Programming and technical tasks: Claude Sonnet or Gemini 2.5 Pro
  • Diverse general knowledge: Google Gemini models
  • Cost-sensitive high-volume processing: DeepSeek V3 or Gemini Flash
  • Creative and conversational applications: Open-source alternatives offer genuine capability

Take Action: Audit your intended AI use cases before selecting a model. A mismatch between model strengths and workload requirements will undermine ROI regardless of pricing.

For organisations scaling existing AI implementations

The shift toward agentic workflows demands architectural attention. Average sequence lengths tripling means your infrastructure must handle sustained multi-turn interactions, not just quick queries.

Priority actions by AI maturity:

Maturity LevelImmediate Priority90-Day Focus
ExperimentalDefine 2-3 high-value use cases aligned with model strengthsEstablish measurement baselines
Pilot StageValidate model-workload fit before scalingBuild human-in-the-loop safeguards
ProductionEvaluate multi-model strategiesDesign for tool-calling workflows
OptimisingBenchmark against new model releasesConsider open-source for cost-insensitive workloads

For organisations concerned about AI risk

The glass slipper effect has risk implications. Early lock-in to an underperforming model creates switching costs that compound over time. Conversely, delaying adoption means missing the window to establish workload-model fit whilst competitors build sticky advantages.

Warning: ⚠️ Each breakthrough creates a fleeting launch window for achieving workload-model fit. Early solvers create deep, sticky adoption as users embed workflows around solutions. Strategic delay is itself a risk.

Hidden challenges the study reveals

1. The open-source quality gap is closing faster than procurement cycles

Chinese open-source models went from irrelevant to 30% market share in twelve months. Procurement processes that dismiss open-source options based on 2024 evaluations are already outdated.

Mitigation: Implement quarterly model capability reviews rather than annual vendor assessments.

2. Entertainment usage subsidises enterprise capability

Over half of open-source usage serves roleplay and creative applications. This massive demand funds continued development, meaning enterprise-suitable capabilities arrive as a byproduct of consumer entertainment investment.

Mitigation: Monitor open-source model development even if you don’t plan to deploy them—their capabilities indicate where proprietary models must compete.

3. Tool-calling is the new battleground

The consistent upward trend in tool-integrated workflows means today’s chatbot deployments will feel primitive within eighteen months. Organisations that haven’t designed for agentic interaction will face expensive retrofits.

Mitigation: Architect AI deployments for multi-step tool calling from inception, even if initial implementations are simpler.

4. Regional model preferences create interoperability challenges

Asian markets increasingly favour Chinese open-source models. UK businesses with Asian operations or partnerships may need separate model strategies for different regions—adding complexity to governance frameworks.

Mitigation: Include regional model availability in AI strategy planning, particularly for multinational operations.

What this means for your AI strategy

The OpenRouter study reveals an AI market that is simultaneously more accessible and more complex than conventional wisdom suggests. Open-source models now deliver enterprise-grade performance. Price barely influences adoption decisions. And the shift toward agentic, multi-step workflows is transforming what “using AI” actually means.

Three success factors for UK businesses:

  1. Match models to workloads, not budgets. The minimal correlation between price and adoption confirms that capability fit drives value. A well-matched model at higher cost outperforms a cheaper misaligned alternative.

  2. Design for agentic workflows from the start. With sequence lengths tripling and tool-calling becoming standard, single-turn chatbot implementations will require expensive architectural overhauls.

  3. Move decisively on model selection. The glass slipper effect means early adopters of well-matched models build compounding advantages. Delayed decision-making risks permanent competitive disadvantage.

Success Factor: Retention patterns serve as empirical signals of model differentiation. Persistent cohorts indicate capability inflection points where models transition workloads from infeasible to possible.


Ready to align your AI strategy with these market realities? Our AI Strategy Blueprint helps UK businesses identify high-value use cases and match them to appropriate models—delivering clarity in 5 working days, not months of experimentation.

For organisations already deploying AI, our AI Integration service builds production-ready workflows with human-in-the-loop safeguards and break-fix warranties.


Source: Aubakirova, M., Atallah, A., Clark, C., Summerville, J., & Midha, A. (2025). State of AI 2025: An Empirical 100 Trillion Token Study with OpenRouter. OpenRouter Inc. https://openrouter.ai/state-of-ai

Strategic analysis by Resultsense. We transform technical research into actionable business intelligence for UK organisations navigating AI adoption.