Futures
Access hundreds of perpetual contracts
CFD
Gold
One platform for global traditional assets
Options
Hot
Trade European-style vanilla options
Unified Account
Maximize your capital efficiency
Demo Trading
Introduction to Futures Trading
Learn the basics of futures trading
Futures Events
Join events to earn rewards
Demo Trading
Use virtual funds to practice risk-free trading
CFD
U.S. stock CFD derivatives
US Stocks
Access real US stocks and ETFs
HK Stocks
Trade quality Hong Kong-listed stocks
Korean Stocks
SK Hynix
Real Korean stocks and top assets
Stock Futures
High leverage, 24/7 trading
Tokenized Stocks
Backed by real stock assets
IPO Access
Unlock full access to global stock IPOs
GUSD
Mint GUSD for Treasury RWA yields
Stocks Activities
Trade Popular Stocks and Unlock Generous Airdrops
Launch
CandyDrop
Collect candies to earn airdrops
Launchpool
Quick staking, earn potential new tokens
HODLer Airdrop
Hold GT and get massive airdrops for free
IPO Access
Unlock full access to global stock IPOs
Alpha Points
Trade on-chain assets and earn airdrops
Futures Points
Earn futures points and claim airdrop rewards
Promotions
AI
Gate AI
Your all-in-one conversational AI partner
Gate AI Bot
Use Gate AI directly in your social App
GateClaw
Gate Blue Lobster, ready to go
Gate for AI Agent
AI infrastructure, Gate MCP, Skills, and CLI
Gate Skills Hub
10K+ Skills
From office tasks to trading, the all-in-one skill hub makes AI even more useful.
OpenRouter: How to Turn “Model Intermediary” into a Billion-Dollar Company
Author: Ella Zhang
Today, let's talk about model hubs.
Simply put, a model hub connects different models like OpenAI, Claude, Gemini, DeepSeek, etc., behind a single entry point, allowing developers to call multiple models with one set of APIs, one account, and unified billing, and to choose, switch, and fallback between different models or providers.
Of course, for domestic users, the bigger reason to use a model hub is to access overseas models and to save money.
Everyone understands this, so we won't go into detail about domestic model hubs. Today, we'll focus on OpenRouter.
By 2026, OpenRouter had raised $113 million in Series B funding, with a valuation approaching $1.3 billion.
In other words, it has already become a unicorn company.
Let's analyze why a model hub that "doesn't build models" is worth so much money.
What exactly does OpenRouter do?
OpenRouter officially positions itself as: a unified large model interface.
OpenRouter now supports over 400 models and more than 70 model providers.
The official website also discloses that the platform processes 100 trillion tokens per month, with over 10 million global users.
The Series B funding announcement in May 2026 also mentioned that over the past 6 months, OpenRouter's weekly processing volume grew from 5 trillion tokens to 25 trillion tokens, serving over 1M developers.
These numbers indicate one thing:
OpenRouter is no longer a niche developer tool; it has become a major AI calling gateway.
The way developers use it is also very simple.
Originally, you had to connect to models from OpenAI, Anthropic, Google, DeepSeek, Mistral, xAI, etc., separately.
For each, you had to read documentation, apply for an API key, set up billing, handle interface differences, check rate limits, and manage error handling.
After using OpenRouter, developers can call different models through the same interface.
In many cases, code that originally used the OpenAI interface only needs to change the base URL, swap the API key, and specify the model name to call other models via OpenRouter.
This is one of the reasons for its early rapid growth: low migration cost.
Why don't developers connect directly to model companies?
It seems that developers can bypass OpenRouter entirely and apply for APIs directly from model company websites.
But in real development, it's not that simple.
If an AI product is just a demo, using one model is enough. But once it enters real business, relying on a single model becomes difficult.
For example, an AI writing tool might have several different tasks:
Generating titles: a cheap model is sufficient;
Writing long articles: requires stronger text capabilities;
Analyzing materials: needs a long-context model;
Content moderation: requires low-cost, highly stable classification;
Enterprise customers demand that data not be retained, so you must choose a provider that complies with data policies;
During peak times, models may be rate-limited, requiring automatic fallback to a backup model.
At this point, the problem is no longer just "connecting one API."
The team needs to maintain a complete model calling system:
Which model handles which task, which model is cheaper, which provider is faster, which provider has a lower failure rate, how to switch when problems occur, how to attribute billing, and how to isolate data for enterprise customers.
Even more troublesome is that the model market changes too fast.
Today, Claude is good for coding; tomorrow, Gemini's long context has an advantage; the day after, DeepSeek or some open-source model drives down prices.
Model capabilities, prices, context lengths, and provider policies are constantly changing.
This is where OpenRouter's value lies.
It doesn't write AI applications for developers, but it helps developers manage "which model to use, how to call it, how to ensure fallback, and how to control costs."
Not just a model supermarket, but a model orchestration layer
If you only understand OpenRouter as a "model supermarket," you underestimate it.
A model supermarket solves the problem of "there are many models here, you can choose."
But OpenRouter's truly important capability is orchestration between models and providers.
The same model may be served by different providers.
For example, an open-source model can be hosted by multiple cloud service providers or inference service providers. Different providers have different prices, speeds, and stability.
OpenRouter's documentation has a capability called provider routing.
Developers can set conditions based on price, latency, throughput, provider order, etc., to automatically route requests to different providers.
It also supports fallback, meaning if a model or provider fails, the system automatically switches to a backup.
For developers, OpenRouter essentially extracts "model selection" and "error handling" from business code and hands them over to a dedicated platform.
Why would enterprises need this layer?
When enterprises adopt AI, the initial problem is often "can it work," but it quickly becomes "how to manage it."
A company may have many teams using AI.
The marketing team uses it to create content, the customer service team to reply to users, the R&D team to write code, the operations team to analyze data, and the legal team to process contracts.
If each team connects models on their own, problems multiply:
Billing is unclear; model selection is inconsistent;
Data policies are opaque; different teams duplicate integration;
When problems arise, no one knows which call caused them;
When a model provider changes, the system cannot adapt uniformly.
OpenRouter's workspaces, budget controls, call logs, provider policies, and zero-data-retention routing all address these issues.
For example, zero data retention.
For many enterprises, not all requests can be sent to any model provider arbitrarily. Customer information, contract content, medical data, and financial data may have strict requirements.
OpenRouter's documentation supports Zero Data Retention.
Developers can set requests to only be sent to providers that do not store data. This policy can be applied globally, by model group, by security rules, or per request.
Another example is prompt caching.
Many AI applications repeatedly use long system prompts, knowledge base content, or context. If recalculated every time, costs can be high.
OpenRouter supports provider-affinity routing to increase cache hit rates, trying to route subsequent requests to the same provider endpoint, thereby reducing the cost of repeated context.
These features don't sound sexy, but they are very practical, and the larger the AI application scale, the more obvious the cost savings.
How does OpenRouter make money?
OpenRouter's business model is clear: it makes money based on usage.
Developers first purchase platform credits, then pay based on the models actually called and tokens consumed.
OpenRouter's official statement is clear:
The platform charges a 5.5% fee when purchasing credits, with a minimum of $0.80; the underlying model provider's prices are passed on at cost, with no additional markup on model inference prices.
This is a typical "traffic toll" business.
The advantage of this model is that revenue is tied to usage.
The more developers call, the higher the platform's revenue; the more AI applications and tokens consumed, the bigger OpenRouter's business.
But it also has a characteristic: the per-call cut is not high, so it must rely on scale.
That's why token processing volume is so important to OpenRouter.
Its core metric is not registered users, but how many tokens flow through it per week or per month.
In 2025, OpenRouter's annual processing volume grew from about 10 trillion tokens to over 100 trillion tokens.
By 2026, OpenRouter had reached an annualized processing volume of about 1.5 quadrillion tokens.
That's the underlying logic of this business.
As long as more and more AI applications run on multi-model systems, OpenRouter can continuously extract service fees from these calls.
Why has it grown so fast recently?
OpenRouter's growth can be summarized as benefiting from three changes.
The first change is the increasing number of models.
In the past, when building AI applications, many teams defaulted to OpenAI first. Now it's different.
Claude, Gemini, DeepSeek, Qwen, Mistral, Llama, Grok, and a large number of open-source and open-weight models all have advantages in different scenarios.
This is not a market where "one completely replaces another."
Some models are good at coding, some are cheap, some have strong long-text capabilities, some are fast, some are good at role-playing, some are suitable for enterprise documents, and some are good at multimodal.
More models mean higher selection costs; higher selection costs make the middle layer more valuable.
The second change is that AI applications are starting to focus on cost.
Many products initially use the strongest model because they need to get the effect right first.
But once a product has users, model costs quickly become a problem.
A customer service bot, AI search product, code assistant, or content generation tool will have its gross margin eaten away if all requests go through the most expensive model.
A more mature approach is to split tasks:
Simple tasks use cheap models;
Complex tasks use strong models;
High-frequency tasks prioritize low-latency models;
After failure, switch to backup models;
When sensitive data is involved, only use providers that comply with data policies.
This is precisely OpenRouter's use case.
It doesn't necessarily help you find the "strongest model," but it can help you balance effect, price, speed, and stability.
The third change is that AI applications are moving from chat interfaces to agents.
Agents call tools, read files, search the web, execute tasks, and make continuous multi-round model calls.
Compared to regular chat, agents consume more tokens and rely more on stability.
This is good for OpenRouter.
Because the more calls and longer the chain, the more developers need routing, fallback, logging, cost control, and provider management.
That's why OpenRouter's funding announcement emphasized that AI is moving from experimentation to critical production applications and agent scenarios.
Its growth essentially comes from the increase in AI call volume.
This business also has risks
OpenRouter is in a good position, but it's not safe.
It sits between model companies, cloud providers, and application developers. This position is valuable but also vulnerable to squeeze.
The first risk is that big companies may build their own.
For small teams, OpenRouter is convenient.
But for large enterprises, model routing, permissions, logging, and cost management can also be done in-house or handed over to cloud providers.
Especially for financial, healthcare, and government clients, they may care more about data control and private deployment.
For OpenRouter to serve these clients, it can't just rely on "having many models." It must deeply develop permissions, auditing, data policies, provider management, and enterprise support.
The second risk is that cloud providers will also build model gateways.
Cloud platforms like AWS, Google Cloud, and Azure already have enterprise clients, billing systems, permission systems, and compliance capabilities.
They can easily incorporate multi-model calling, routing, monitoring, and cost management as part of their cloud services.
OpenRouter's advantage is openness and neutrality, with broader model coverage and faster integration.
But cloud providers' advantage is customer relationships and enterprise procurement processes. This will be a long-term competition.
The third risk is the relationship with model providers.
OpenRouter brings traffic to model companies, but also keeps model companies one step away from the end developers.
As the platform grows larger, it will control more user relationships and model usage data.
Model providers hope to gain distribution but also worry that their bargaining power will be weakened.
Such intermediary platforms are usually welcomed by suppliers early on; as they scale, the relationship becomes more delicate.
The fourth risk is that platform fees may be compressed.
OpenRouter charges a 5.5% platform fee, which seems low for now.
But if similar services proliferate, developers will compare prices, stability, model coverage, and enterprise features.
If some competitors offer lower fees, or if cloud providers bundle such capabilities into existing services, OpenRouter will need to prove it's more than just a "request forwarder."
It must continuously provide better routing, stronger model coverage, transparent pricing, stable service, and more complete enterprise controls.