When AI Bills Spiral Out of Control, Model Routers Become Enterprises' New Cost-Saving Favorite
As enterprise AI costs keep climbing, a technology known as the "model router" is moving rapidly from niche tool to mainstream. These systems automatically assign each task to the most suitable AI model based on its complexity, cutting expenses significantly without a notable loss in quality. They are drawing attention from startups and large enterprises alike.
The core logic of model routers is that not every task requires the most expensive frontier model. Basic work such as summarizing emails or retrieving documents can be handled by open-source models or older proprietary models at a fraction of the cost of top-tier models. Companies such as Snowflake and Palo Alto Networks have confirmed to The Information that they have achieved substantial cost savings by moving specific tasks to cheaper models.
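The routing idea described above can be illustrated with a minimal sketch. It assumes a crude keyword-and-length heuristic; the tier names, prices, and hint keywords are hypothetical placeholders and do not describe how any vendor's router actually works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                       # hypothetical label, not a real model ID
    usd_per_million_tokens: float   # illustrative price, not a real quote


CHEAP = ModelTier("small-model", 0.10)
FRONTIER = ModelTier("frontier-model", 10.00)

# Hypothetical phrases that suggest a basic task such as summarizing or retrieval.
SIMPLE_TASK_HINTS = ("summarize", "summarise", "retrieve", "look up", "extract")


def route(prompt: str, max_simple_words: int = 200) -> ModelTier:
    """Send short, simple-looking prompts to the cheap tier, everything else to the frontier tier."""
    text = prompt.lower()
    looks_simple = any(hint in text for hint in SIMPLE_TASK_HINTS)
    is_short = len(text.split()) <= max_simple_words
    return CHEAP if (looks_simple and is_short) else FRONTIER


if __name__ == "__main__":
    for p in ("Summarize this email from the vendor",
              "Design a distributed consensus protocol and prove its safety"):
        print(p, "->", route(p).name)
```

A production router would likely replace the keyword check with a trained classifier or measured quality data, but the shape of the decision (estimate difficulty, then pick the cheapest adequate model) stays the same.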
This trend is generating real business returns. Construction firm McCarthy Building reported that through Palantir's routing tool Evolve, its quarterly AI token usage dropped 60% compared to the same period last year. Palantir itself disclosed that in one specific case, the tool reduced computing costs by 97% by switching tasks from OpenAI's GPT-5.1 to the smaller GPT-5.4 Nano model.
From Manual Model Selection to Automatic Routing: An Industry Turning Point
The concept of model routers is not entirely new, but it drew wide public attention after OpenAI released GPT-5, which automatically switches between different underlying models within ChatGPT based on the complexity of the user's prompt, embedding routing logic directly into the product. Since then, routers that can dispatch requests to models from multiple providers have proliferated rapidly.
Currently, routers on the market come in various forms: standalone products, built-in modules from cloud service providers, and custom solutions built by enterprise IT departments. The common goal of these tools is to replace manual model selection by users, thereby reducing costs while maintaining output quality.
Databricks' Unity AI Gateway is one example. CEO Ali Ghodsi said the tool is "very popular" because many enterprises "are burning through their budgets too quickly." Databricks had been using it internally for some time before rolling it out to customers.
From Startups to Tech Giants: Full Participation
The router market is attracting players of all sizes. According to an earlier report by The Information, OpenRouter, a startup that provides routing technology, closed a new $120 million funding round in April, a sign of strong investor enthusiasm for the sector.
OpenRouter's "automatic router" decides which model to call based on user preferences for cost and quality (set on a scale of 0 to 10). Data shows that the router selects Google's relatively inexpensive Gemini 2.5 Flash Lite about one-third of the time, while calling OpenAI's more powerful GPT-5.5 only about 10% of the time. OpenRouter's automatic router is powered at its core by startup Not Diamond, which specializes in developing routing systems for AI coding agents.
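One way to picture a 0-to-10 cost/quality preference is as a weight in a scoring function. The sketch below is an assumption for illustration only: the candidate names, the quality and cost figures, and the linear scoring rule are all made up, and it is not OpenRouter's or Not Diamond's actual algorithm.

```python
from typing import NamedTuple, Sequence


class Candidate(NamedTuple):
    name: str       # hypothetical model label
    quality: float  # made-up quality estimate in [0, 1]
    cost: float     # made-up normalized cost in [0, 1]; higher is more expensive


CANDIDATES = [
    Candidate("budget-model", 0.60, 0.05),
    Candidate("mid-model", 0.80, 0.30),
    Candidate("frontier-model", 0.95, 1.00),
]


def pick_model(quality_preference: int,
               candidates: Sequence[Candidate] = CANDIDATES) -> Candidate:
    """Pick a model given a preference from 0 (cheapest) to 10 (highest quality)."""
    if not 0 <= quality_preference <= 10:
        raise ValueError("quality_preference must be between 0 and 10")
    w = quality_preference / 10  # weight on quality; (1 - w) is the weight on cost

    def score(c: Candidate) -> float:
        return w * c.quality - (1 - w) * c.cost

    return max(candidates, key=score)


if __name__ == "__main__":
    for pref in (0, 7, 10):
        print(pref, "->", pick_model(pref).name)
```

With these illustrative numbers, low preferences favor the budget model and only the highest settings reach the frontier model, which is consistent in spirit with a router that calls the most expensive model a minority of the time.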
Japanese AI lab Sakana AI recently released a router-based multi-model collaboration system. In tests, the system assigned math problems mainly to OpenAI's GPT-5.5 and science problems mainly to Google's Gemini, because it judged these two models superior to the other options in their respective domains. Sakana AI claims the system's overall performance on benchmarks covering programming, engineering, scientific tasks, and reasoning is "on par" with Anthropic's Fable 5 and Mythos Preview models.
AI coding company Cognition also released a new router this week. It uses internal benchmarks to identify the relative strengths of different agents and introduces a "sidekick" agent to handle simpler tasks. Cognition said the router matched Fable 5's score on one coding benchmark at 35% lower cost.
DIY Routing: Low-Cost Solutions Also Work
Not all enterprises need to buy specialized routing products. Developers can build their own routers using AI coding agents like Claude Code, or even directly let an AI model decide which model is best suited for a specific query.
Hunter Bown, who works on AI agents at Arcee AI, said he habitually uses DeepSeek V4 Flash for model selection because of its low cost. His approach is to provide DeepSeek with a list of models and let it determine which model is best for handling the current prompt.
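The approach of handing a cheap model a list of candidates and letting it choose can be sketched as follows. The `ask_selector` callable is a hypothetical stand-in for whatever API call reaches the selector model, and the prompt wording and fallback rule are assumptions rather than Bown's actual setup.

```python
from typing import Callable, Optional, Sequence


def build_selector_prompt(user_prompt: str, models: Sequence[str]) -> str:
    """Compose the instruction sent to the cheap selector model."""
    options = "\n".join(f"- {m}" for m in models)
    return (
        "Pick the single best model for the request below.\n"
        "Reply with the model name only.\n\n"
        f"Available models:\n{options}\n\n"
        f"Request:\n{user_prompt}"
    )


def choose_model(user_prompt: str,
                 models: Sequence[str],
                 ask_selector: Callable[[str], str],
                 fallback: Optional[str] = None) -> str:
    """Ask the selector which model to use; fall back if its reply is not in the list."""
    reply = ask_selector(build_selector_prompt(user_prompt, models)).strip()
    for m in models:
        if m.lower() == reply.lower():
            return m
    # The selector answered with something unexpected, so use a safe default.
    return fallback if fallback is not None else models[0]


if __name__ == "__main__":
    # Fake selector for demonstration; a real one would call a model API.
    fake = lambda prompt: "model-b"
    print(choose_model("Refactor this module", ["model-a", "model-b"], fake))
```

Validating the reply against the known list matters because a language model may return extra text or a name that was never offered.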
However, such "quick-build" solutions have limitations. Shriyash Upadhyay, founder of router provider Martian, pointed out that even more complex routers sometimes post impressive benchmark scores that they fail to match in real-world use. He also noted that, even with more sophisticated routers, predicting the best model from the user's first prompt alone is quite challenging.
Upadhyay said that the rapid pace of model iteration and constantly changing capability differences make routing decisions increasingly complex. "Companies don't have infinite data on all different tasks, so you have to really go deep into the models to figure out what they're good at." To this end, when making routing decisions, Martian not only considers the output results of models but also examines the internal computational processes that constitute these models.
Cost Pressure Persists, Demand for Routers Expected to Grow
Enterprise anxiety over AI costs is not a short-term phenomenon. As employees use advanced AI models ever more heavily (a phenomenon dubbed "tokenmaxxing"), management scrutiny of AI spending is intensifying as well, which sustains demand for model routers.
Beyond routing, Palantir's Evolve tool can automatically adjust prompt content to suit the selected model and prevent the same request from being sent to the model repeatedly, a common cause of inflated bills. The McCarthy Building case shows that, by optimizing prompt structure, enterprises can consume fewer tokens while still using frontier models and getting the same output.
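Suppressing repeated requests can be approximated with a simple cache keyed on the model and prompt. This is a minimal sketch under that assumption; the `DedupingClient` class and its `send` callable are hypothetical and do not describe how Evolve is implemented.

```python
import hashlib
from typing import Callable, Dict


class DedupingClient:
    """Wrap a model call so identical (model, prompt) requests are sent upstream only once."""

    def __init__(self, send: Callable[[str, str], str]):
        self._send = send                  # hypothetical function that calls the model API
        self._cache: Dict[str, str] = {}
        self.upstream_calls = 0            # how many billable requests were actually made

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # A NUL separator keeps ("ab", "c") distinct from ("a", "bc").
        return hashlib.sha256(f"{model}\x00{prompt}".encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key not in self._cache:
            self.upstream_calls += 1
            self._cache[key] = self._send(model, prompt)
        return self._cache[key]


if __name__ == "__main__":
    client = DedupingClient(lambda model, prompt: f"[{model}] reply to: {prompt}")
    client.complete("small-model", "Summarize the Q3 report")
    client.complete("small-model", "Summarize the Q3 report")  # served from cache
    print("upstream calls:", client.upstream_calls)
```

Caching of this kind only suits requests where a repeated answer is acceptable; prompts that depend on fresh data or sampling variety would need a cache expiry or an opt-out.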
For investors, the heating up of the model router market means two things. On one hand, startups focused on routing technology, such as OpenRouter, are winning investor favor. On the other, companies such as Databricks and Palantir, which build routing capabilities into their enterprise AI platforms, are using them to strengthen their products' competitiveness. As AI infrastructure spending continues to expand, the tool layer that helps businesses control that spending is becoming an emerging market that cannot be ignored.