Countdown to the End of the AI High-Pricing Era? Five Structural Reasons Why Tokens Must Drop in Price
Diminishing performance gains, open-source models at a tenth of the price, specialized chips slashing inference costs, and zero switching costs that let users jump ship instantly are all squeezing prices, and local models could end subscription pricing within 4 to 5 years. Is the room for AI giants to maintain high prices narrowing rapidly?
Software engineer Aditya Patadia points out on his personal blog that Uber burned through its entire annual AI budget in 4 months, and that Microsoft, Salesforce, and GitHub have also announced plans to rein in employee AI spending. This is an industry-wide dilemma, not just a matter of financial discipline at individual companies. Even so, he predicts that the expensive pricing of today's top AI companies is about to reverse.
Double Squeeze from Performance Ceiling and Open Source
Patadia's first observation: model performance gains are showing diminishing returns. Each new generation still brings progress, but the gains are getting smaller, and the training-data problem is structural: major AI labs have likely already ingested nearly all of humanity's digitized written knowledge, so improving training sets further is extremely difficult.
As evidence, he cites the fact that Claude Opus 4.8 and Claude Opus 4.7 are priced the same: when a new generation can no longer demonstrate a significant leap, the justification for raising prices disappears, leaving price cuts as the only competitive lever.
The second source of pressure is the open-source camp. Take GLM-5.2: this open-source model is comparable to GPT 5.5 and Claude Opus on coding benchmarks, yet it costs only one-tenth as much as GPT 5.5, an overwhelming pricing advantage.
Patadia's judgment: As long as open-source models continue to narrow the performance gap with closed-source flagships, the pricing room for closed-source models will keep shrinking.
Chip Revolution and Zero Switching Costs
Another source of pricing pressure is hardware. Patadia points out that AI-specific chips from companies such as Cerebras, Groq, and Google are resetting the baseline for inference costs. For example, inference on Google's TPU is 30% to 70% cheaper than on Nvidia's H100 GPU.
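To make the 30% to 70% range concrete, here is a minimal arithmetic sketch. The function name and the $100,000 baseline bill are hypothetical illustrations, not figures from Patadia's post.

```python
def discounted_cost_range(baseline_cost, low_pct=30, high_pct=70):
    """Return (best_case, worst_case) cost after a discount between low_pct and high_pct percent."""
    # Integer percentages keep the arithmetic free of rounding surprises.
    worst_case = baseline_cost * (100 - low_pct) / 100   # only the smaller discount applies
    best_case = baseline_cost * (100 - high_pct) / 100   # the larger discount applies
    return best_case, worst_case


if __name__ == "__main__":
    # Hypothetical monthly inference bill on the more expensive hardware, in USD.
    best, worst = discounted_cost_range(100_000)
    print(f"Same workload on cheaper hardware: ${best:,.0f} to ${worst:,.0f}")
```

Under these assumptions, a workload that costs $100,000 on the pricier hardware would cost between $30,000 and $70,000 on the cheaper one, which is the kind of gap that pulls the price floor down.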
Simply put, the same computational load costs significantly less on the right chip, and that gap directly lowers the price floor for model service providers. Beyond chips, model architectures are also cutting costs. Caching means repeated queries do not have to be recomputed, and a Mixture of Experts (MoE) architecture, in layman's terms, lets the model call only a subset of its "experts" on demand instead of activating every neuron each time, significantly reducing computational overhead while maintaining equivalent accuracy.
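The two ideas can be illustrated with a toy sketch. This is not how any production model is implemented: the four "experts" are hypothetical stand-in functions, the gate scores are supplied by hand rather than learned, and `functools.lru_cache` stands in for the far more involved caching that inference providers use.

```python
import math
from functools import lru_cache

# Hypothetical toy "experts": in a real MoE layer these are neural sub-networks.
EXPERTS = [
    lambda x: x + 1.0,
    lambda x: x * 2.0,
    lambda x: x - 3.0,
    lambda x: x * x,
]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    top = ranked[:k]
    weights = softmax([gate_scores[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, gate_scores, k=2):
    """Weighted sum over the selected experts only; the others are never called."""
    return sum(weight * EXPERTS[i](x) for i, weight in route_top_k(gate_scores, k))


CALLS = {"count": 0}  # counts how often the "expensive" function really runs


@lru_cache(maxsize=1024)
def cached_answer(prompt):
    """Stand-in for an expensive model call; identical prompts are served from cache."""
    CALLS["count"] += 1
    return prompt.upper()
```

With four experts and `k=2`, each call to `moe_forward` evaluates only half of the experts, and calling `cached_answer` twice with the same prompt increments `CALLS["count"]` only once.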
There's another factor Patadia believes is the most underestimated structural element: zero switching costs.
His comparison is straightforward: the moats of traditional software like Windows, Adobe, and Salesforce lie in the fact that replacing them is extremely costly, often requiring months of migration engineering. AI models have no such moat. AI gateway services like OpenRouter.ai let developers switch between model providers in seconds, and systems can even be programmed to switch between providers automatically.
When competitors can be replaced instantly at any time, any attempt by a vendor to raise prices will directly drive users away.
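A minimal sketch of what programmatic switching could look like, assuming each provider can be wrapped in a plain Python callable. The `Provider` class, its fields, and the prices are hypothetical; this does not use or represent the API of OpenRouter.ai or any other real gateway.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Provider:
    name: str
    price_per_million_tokens: float  # hypothetical USD price
    complete: Callable[[str], str]   # hypothetical wrapper around a provider's client


def complete_with_fallback(prompt: str, providers: List[Provider]) -> Tuple[str, str]:
    """Try providers from cheapest to most expensive; return (provider_name, text)."""
    errors = []
    for provider in sorted(providers, key=lambda p: p.price_per_million_tokens):
        try:
            return provider.name, provider.complete(prompt)
        except Exception as exc:  # any failure just moves us to the next provider
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Because the calling code depends only on the `complete` callable, adding, removing, or reordering providers is a change to a list, not a migration project. That is the switching-cost argument in miniature.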
Local Models: The Ultimate Threat to Subscription Models
Patadia's boldest prediction concerns local models. He estimates that within 4 to 5 years, continued improvements in chip performance, coupled with the inevitable decline in RAM prices, will allow consumer-grade computers and smartphones to run language models locally. He further predicts that mainstream operating systems will ship with built-in model deployment interfaces, enabling local applications to call local models directly.
If this scenario materializes, what does it mean? Cloud models would only be needed for the most complex tasks—legal document analysis, long-context reasoning, cross-database integration. Everyday tasks like code auto-completion, document proofreading, and basic fact-checking would be done locally, eliminating the need for monthly cloud subscription fees of $20 or even $200.
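The split described here amounts to a simple routing rule. The sketch below is an illustration only: the task labels mirror the article's examples, while the function name and the 8,000-token local context limit are hypothetical assumptions.

```python
# Task categories the article reserves for cloud models.
CLOUD_TASKS = {"legal_analysis", "long_context_reasoning", "cross_database_integration"}

# Hypothetical limit on how much context a local model can handle.
LOCAL_CONTEXT_LIMIT = 8_000


def choose_backend(task_type: str, context_tokens: int) -> str:
    """Return "cloud" for complex or oversized tasks, otherwise "local"."""
    if task_type in CLOUD_TASKS or context_tokens > LOCAL_CONTEXT_LIMIT:
        return "cloud"
    return "local"
```

Under this rule, code completion or proofreading with a short context stays on the device, and only the listed heavy tasks, or anything exceeding the assumed context limit, incur a cloud charge.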
Of course, Patadia himself notes that this is a "prediction," not a certainty, and he calls these his "bold bets"; time will tell. But the five sources of pressure above (diminishing performance gains, rising open-source alternatives, cost reductions from specialized chips, zero switching costs, and substitution by local models) are all backed by real-world cases, not just thought experiments.
If Patadia's predictions are correct, that's good news for users. But for AI companies charging money? That's a different story.