How Much of Your Claude Subscription Fee Reaches the Optical Module Makers?
TL;DR
A chart estimating how Claude Pro’s approximately $20 monthly payment in the U.S. is split among model companies, cloud computing, GPU depreciation, electricity, and supply chains is prompting investors to reconsider how AI application revenues should be valued.
This chart is not official revenue-sharing data from Anthropic, Amazon Web Services, or Nvidia, nor should it be read as the actual ledger of any company. Its value lies in raising a fundamental question: how much of the subscription fee users pay for AI applications can be retained as software gross profit, the way traditional SaaS retains it?
Valuation assumptions for traditional SaaS are quite clear. After the software is developed, selling an additional account usually involves minimal incremental costs, and mature pure software companies often have gross margins of 70% or even over 80%. Investors are willing to assign high multiples because, as revenue scales, profit margins can continue to improve.
The challenge with AI applications is that each user query, code generation, file analysis, or agent call consumes GPU time, electricity, memory bandwidth, and cloud resources behind the scenes. The subscription looks like a fixed monthly fee on the surface, but the underlying costs vary with usage. Light users may carry high margins, while heavy users who run continuous tasks within their quotas or tool packages can push costs up rapidly.
The $20 breakdown chart therefore challenges more than how much each company takes from every dollar. It questions the assumption that AI application revenue is inherently equivalent to SaaS revenue. To justify high multiples, AI companies need to show not only that users are willing to pay but also that usage-weighted gross margins can improve sustainably.
The Inference Cost Chain Behind Subscription Fees
The biggest difference between AI subscriptions and regular software subscriptions is that the marginal cost of each use is no longer near zero.
In traditional SaaS, adding another account for a team incurs server, customer service, and bandwidth costs, but these typically do not increase linearly with each click. The truly expensive parts are upfront R&D, sales, and customer acquisition. Once the business scales, a significant portion of new revenue can be retained as profit.
Model products are different. When a user submits a question and the model generates an answer, the model performs real computation on every call. This step is called inference. Tokens are the basic units of text the model reads and writes. The more questions asked, the longer the context, and the more complex the output, the more tokens and compute are consumed.
This creates a tension between fixed subscription fees and variable costs. Take Claude Pro, whose U.S. monthly fee is about $20, a figure that can vary with region, taxes, and Anthropic’s pricing adjustments. Users see a fixed price, but the model company faces highly variable usage: some users only draft emails or look up information, while others process long documents, run code tasks, or invoke complex automation.
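The tension between a flat price and variable usage can be made concrete with a small sketch. Everything except the $20 price is a hypothetical assumption chosen for illustration: the three user segments, their token consumption, and the blended serving cost per million tokens are not figures from Anthropic or any other company.

```python
from dataclasses import dataclass

PRICE_PER_MONTH = 20.0  # flat subscription price in USD, as cited in the article


@dataclass
class UserSegment:
    name: str
    share_of_users: float    # fraction of subscribers in this segment (hypothetical)
    tokens_per_month: float  # monthly token consumption per user (hypothetical)


def segment_margin(segment: UserSegment, cost_per_million_tokens: float) -> float:
    """Gross margin, as a fraction of price, for one subscriber in a segment."""
    cost = segment.tokens_per_month / 1_000_000 * cost_per_million_tokens
    return (PRICE_PER_MONTH - cost) / PRICE_PER_MONTH


def weighted_margin(segments: list[UserSegment], cost_per_million_tokens: float) -> float:
    """Usage-weighted gross margin across the whole subscriber base."""
    revenue = sum(s.share_of_users * PRICE_PER_MONTH for s in segments)
    cost = sum(
        s.share_of_users * s.tokens_per_month / 1_000_000 * cost_per_million_tokens
        for s in segments
    )
    return (revenue - cost) / revenue


# Hypothetical mix: most subscribers are light users, a few are very heavy.
segments = [
    UserSegment("light", 0.70, 1_000_000),
    UserSegment("medium", 0.25, 5_000_000),
    UserSegment("heavy", 0.05, 40_000_000),
]
COST_PER_MILLION_TOKENS = 1.0  # hypothetical blended serving cost in USD

for s in segments:
    print(f"{s.name:>6}: margin {segment_margin(s, COST_PER_MILLION_TOKENS):.0%}")
print(f"weighted: {weighted_margin(segments, COST_PER_MILLION_TOKENS):.0%}")
```

Under these assumed inputs, a light user yields a 95% margin and a heavy user loses the full subscription price again in serving costs, while the blended margin lands near 80%. The point is the sensitivity, not the numbers: shifting a few percentage points of users into the heavy segment moves the weighted margin far more than the headline subscriber count suggests.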
The chart circulating in the market attempts to visualize this: within the $20, part goes to the model company and part to cloud and compute providers. Compute costs include electricity, maintenance, and GPU depreciation. GPU procurement then flows upstream to Nvidia, TSMC, HBM (high-bandwidth memory) suppliers, optical module makers, ODMs, and power-related companies.
“GPU depreciation” here means that an expensive GPU is not expensed as a one-time purchase but amortized gradually over its useful life, its usage intensity, or the accounting period. The actual allocation depends on package limits, the mix of light and heavy users, internal cloud settlement prices, reserved-capacity discounts, GPU utilization, and depreciation periods. Average costs are not the same as marginal costs.
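A rough sketch of how depreciation turns into a per-hour cost helps show why utilization and depreciation period matter so much. It assumes straight-line depreciation, and the purchase price, lifespans, and utilization levels are hypothetical placeholders rather than figures from any vendor or cloud provider.

```python
def depreciation_per_gpu_hour(purchase_price: float,
                              useful_life_years: float,
                              utilization: float) -> float:
    """Straight-line depreciation spread over the hours the GPU is actually busy."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    busy_hours = useful_life_years * 365 * 24 * utilization
    return purchase_price / busy_hours


GPU_PRICE = 30_000.0  # hypothetical purchase price in USD

for life_years in (3, 5):
    for utilization in (0.3, 0.6, 0.9):
        cost = depreciation_per_gpu_hour(GPU_PRICE, life_years, utilization)
        print(f"life={life_years}y util={utilization:.0%}: ${cost:.2f} per busy GPU-hour")
```

The same hardware produces very different average costs depending on the assumed life and how busy the cluster is: halving utilization doubles the depreciation charged to each busy hour. This is an average cost. Because the purchase is already sunk, the marginal cost of one more query on an otherwise idle GPU is closer to electricity and cooling alone, which is why the two should not be conflated.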
Investors should focus on the direction: AI application companies cannot just disclose revenue growth; they must also answer whether the underlying compute costs are growing in tandem. If usage expands faster than model efficiency improves, higher subscription revenue could lead to increased gross margin pressure. Only if efficiency gains are rapid enough can model companies approach the profit structure of traditional software firms.
Infrastructure Captures the More Stable Revenue First
For now, growth in AI usage flows to infrastructure more directly than it is retained at the application layer.
Whether users are on Claude, ChatGPT, Gemini, or internal enterprise agents, inference ultimately depends on compute, electricity, memory, and network resources. Products at the application layer may come and go, but the underlying resource consumption is inelastic. As AI usage continues to rise, cloud capital expenditure, GPU procurement, HBM demand, and data center power consumption are all driven upward.
This is why Nvidia, TSMC, SK Hynix, and other infrastructure players are continually revalued by the market. Nvidia’s overall gross margin has been high recently: FY2026 GAAP and non-GAAP gross margins were around 71.1% and 71.3%, respectively, and guidance for coming quarters remains high. Some quarters are affected by specific expenses, and public financial reports do not always reveal the true gross margin structure of AI data centers, but the scarcity value of infrastructure with pricing power is reflected in these results.
HBM is a typical component in this chain. It’s not ordinary memory but a critical part supporting high-throughput AI accelerators. As model size, context length, and inference concurrency increase, dependence on high-bandwidth memory grows. Supply chain estimates show HBM’s share of new AI chip costs is rising, which is a key reason why SK Hynix, Samsung, and Micron are being repriced during the AI cycle.
Electricity and data centers are shifting from background costs to investment priorities. A single plain-text query may not be energy-intensive, but complex agents, long contexts, code generation, and multi-turn tasks amplify compute demands. For cloud providers and data center operators, the key is not just how much power a single query consumes, but how continuous inference requests affect the costs and bottlenecks of cluster utilization, electricity prices, cooling, data center capacity, and grid access.
The advantage of infrastructure is that its performance is validated sooner. Cloud providers’ AI capital expenditures are already underway, Nvidia’s revenue and gross margins already show up in earnings reports, and HBM vendors’ orders and prices will soon feed into profits. At the application layer, by contrast, most of what the market is pricing is future expectations: subscription conversions, enterprise penetration, API revenue, and the profits that would follow if costs decline.
Efficiency Gains Remain the Core of the Bull Case
Software investors and AI bulls are not without counterarguments. The core view of optimists is that today’s high inference costs are an early-stage phenomenon; model optimization, caching, smaller models, in-house chips, and higher cluster utilization will continue to lower unit costs. If costs decline fast enough, AI applications could revert to the high-margin SaaS logic.
This counterargument has a basis in reality. Some mainstream models with comparable or higher capabilities have already seen significant price reductions per token. OpenAI disclosed that GPT-4o mini’s cost per token is 99% lower than that of the earlier text-davinci-003. The pace differs by company. Anthropic has recently focused on upgrading capability at the same price point and on model tiering, but the industry direction remains to deliver more powerful capabilities at lower cost.
Model companies also have various ways to improve unit economics. Simple tasks can be delegated to smaller models, common requests reused via caching, and complex tasks handled by stronger models. Cloud providers reduce unit compute costs through in-house chips and cluster scheduling. Google has TPU, Microsoft has Maia for inference, and Amazon is advancing Trainium and Inferentia.
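The effect of routing and caching on unit economics can be sketched as a simple blended-cost calculation. The function below is an illustrative model, not a description of how any provider actually routes traffic, and the per-request costs, cache hit rate, and small-model share are hypothetical.

```python
def blended_cost_per_request(cache_hit_rate: float,
                             small_model_share: float,
                             cost_small: float,
                             cost_large: float,
                             cost_cache_hit: float = 0.0) -> float:
    """Average cost of one request when cache hits skip the model entirely
    and cache misses are split between a small and a large model."""
    for name, value in (("cache_hit_rate", cache_hit_rate),
                        ("small_model_share", small_model_share)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be in [0, 1]")
    miss_rate = 1 - cache_hit_rate
    model_cost = small_model_share * cost_small + (1 - small_model_share) * cost_large
    return cache_hit_rate * cost_cache_hit + miss_rate * model_cost


# Hypothetical per-request serving costs in USD.
COST_SMALL = 0.001
COST_LARGE = 0.020

baseline = blended_cost_per_request(0.0, 0.0, COST_SMALL, COST_LARGE)
optimized = blended_cost_per_request(0.3, 0.6, COST_SMALL, COST_LARGE)
print(f"baseline:  ${baseline:.5f} per request")
print(f"optimized: ${optimized:.5f} per request")
print(f"reduction: {1 - optimized / baseline:.0%}")
```

With these assumed inputs, serving 30% of requests from cache and sending 60% of the rest to the small model cuts the average cost per request by roughly 70%. The saving depends entirely on how much traffic is simple enough to route away from the large model, which is exactly what heavier agent workloads put at risk.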
Considering technological progress alone, AI application margins do have room to improve. Cheaper inference, better model routing, stronger compression, and the ability to carry more usage within the same $20 subscription all help. Tiered enterprise pricing, API tiering, and stricter usage limits can also improve overall unit economics.
The difficulty is that cost reduction is not the only variable. AI applications are moving from simple chat to more demanding workloads. Users now want code agents, long-document processing, video and multimodal generation, and enterprise automation, all scenarios with higher value and higher consumption. The more useful the model, the more likely users are to entrust it with complex, long-running tasks.
This narrows the disagreement to a specific question: can the decline in inference costs outpace the growth in usage and task complexity? If unit costs fall rapidly but average consumption per user grows faster, the model company’s weighted gross margin will still be under pressure. Conversely, if model routing, caching, in-house chips, and tiered pricing are effective enough, AI subscriptions could gradually shed their heavy-cost characteristics.
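The race between falling unit costs and rising usage can be sketched as a year-by-year projection. The starting cost, starting usage, and the decline and growth rates below are hypothetical scenario inputs, not forecasts; only the $20 price comes from the article.

```python
def margin_path(price: float,
                cost_per_million_tokens: float,
                million_tokens_per_user: float,
                annual_cost_decline: float,
                annual_usage_growth: float,
                years: int) -> list[float]:
    """Gross margin per subscriber for year 0 through `years`, with unit cost
    shrinking and per-user usage growing at constant annual rates."""
    margins = []
    for _ in range(years + 1):
        cost = cost_per_million_tokens * million_tokens_per_user
        margins.append((price - cost) / price)
        cost_per_million_tokens *= 1 - annual_cost_decline
        million_tokens_per_user *= 1 + annual_usage_growth
    return margins


# Hypothetical scenarios: (annual unit-cost decline, annual usage growth).
scenarios = {
    "usage outruns efficiency": (0.30, 1.00),
    "exact offset": (0.50, 1.00),
    "efficiency outruns usage": (0.50, 0.50),
}

for label, (decline, growth) in scenarios.items():
    path = margin_path(20.0, 1.0, 4.0, decline, growth, years=3)
    print(f"{label:>26}: " + ", ".join(f"{m:.0%}" for m in path))
```

What matters is the product of the two rates. When unit cost halves while usage doubles, the cost per subscriber and the margin do not move at all. A 30% cost decline against doubling usage erodes the margin every year, and only when costs fall faster than usage grows does the subscription drift toward a SaaS-like margin.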
Subscription Users Are Not the Same as Gross Margin
The $20 breakdown chart should not be seen as the final answer. It’s more like a valuation reminder at this stage: when the market cannot yet see enough transparent gross margin data from model companies, investors need to discount the assumption that “AI applications are inherently SaaS-like.”
For unlisted model companies like OpenAI and Anthropic, external investors find it difficult to see full financials. Funding materials, partner disclosures, cloud cost structures, enterprise package prices, API revenue share, and usage restrictions all serve as clues. The most valuable data isn’t how many paid users there are, but the proportion of light versus heavy users, whether enterprise clients are willing to pay more for high-intensity use, whether cloud costs are decreasing, and whether unit inference costs are falling enough to improve gross margins.
For listed companies, validation will appear more quickly in financial reports. Nvidia’s overall gross margin and data center revenue growth, TSMC’s advanced process and packaging demand, HBM vendors’ prices and margins, and cloud providers’ capital expenditure intensity will continue to reflect whether AI usage is still flowing into infrastructure. If these indicators remain strong but the application layer shows no evidence of gross margin improvement, the market will continue to assign a higher valuation premium to infrastructure.
Ultimately, for model companies to earn a higher valuation anchor, they need to demonstrate not just that users are willing to pay $20, but that these subscriptions still generate sufficient gross profit under heavy usage. The next pricing divergence will likely show up not in headline ARR figures but in whether inference costs, package limits, and enterprise pricing can all be brought into alignment.