ME AI News: According to monitoring by Beating, Vercel has released its AI Gateway Production Index for June 2026. The report shows that, driven by the May launch of the DeepSeek V4 series (including the Flash and Pro models) on Vercel’s AI Gateway, DeepSeek’s share of token traffic surged from less than 1% to 17% within a single month, surpassing OpenAI (13%) to take third place. However, because its pricing is extremely low, total spending on DeepSeek across all users accounts for only about 1% of the gateway’s total spending. Pricing is the main reason behind DeepSeek’s rapid breakout.

DeepSeek V4 Flash charges only $0.14 per million input tokens and $0.28 per million output tokens, which is 20 to 50 times cheaper than comparable frontier models from Anthropic and 8 to 12 times cheaper than Qwen 3.6 Plus and Kimi K2.6. Evaluations indicate that DeepSeek V4 meets performance requirements, prompting development teams to deploy it quickly in production.
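To make the pricing concrete, here is a minimal sketch of how the quoted per-million-token rates translate into a bill. The workload size (10 million input tokens, 2 million output tokens) and the function name `workload_cost` are illustrative assumptions, not figures from the report.

```python
# Rates quoted in the article for DeepSeek V4 Flash, in USD per million tokens.
INPUT_PRICE_PER_M = 0.14
OUTPUT_PRICE_PER_M = 0.28


def workload_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the quoted rates."""
    input_cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M
    output_cost = (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical workload: 10M input tokens and 2M output tokens.
cost = workload_cost(10_000_000, 2_000_000)
print(f"${cost:.2f}")  # prints $1.96
```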

Despite the traffic boom for low-cost models, frontier models still dominate spending. In May, Anthropic’s share of spending rose from 61% to 65%, and it accounted for 70% to 80% of spending in high-difficulty scenarios such as application generation, back-end agents, and coding. For example, in the coding agent scenario, DeepSeek contributed 49% of token traffic but only 4% of costs, while Anthropic handled 28% of traffic yet accounted for 70% of spending.
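The coding agent figures imply a large gap in effective cost per token. The sketch below is a rough back-of-the-envelope calculation that assumes the traffic and cost shares are measured over the same set of requests; the function name `cost_per_traffic_unit` is hypothetical, and the result is an implied ratio, not a number stated in the report.

```python
def cost_per_traffic_unit(cost_share: float, traffic_share: float) -> float:
    """Cost share divided by traffic share: a relative price-per-token proxy."""
    return cost_share / traffic_share


# Shares quoted in the article for the coding agent scenario.
deepseek = cost_per_traffic_unit(cost_share=0.04, traffic_share=0.49)
anthropic = cost_per_traffic_unit(cost_share=0.70, traffic_share=0.28)

# How many times more each token routed to Anthropic costs, on average.
ratio = anthropic / deepseek
print(f"{ratio:.1f}x")  # prints 30.6x
```

Under these assumptions the implied gap of roughly 30x falls inside the 20 to 50 times price difference cited for DeepSeek V4 Flash versus Anthropic's frontier models.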

Development teams are managing budgets through intelligent routing: they divert high-frequency, low-risk tasks to low-cost models and use frontier models only at critical points. Return on investment (ROI) considerations have also slowed model upgrades. For instance, Google’s Gemini 3.5 Flash launched in May at a higher price than version 3.0, and migration has been slow; by the end of the month, 3.0 Flash still accounted for 90% of traffic in the Flash series, while 3.5 Flash accounted for only 7%. Meanwhile, AI agents show extremely concentrated token consumption, with more than half of all tokens consumed by just a quarter of requests. (Source: BlockBeats)
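The routing rule described above can be sketched in a few lines. This is an illustration only: the `Task` type, its fields, the `choose_tier` function, and the tier labels are all hypothetical, and the report does not describe how any team actually implements routing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical description of a request to be routed."""
    name: str
    high_frequency: bool
    low_risk: bool


def choose_tier(task: Task) -> str:
    """Send high-frequency, low-risk work to the low-cost tier; all else to frontier."""
    if task.high_frequency and task.low_risk:
        return "low-cost"
    return "frontier"


# Hypothetical tasks, for illustration only.
print(choose_tier(Task("log summarization", high_frequency=True, low_risk=True)))
print(choose_tier(Task("schema migration plan", high_frequency=False, low_risk=False)))
```

A real router would likely weigh more signals, such as measured quality per task type or per-request budget, but the two-flag rule mirrors the split the article describes.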

Vercel: DeepSeek's token usage exceeds OpenAI, accounting for only 1% of total expenses
