Analysis: Chinese and American open-source large models trail the closed-source frontier by only 3 to 6 months, and ultra-low costs are accelerating a global shift to cost-effective alternatives.

According to Beating monitoring, model aggregation service OpenRouter has disclosed that the performance gap between open-source models and closed-source frontier models has held steady at 3 to 6 months. Over the past 18 months, frontier closed-source labs have failed to pull further ahead as expected, while open-source players, led by newcomers from China and the U.S., are displacing closed-source models at an accelerating pace thanks to exceptionally strong cost performance.

DeepSeek V4 Flash, released just two months ago, has quickly become the preferred replacement. With 284 billion parameters, it scored 79.0% on the SWE-bench Verified evaluation, approaching GPT-5.5-level performance. Official first-party pricing is only $0.14 per million input tokens and $0.28 per million output tokens, which makes its output about 150 times cheaper than GPT-5.5's. Even after adding the premium charged by Western cloud hosts that do not retain data for training, the actual cost is only around 1.3% of that of closed-source frontier models.
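The pricing arithmetic above can be checked with a short Python sketch. It uses only the figures quoted in the article. The GPT-5.5 output price is not stated in the source, so the value below is merely the price implied by the "about 150 times" ratio, not an official figure. Applying the 1.3% figure to output pricing, along with the function names and the sample workload, are assumptions made for illustration.

```python
# Figures quoted in the article (USD per million tokens, first-party pricing).
FLASH_INPUT_PRICE = 0.14
FLASH_OUTPUT_PRICE = 0.28
OUTPUT_PRICE_RATIO = 150       # article: output is "about 150 times" cheaper
HOSTED_COST_FRACTION = 0.013   # article: hosted cost is "around 1.3%" of frontier


def implied_frontier_output_price(open_price: float, ratio: float) -> float:
    """Frontier output price implied by the quoted ratio (not an official price)."""
    return open_price * ratio


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in USD for a workload, given prices per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Implied closed-source output price, derived only from the article's ratio.
frontier_output = implied_frontier_output_price(FLASH_OUTPUT_PRICE, OUTPUT_PRICE_RATIO)

# Assumption: the 1.3% figure is applied to output pricing to estimate
# what a Western-hosted deployment would charge per million output tokens.
hosted_output = frontier_output * HOSTED_COST_FRACTION

# Hypothetical workload: 2M input tokens and 0.5M output tokens at first-party prices.
workload_cost = request_cost(2_000_000, 500_000, FLASH_INPUT_PRICE, FLASH_OUTPUT_PRICE)

print(f"Implied frontier output price: ${frontier_output:.2f} per million tokens")
print(f"Estimated hosted output price: ${hosted_output:.3f} per million tokens")
print(f"First-party cost of sample workload: ${workload_cost:.2f}")
```

On these assumptions, the hosted price comes out at roughly twice the first-party output price, which is consistent with the article's point that the hosting premium barely dents the cost advantage.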

Price is not the only advantage. GLM 5.2, released by Zhipu in June 2026, ranks first on the Artificial Analysis open-weight intelligence index and reaches GPT-5.5 level in real-world agent evaluations, making it a replacement option for long-horizon programming and planning tasks. However, GLM 5.2 consumes relatively many tokens during deep thinking, so enterprises need to weigh output costs when deploying it. The multimodal open-source model MiniMax M3, meanwhile, uses an innovative MSA sparse attention architecture to deliver native long-context processing for images and video at a lower token price, making it a strong open-source contender against Gemini Flash.

At the same time, NVIDIA’s Nemotron 3 Ultra, built on the Mamba-2 hybrid architecture, has become the strongest open-source contender from the U.S. NVIDIA aims to use an open ecosystem to drive market demand for its hardware and microservices.

OpenRouter emphasizes that while frontier closed-source models will continue to move forward, token costs at a fixed level of intelligence will keep trending downward, providing enterprises with significant opportunities for cost optimization.
