China’s AI large-model token calls lead the world! The computing power industry chain is witnessing a major explosion.


The computing power concept showed strong performance on April 8. Kepler Co., Ltd., Pingzhi Information, Weida Technology, Zhejiang Wenlian, Data Harbor, Cayman Shares, Litong Electronics, Runian Shares, Mingpu Optics & Magnetics, Fu’an Shares, Auride, Huafu Fashion, Lotus Holding, and Heli Tai all hit the daily limit; 19 stocks including WASU Technology, UCloud, Sinovation Ventures, Zhongji Aochuang, Kunlun Tech, and others surged by more than 10%.

From an overall market perspective, the strength of the computing power concept is driven, on the one hand, by the rapid rise of Agent applications and the multimodal ecosystem in China, and on the other hand, by China's global lead in large-model Token call volume.

On the news front, Zhipu officially released its next-generation open-source model GLM-5.1 on April 8. According to the announcement, it is the only open-source model capable of sustained continuous work at the 8-hour level. On SWE-bench Pro, the benchmark closest to real software development, GLM-5.1 became the first domestic model to surpass Anthropic's Claude Opus 4.6.

In addition, DeepSeek has rolled out an important update: the launch of Expert Mode. As reported by First Finance, this is the first time since it became popular that DeepSeek has introduced a tiered-mode design on the product side. Fast Mode is suited to everyday conversations, responds instantly, and supports text recognition in images and files. Expert Mode handles complex problems and supports deep thinking and intelligent search, but currently does not support file uploads or multimodal functions. DeepSeek also reminds users that in Expert Mode, responses may require waiting during peak periods.

With the popularization of AI large models, the Token call volume of domestic large models now leads globally. Data from the OpenRouter platform shows that in the past week (March 30 to April 5), China's AI large models recorded 12.96 trillion Tokens of weekly calls, up 31.48% from the previous week. In contrast, the United States recorded only 3.03 trillion Tokens, a week-over-week increase of just 0.76%, and the gap continues to widen. Notably, all of the top six models by Token call volume come from China, with Qwen3.6 Plus ranking first at 4.6 trillion Tokens.
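As a quick sanity check on the figures above, the implied previous-week volumes and the current China–US gap can be recovered from the reported numbers (a back-of-the-envelope sketch; all inputs are the article's own figures, in trillions of Tokens):

```python
# Reported weekly Token call volumes (trillions) and week-over-week growth.
china_now, china_wow = 12.96, 0.3148
us_now, us_wow = 3.03, 0.0076

# Implied previous-week volumes: current / (1 + growth).
china_prev = china_now / (1 + china_wow)   # ~9.86 trillion
us_prev = us_now / (1 + us_wow)            # ~3.01 trillion

# Current gap between the two.
gap = china_now / us_now                   # ~4.28x

print(round(china_prev, 2), round(us_prev, 2), round(gap, 2))
```

So the previous week's gap of roughly 9.86 vs. 3.01 trillion has already widened to about 4.28x in the current week.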

China International Capital Corporation (CICC) stated that it is precisely the development of large models that has brought about a leap in computing power demand, which can be divided into three stages:

First stage, Chatbot: a one-question, one-answer format, with short context and limited Token consumption per turn.

Second stage, low-tier Agent: capabilities expand to tool calling, including searching web pages, executing code, querying databases, and calling external APIs, rather than only generating text. Loading and invoking tools lengthens the context, so Token consumption is significantly higher than in pure Chatbot scenarios. According to Anthropic's measured data, a single Agent consumes about 4 times the Tokens of a pure Chatbot.

Third stage, mid-tier Agent: this is the stage AI is entering right now, and the core driving force behind the qualitative change in computing power demand. In a mid-tier Agent's Prefill stage, a large volume of tool definitions, system prompts, and intermediate results must be loaded during execution, so the context length keeps expanding throughout the task. Taking Manus as an example, the average ratio of input Tokens to output Tokens is about 100:1. Meanwhile, Anthropic's measured data shows that a multi-Agent system consumes about 15 times the Tokens of a conversational mode.
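The three stages above can be put into rough numbers. The sketch below scales a baseline Chatbot turn by the multipliers cited in the article (4x for a single Agent, 15x for a multi-Agent system, and a 100:1 input-to-output split); the 2,000-Token baseline per turn is an assumed placeholder, not a figure from the article:

```python
BASELINE_CHATBOT_TOKENS = 2_000   # assumed Tokens per Chatbot turn (placeholder)
SINGLE_AGENT_MULTIPLIER = 4       # Anthropic's cited single-Agent ratio
MULTI_AGENT_MULTIPLIER = 15       # Anthropic's cited multi-Agent ratio
INPUT_OUTPUT_RATIO = 100          # Manus example: input:output of ~100:1

def stage_consumption(baseline: int) -> dict[str, int]:
    """Estimate Tokens per task at each of the three stages."""
    return {
        "chatbot": baseline,
        "single_agent": baseline * SINGLE_AGENT_MULTIPLIER,
        "multi_agent": baseline * MULTI_AGENT_MULTIPLIER,
    }

def split_input_output(total: int, ratio: int = INPUT_OUTPUT_RATIO) -> tuple[int, int]:
    """Split a total Token budget by the cited input:output ratio."""
    output = total // (ratio + 1)
    return total - output, output

est = stage_consumption(BASELINE_CHATBOT_TOKENS)
print(est)                                  # 2,000 / 8,000 / 30,000 Tokens
print(split_input_output(est["multi_agent"]))
```

Under these assumed numbers, a multi-Agent task consumes 30,000 Tokens where a single Chatbot turn consumed 2,000, with almost all of it on the input (Prefill) side, which is exactly why this stage reshapes computing power demand.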

Shanghai Securities stated that the ongoing iteration of domestic large models, along with Token economics igniting computing power demand, may bring an inflection point to China’s domestic computing power industry chain.

As for individual stocks, in the computing power industry chain, computing power chips (Cambricon, Higon Information); optical communications (Zhongji Aochuang, Xin Yi Sheng, Tianfu Communications, etc.); AIDC suppliers (Baoxin Software, Runze Technology, Huanwang New Network, etc.); and liquid cooling (Yingweike, Shenling Environment, WASU Technology, etc.) have drawn relatively more attention from the market.

(Source: Oriental Fortune Research Center)
