2026 Zhongguancun Forum | How China Defines Global Division of Labor in the AI Era with the "World Token Factory"

Source: Global Times

[Global Times Technology Report, Reporter Lin Di] When the “lobster” app OpenClaw went viral on social networks and people were immersed in the novelty of conversing with AI, entrepreneurs at the forefront of the industry saw a different picture: behind every interaction on the screen lies the burning of massive quantities of tokens, and every agent’s birth triggers an exponential explosion in computing-power demand.

At the roundtable session of the AI Open Source Frontier Forum during the 2026 Zhongguancun Forum Annual Meeting, industry representatives and experts from the model, chip, and application layers used the OpenClaw phenomenon to jointly analyze the pains and opportunities of the AI industry’s transition from the “training era” to the “inference era.” They agreed that a new round of growth pressure has arrived on the supply side, and that a reconstruction of token economics is underway.

The transformation from “chat toy” to “productivity tool”

The guests at the conference first focused on the recent viral phenomenon of “OpenClaw.”

Zhipu AI CEO Zhang Peng likened it to a “scaffolding.” “It provides a possibility, building a solid, convenient, yet flexible scaffolding based on the model.” Zhang believes that OpenClaw’s biggest breakthrough is breaking down technical barriers, allowing ordinary people to use top model capabilities without needing to understand code.

Luo Fuli, head of Xiaomi’s MiMo large model team, analyzed it from a technical-architecture perspective, calling OpenClaw a “revolutionary event” in agent frameworks. “It guarantees the lower limit of the base large model while stretching its upper limit.” She pointed out that deep participation by the open-source community has allowed task-completion levels once achievable only by closed-source models to be reached within the open-source framework.

Wukong AI CEO Xia Lixue, however, shared a striking data point that reveals the macro trend behind the phenomenon: “Since the end of January this year, token call volume on the Wukong AI platform has doubled every two weeks, and it has now increased tenfold.” He compared this growth rate to the proliferation of mobile data in the 3G era: “Current token usage is like the early stage when everyone had only 100MB of monthly data; it marks the starting point of the industry’s explosion.”
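The two figures Xia cites are internally consistent and can be sanity-checked with a quick compound-growth calculation: at a fixed doubling period of two weeks, a tenfold increase takes roughly six and a half weeks, which matches growth running since late January. A minimal sketch (the function name is illustrative, not from the article):

```python
import math

# Growth rate cited in the article: token volume doubling every two weeks.
DOUBLING_PERIOD_WEEKS = 2

def weeks_to_multiply(factor: float) -> float:
    """Weeks needed to grow by `factor` given a fixed doubling period."""
    return DOUBLING_PERIOD_WEEKS * math.log2(factor)

# A tenfold increase takes about 6.6 weeks at this rate.
print(round(weeks_to_multiply(10), 1))  # prints 6.6
```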

When “tenfold growth” meets “computing power bottleneck”

The explosive demand for tokens quickly pointed the contradiction towards the supply side. As AI shifts from “conversing” to “working,” the industry is facing a challenging leap from the “training era” to the “inference era.”

The “price increase” strategy Zhipu AI adopted when releasing the GLM-5-Turbo model sparked intense on-site discussion of cost and value. Zhang Peng acknowledged that getting the model to “work” rather than merely “chat” involves vastly different resource consumption: “Making a smart model perform such complex tasks consumes a tremendous number of tokens; completing one task may require ten or even a hundred times the tokens needed to answer a simple question.”

This view was echoed by Wukong AI’s Xia Lixue. As an infrastructure provider, he feels the pressure of resource scarcity acutely: “Currently, computing power resources are far from meeting demand. We must optimize and integrate resources so that the entire society can access AI capabilities equitably.”

Xia Lixue further stated that traditional cloud-computing infrastructure was designed for human engineers, not for AI. “Forcing agent logic onto operating systems built for humans will limit AI’s capabilities.” He illustrated the point: agents can initiate tasks in milliseconds, while existing underlying platforms such as K8s are typically designed around minute-level human tasks, and this “generation gap” leads to system lag and inefficiency.

Building an “intelligent token factory”

In response to the unique demands of the agent era, AI infrastructure (Infra) must undergo reconstruction. Xia Lixue proposed a path of evolution from “standardized token factories” to “intelligent token factories.”

According to reports, Wukong AI is integrating a dozen types of domestic chips and dozens of computing power clusters to achieve software and hardware synergy. “We need to use computing power resources where they matter most, maximizing resource utilization to ensure that every computing unit achieves the highest conversion efficiency,” emphasized Xia Lixue.

However, this is just the first step. Xia Lixue envisions a more long-term future: “The infrastructure of the future will itself be an intelligent entity capable of self-evolution and autonomous iteration, even incorporating agents to act as ‘managers, CEOs,’ automatically optimizing the system based on AI demands.” He refers to this as “Agentic Infra,” a “chemical reaction” where algorithms and infrastructure deeply collaborate.

Luo Fuli also explored, from the perspective of model-structure innovation, how to support this explosive demand. She emphasized the importance of an “Efficient Long Context” architecture: “Keeping inference costs low and speed high on contexts of ten or even a hundred megabytes is essential if genuinely high-productivity tasks are to be assigned to the model.” She noted that only with a significant improvement in computing-power efficiency can the model’s “self-iteration” and “self-evolution” have a physical basis.

Beyond technical discussions, the guests also explored China’s approach from an economic perspective in the AI era.

Xia Lixue proposed the idea of constructing a “China-style token economics.” He focused on the deeper economic linkages: “The development of the AI industry must also emphasize sustainability, with the core being to connect the complete economic chain, transforming energy into computing power, then converting computing power into tokens, and ultimately translating into GDP to form a virtuous cycle.”

He elaborated on this grand vision: “Through efficient token factories, China’s energy and cost advantages can be transformed into high-quality AI services.” He added, “Let China become the world’s token factory and contribute a Chinese solution to the global development of artificial intelligence.”

Finally, Dr. Huang Chao, assistant professor and doctoral supervisor at the University of Hong Kong and head of the Nanobot team, argued that future software will no longer target only the human GUI (graphical user interface) but will be oriented toward an agent-native CLI (command-line interface). “The entire ecosystem needs to come together to rebuild software systems and data in an agent-native mode.”
