Zhipu's first financial report since going public: China's largest large-model company by revenue, with MaaS ARR reaching 1.7 billion yuan.

Ask AI · How does Zhipu’s MaaS model achieve a significant improvement in gross margin?

On March 31, AI model unicorn Zhipu delivered its first annual results since going public. The latest financial report shows that during the year ended December 31, 2025, Zhipu achieved full-year revenue of over 724 million yuan, up 131.9% year over year. By comparison, from 2022 to 2024, the company’s annual revenue was 57 million yuan, 125 million yuan, and 312 million yuan respectively, showing a continuously accelerating growth trend.

In terms of revenue scale, Zhipu has become the largest large-model company by revenue in China. As a reference, another Hong Kong-listed large-model company, MiniMax, had total revenue of about 79.04 million USD in 2025. The two companies' business models have also diverged: MiniMax focuses more on AI-native application products, while Zhipu mainly adopts a Model-as-a-Service (MaaS) approach, providing intelligent services to enterprise customers and developers through API calls.

Breaking down Zhipu’s latest financial report, one core metric worth paying close attention to is that the ARR (annual recurring revenue) of its MaaS API platform is approximately 1.7 billion yuan, up 60 times year over year. ARR is typically used to measure a company’s ability to generate persistent revenue, and it can reflect the health of the business. Under the MaaS model, this metric more directly reflects whether customers keep calling the model and steadily consuming Tokens, rather than relying on one-off projects to drive revenue growth.
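As a quick illustration of the metric, ARR under a usage-based MaaS model is commonly annualized from the latest period of recurring API revenue. A minimal Python sketch follows; the run-rate convention and the monthly figure are assumptions for illustration, not Zhipu's disclosed methodology:

```python
def maas_arr(monthly_recurring_revenue: float) -> float:
    """Annualize the latest month of recurring API revenue (a common
    run-rate convention; one-off project income is excluded)."""
    return monthly_recurring_revenue * 12

# A hypothetical ~141.7M yuan of recurring API revenue in the latest
# month annualizes to ~1.7B yuan of ARR.
print(maas_arr(141_700_000))  # 1700400000
```

The point of the convention is exactly the one the paragraph makes: only revenue expected to recur (steady Token consumption) is annualized, so one-off project income cannot inflate the figure.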

Previously, the market’s perception of Zhipu was more like a project-driven large-model company. But as Zhipu’s revenue mix has shifted toward the MaaS model centered on API calls, its revenue growth logic has also changed: it no longer relies on a single project, but on continuous model usage behavior.

To some extent, Zhipu's path inevitably brings to mind Anthropic, a leading global AI company. On one hand, it continuously strengthens base-model capabilities to raise the model ceiling; on the other, it treats Tokens as the core product form, driving growth through deep usage by the developer ecosystem and enterprise customers. Under this logic, Zhipu is gradually becoming the model company in the Chinese market most closely aligned with Anthropic's development path.

Breaking the MaaS deadlock of "increasing revenue without increasing profit": raising prices without losing volume, and returning to the essence of business

The market has long raised a typical question about the MaaS model: is it prone to a predicament of "increasing revenue without increasing profit"? The reason is that MaaS revenue is directly tied to Token consumption, and those Tokens carry continuously incurred compute costs. When revenue and costs are bound to the same chain, profit margins face ongoing compression as scale expands; one misstep and the business slides into the awkward position of doing more but earning thinner.

However, Zhipu's latest financial report sends fairly positive signals. The gross margin of its MaaS API platform rose significantly, from 3.3% in 2024 to 18.9% in 2025, showing a clear improvement in profitability. Combined with the data disclosed in its earlier prospectus, Zhipu's overall gross margin remained relatively stable over the 2022 to 2024 reporting period, staying above 50% throughout.

Beyond optimizing model inference efficiency to the extreme and compressing Token costs to the lowest possible level, another key factor driving the increase in gross margin is the continued rise in the proportion of high-value top-tier customers.

According to the financial report, more than 4 million enterprise users and developers continuously call Zhipu's model capabilities in real production environments, covering 218 countries and regions worldwide. Among China's top 10 internet companies, 9 use Zhipu's GLM models. Taking GLM-5 as an example, within 24 hours of its release it was officially integrated into multiple top-tier platform products, including ByteDance's TRAE/Coze, Alibaba's Qoder, Tencent's CodeBuddy, and Meituan's CatPaw.

On the other hand, these top-tier customers are more sensitive to model performance and relatively more tolerant of pricing. An important signal: in February this year, Zhipu announced an adjustment to the pricing structure of its GLM Coding Plan packages, with overall price increases starting at 30%, to ensure stability and service quality under high load. Although API prices rose by 83% in the first quarter, usage did not decline; it increased instead. This indicates that customers are genuinely willing to pay for performance rather than compromising on price.
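The revenue logic here, higher prices with usage holding or even growing, can be sketched with simple arithmetic. The unit price and volume figures below are assumptions for illustration, not Zhipu's disclosed data:

```python
# Illustrative only: if price rises 83% and usage does not decline,
# revenue scales by at least the price multiple.

def api_revenue(price_per_mtok: float, mtok_consumed: float) -> float:
    """Revenue = unit price x Token volume (in millions of tokens)."""
    return price_per_mtok * mtok_consumed

before = api_revenue(10.0, 1_000)         # hypothetical baseline
after = api_revenue(10.0 * 1.83, 1_050)   # +83% price, usage up ~5% (assumed)
print(round(after / before, 4))  # 1.9215
```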

For Zhipu, this also creates a positive feedback loop brought about by price increases. Higher prices, to a certain extent, filter for high-value customers who care more about performance. Such customers often have higher retention rates and deeper usage, which further consolidates the business’s quality and ability to grow sustainably.

At a breakout session of the 2026 Zhongguancun Forum annual conference held on March 27, Zhipu CEO Zhang Peng said when discussing model price increases that as models continuously extend their thinking and reasoning paths when processing complex tasks, the number of Tokens required to complete a task can reach 10 times or even 100 times that of simple question-and-answer. Therefore, the corresponding price adjustments are, in essence, a natural result of changes in costs. Improvements in model capability also lead to increased service costs; the hope is to gradually bring these back to a normal commercial value range.
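Zhang Peng's 10x-to-100x point can be made concrete with simple per-task arithmetic; the unit price and token counts below are assumptions for illustration, not GLM's actual pricing:

```python
PRICE_PER_MTOK = 5.0  # yuan per million tokens, assumed for illustration

def task_cost(tokens: int) -> float:
    """Serving cost of one task at the assumed unit price."""
    return tokens * PRICE_PER_MTOK / 1_000_000

simple_qa = task_cost(2_000)         # a short question-and-answer
agent_task = task_cost(2_000 * 100)  # a long-horizon agent task, 100x tokens
print(simple_qa, agent_task)  # 0.01 1.0
```

Because cost scales linearly with Tokens consumed, a price per task that tracks the multiplier is, as Zhang Peng argues, a pass-through of cost rather than a margin grab.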

“Relying on low-price competition for the long term is actually not beneficial to the development of the entire industry, and that is one of our important considerations.” Zhang Peng further pointed out, “We hope that through this approach, we can form a healthier closed loop in our commercialization path—continuously optimizing model capability—and provide everyone with better models and corresponding Token services in a more long-term and stable way.”

At the end of the day, Zhipu’s core judgment can be summarized as: the intelligent ceiling determines pricing power, and the scale of Token consumption determines the size of value.

In other words, the stronger the model's capability, the harder it is to replace in key scenarios, and the greater the pricing power it can support. Only when a model sustains large-scale, continuous calls and forms stable Token consumption can its commercial value truly be realized. In essence, the commercial value of AGI is the combined result of the intelligent ceiling and the scale of Token consumption.

When Tokens become a new currency, Zhipu provides a new AI value measurement framework

Looking closely at Zhipu's first annual financial report, it is not hard to see that a positive flywheel built around MaaS is taking shape at an accelerating pace. Specifically, as model capability improves, it attracts more enterprises and developers to connect; the expanding connection scale drives up Token call volume and pushes revenue growth; and sustained revenue growth in turn funds model training and compute investment, further strengthening model capability. Through this cycle of escalation, a self-reinforcing growth loop forms.

In this MaaS flywheel, the most critical variable is continuous improvement in model capability itself. Zhipu has proposed that the improvement of the intelligent ceiling is the only “first principle” in the era of large models and even general artificial intelligence.

Over the past year, Zhipu's base models have completed more than five major iterations, evolving continuously from GLM-4.5 to GLM-5-Turbo. On authoritative evaluation leaderboards such as Artificial Analysis, the GLM series ranks in the global top tier, second only to advanced models such as Google's Gemini, OpenAI's GPT, and Anthropic's Claude, while leading most other domestic models.

A more obvious change is that the GLM series is evolving from “knowledge-oriented” to “task-oriented.” It is no longer limited to a question-and-answer style knowledge base; instead, it has the agent capability to independently complete complex tasks. GLM-5 is precisely the product of transformation under the “Agentic Engineering” trend. According to official descriptions, GLM-5 has achieved open-source SOTA performance in Coding and Agent capabilities. In real programming scenarios, the user experience is already close to Claude Opus 4.5—especially strong in complex systems engineering and long-horizon Agent tasks.

The latest GLM-5-Turbo, launched earlier this month, is positioned as a base model deeply optimized for OpenClaw scenarios. During training, the model was specially optimized for the core requirements of OpenClaw-style agent tasks, strengthening key capabilities such as tool calling, instruction following, scheduled and long-running tasks, and long-chain execution, and effectively addressing many problems that general models face in real OpenClaw scenarios.

Recently, the open-source Agent project OpenClaw has sparked a deployment boom at home and abroad. However, because OpenClaw has a high barrier to local deployment and extremely high Token consumption costs, many users have turned to "one-click deployment" solutions offered by domestic cloud and model vendors. Foundation-model companies such as Zhipu are among the beneficiaries of this OpenClaw wave.

On March 10, Zhipu officially launched AutoClaw (Australian Dragon), defining it as "the first locally deployed OpenClaw in China with true one-click installation." AutoClaw bundles 50+ mainstream Skills and APIs and supports one-click integration with instant messaging tools such as Feishu. On OpenRouter, the world's largest AI model API aggregation platform, GLM-5-Turbo's call volume reached 966 billion Tokens this week, ranking among the global top ten, and Zhipu has become one of the domestic vendors with the highest paid Token consumption.

Tokens are gradually becoming the "new currency" of the intelligent-economy era. As the smallest unit by which models process information, Token usage is widely regarded in the industry as a key indicator of model activity and the actual scale of work processed. NVIDIA founder and CEO Jensen Huang has bluntly called Tokens "the new commodity," and Alibaba has established a new business group with Token Hub as a core line. Meanwhile, a new generation of complex Agents, represented by OpenClaw, is pushing Token consumption into an exponential growth phase, signaling that a paradigm of surging Token use is fast approaching.

With the rapid expansion of Token consumption scale, the industry also urgently needs a new framework to measure how Tokens are converted into real value more efficiently. Against this backdrop, Zhipu proposed the concept of “Token Architecture Power” (TAC), defining it as: TAC = the quantity of intelligent calls × the quality of intelligent output × the efficiency of converting into economic value.

Specifically, “quantity” refers to the number of Tokens that enterprises and individuals call each day, as well as the task scale delivered for AI processing. “Quality” measures whether the model that Tokens rely on is smart and reliable enough to output deliverable results in complex scenarios. “Efficiency” focuses on whether the right scenarios can be found so that AI truly converts into measurable economic output.
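Taken literally, the TAC definition above is a simple product of three factors. The scales and example values below are assumptions for illustration; the report does not disclose a concrete scoring method:

```python
def tac(call_volume_tokens: float, output_quality: float,
        conversion_efficiency: float) -> float:
    """Token Architecture Power as stated: quantity of intelligent calls
    x quality of output (scored 0..1 here, e.g. share of deliverable
    results) x efficiency of converting output into economic value (0..1)."""
    return call_volume_tokens * output_quality * conversion_efficiency

# Because the factors multiply, doubling quality or efficiency lifts TAC
# as much as doubling raw call volume.
baseline = tac(1e9, 0.5, 0.5)
print(tac(2e9, 0.5, 0.5) == tac(1e9, 1.0, 0.5) == 2 * baseline)  # True
```

The multiplicative form captures the article's claim that volume alone is not value: a large Token bill with near-zero conversion efficiency still yields a low TAC.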

In the long run, the core competitiveness of organizations and individuals will increasingly depend on their TAC level. Compared with a model provider that only offers large models, Zhipu’s goal is more like building TAC infrastructure for the whole society—helping various organizations and individuals efficiently schedule and use intelligent resources, and continuously convert them into economic value that is actionable and measurable.

And this development path has received preliminary validation in Zhipu's latest financial disclosures. In retrospect, the core significance of this report may be that Zhipu demonstrates a relatively clear and self-consistent growth logic: continuously raising the intelligent ceiling strengthens pricing power, which drives API revenue growth and steadily improves the overall gross-margin structure. In the process, the sustainability of its business model is also validated.

From the perspective of a "China-version Anthropic," Zhipu's valuation logic should no longer be anchored to the old framework of traditional software vendors or project-based companies. It should instead shift to a measurement system centered on the MaaS platform, focusing on platform penetration, Token consumption scale, and the ecosystem influence that TAC represents. In this new coordinate system, Zhipu's long-term upside has only just opened up.
