I've noticed a strange pattern while watching the AI industry. Eight years ago, a US embargo brought one of China's largest telecom companies to the brink of collapse in a matter of weeks. Yet today, Chinese AI companies are growing rapidly under far heavier pressure. What has actually changed?

Let's go back to 2018. ZTE was one of the largest telecommunications equipment manufacturers in the world—80,000 employees, billions in annual revenue. Then a single denial order from the US Bureau of Industry and Security cut it off overnight: no American components, no Google Android license, no operating system. Three weeks later, ZTE announced that its major operating activities had ceased. It eventually paid a $1.4 billion penalty, but the real problem was the ecosystem—ZTE was completely dependent on a global supply chain controlled by the US.

Today, even with similar restrictions, Chinese AI companies are not suffering the same fate. Why? Because the problem isn’t just hardware. The real bottleneck is CUDA.

I bring this up because most people assume the chip ban is about the chips themselves. It isn't. CUDA—NVIDIA's parallel computing platform, launched in 2006—is the real moat. Every major AI framework, from Google's TensorFlow to Meta's PyTorch, depends deeply on CUDA. It is among the first tools an AI researcher learns, and every line of code they write strengthens NVIDIA's ecosystem.
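To see how deep the dependency runs, look at everyday framework code: the CUDA backend is addressed by name. A minimal PyTorch sketch (assuming `torch` is installed; on a non-NVIDIA stack, the `cuda` device simply isn't there):

```python
import torch

# Everyday PyTorch code names the CUDA backend directly. Swapping in
# different hardware means this check fails, and everything downstream
# needs a different device string and backend.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = torch.nn.Linear(1024, 1024).to(device)
x = torch.randn(8, 1024, device=device)
y = model(x)  # on NVIDIA hardware this dispatches to CUDA/cuBLAS kernels
print(device, y.shape)
```

Multiply this pattern across millions of codebases and the lock-in is clear: the hardware is replaceable in principle, but the device string is everywhere.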

By 2025, the CUDA ecosystem counted 4.5 million developers, over 3,000 GPU-accelerated applications, and 40,000 companies worldwide—roughly 90% of the world's AI developers. It's a flywheel: more developers attract more tools, more tools attract more developers, and once it spins up it's almost impossible to stop. The result? NVIDIA sets the rules, and everyone else follows.

So between 2022 and 2024, the US government rolled out three waves of restrictions on NVIDIA chip exports: first the A100 and H100, then the A800 and H800, then the H20. Yet none of them triggered ZTE-style panic. Why? Because Chinese companies pivoted to algorithm optimization rather than fighting the hardware restrictions head-on.

DeepSeek is the best example. Its V3 model has 671 billion parameters, but each inference activates only 37 billion—just 5.5% of the total. Training it took only 2,048 NVIDIA H800 GPUs running for 58 days, at a cost of about $5.576 million. Compare that with an estimated $78 million for GPT-4: an order-of-magnitude difference.
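The mixture-of-experts arithmetic is easy to verify, and a toy top-k router shows why only a fraction of the parameters fire per token. This is a schematic sketch, not DeepSeek's actual routing code; the expert counts and dimensions below are made up for illustration:

```python
import numpy as np

# Published figures: 671B total parameters, 37B activated per token.
total_params, active_params = 671e9, 37e9
print(f"active fraction: {active_params / total_params:.1%}")  # ~5.5%

# Reported DeepSeek training budget vs. the estimated GPT-4 figure.
print(f"cost ratio: {78e6 / 5.576e6:.1f}x")  # ~14x, an order of magnitude

# Toy top-k gating: a router scores the experts, and only the k
# best-scoring experts actually run for this token.
def route(hidden, router_weights, k=2):
    scores = hidden @ router_weights   # one logit per expert
    return np.argsort(scores)[-k:]     # only these experts' params are used

rng = np.random.default_rng(0)
hidden = rng.normal(size=16)           # a token's hidden state (toy size)
router = rng.normal(size=(16, 8))      # 8 hypothetical experts
print("experts activated for this token:", route(hidden, router))
```

The point of the design: total capacity scales with the number of experts, but per-token compute scales only with the few experts the router picks.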

Pricing speaks even louder. DeepSeek's API charges $0.028 to $0.28 per million input tokens and $0.42 per million output tokens. GPT-4 charges $5 for input and $15 for output; Claude Opus is pricier still at $15 and $75. Depending on the traffic mix, DeepSeek works out 25 to 75 times cheaper—a gap that triggered a massive shift in the developer market.
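Taking the quoted per-million-token prices at face value, the ratios are straightforward to recompute. A sanity-check sketch using the article's own figures (the exact multiple depends on the input/output mix and, for DeepSeek, the cache-hit rate behind the $0.028–0.28 input range):

```python
# Per-million-token prices quoted above (USD); DeepSeek input taken
# at the high end of its quoted range.
prices = {
    "DeepSeek":    {"in": 0.28,  "out": 0.42},
    "GPT-4":       {"in": 5.00,  "out": 15.00},
    "Claude Opus": {"in": 15.00, "out": 75.00},
}

for name, p in prices.items():
    if name == "DeepSeek":
        continue
    print(f"{name}: input {p['in'] / prices['DeepSeek']['in']:.0f}x, "
          f"output {p['out'] / prices['DeepSeek']['out']:.0f}x DeepSeek's price")
```

The headline 25–75x figure sits inside this spread; at the low end of DeepSeek's input range the input-side multiple is larger still.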

By February 2026, on OpenRouter—the largest AI-model API aggregation platform—weekly usage of Chinese models had jumped 127% in three weeks, overtaking US models for the first time. A year earlier, Chinese models accounted for less than 2% of the broader market; that share has since grown 421% and is approaching 6%. But the deeper shift isn't just price. Since mid-2025, the dominant AI workload has moved from chat to agents, and agent scenarios burn 10 to 100 times more tokens than simple chat. When token consumption explodes, price becomes the deciding factor—and the extreme cost efficiency of Chinese models landed exactly in that window.
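To make the agent-era economics concrete, here is a back-of-the-envelope sketch. The workload numbers (tokens per chat turn, a 50x agent amplification, 1,000 tasks a day) are hypothetical placeholders chosen within the 10–100x range quoted above, and the blended per-million-token prices are rough assumptions derived from the figures earlier:

```python
# Hypothetical workload: one chat turn vs. one agent task.
chat_tokens  = 2_000                 # tokens for a single chat exchange (assumed)
agent_tokens = chat_tokens * 50      # agents loop over tools: 10-100x more tokens

def cost_usd(tokens: int, price_per_million: float) -> float:
    return tokens / 1e6 * price_per_million

# Assumed blended prices (USD per million tokens, input-heavy mix).
for name, price in [("DeepSeek", 0.30), ("GPT-4", 8.00)]:
    monthly = cost_usd(agent_tokens, price) * 1_000 * 30  # 1,000 tasks/day, 30 days
    print(f"{name}: ${monthly:,.0f}/month for the same agent workload")
```

Under these assumptions the same agent fleet costs hundreds of dollars a month on one price schedule and tens of thousands on the other—which is why token-hungry agents made price the deciding variable.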

But algorithm optimization alone doesn't solve the training problem. A model that can't be retrained on fresh data and iterated quickly becomes obsolete, and training demands massive computing power. So where are Chinese companies getting their compute infrastructure?

There's a small city in Jiangsu called Xinghua—previously known only for stainless steel and health foods—where, in 2025, a 148-meter server production line was built. From signing the agreement to going operational took just 180 days. At its core are two fully domestic chips: the Loongson 3C6000 processor and the TaiChu Yuanqi T100 AI accelerator card. Loongson is designed in-house from the instruction set down to the microarchitecture; TaiChu Yuanqi, a heterogeneous many-core design, comes out of the National Supercomputing Center in Wuxi and Tsinghua University.

At full capacity, the line turns out one server every 5 minutes. Total investment: 1.1 billion yuan, with an expected output of 100,000 units a year. The key point is that clusters of thousands of domestic chips have started handling real large-model training. In January 2026, Zhipu AI and Huawei released GLM-Image, China's first state-of-the-art image generation model trained entirely on domestic chips. In February, China Telecom completed full training of its hundred-billion-parameter Xingchen model on a domestic compute pool of thousands of GPUs in Shanghai's Lingang district.

The significance is simple: domestic chips have crossed from inference-only to training-capable, and that is a qualitative change. Inference just runs a pre-trained model forward, with relatively modest hardware demands. Training involves massive data throughput, complex gradient computation, and constant parameter updates—demanding far more raw compute, interconnect bandwidth, and software-ecosystem maturity.
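The difference is visible in a few lines of framework code: inference is a forward pass, while training adds backpropagation and an optimizer step (and, at scale, gradient synchronization across devices). A minimal single-device PyTorch sketch:

```python
import torch

model = torch.nn.Linear(512, 512)
x, target = torch.randn(32, 512), torch.randn(32, 512)

# Inference: forward pass only, no gradients, modest memory footprint.
with torch.no_grad():
    _ = model(x)

# Training: forward + loss + backward + parameter update. Storing
# gradients and optimizer state multiplies memory use; at cluster scale,
# synchronizing gradients across thousands of accelerators is what
# stresses interconnect bandwidth.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss = torch.nn.functional.mse_loss(model(x), target)
loss.backward()        # gradient computation
optimizer.step()       # parameter update
optimizer.zero_grad()
```

Everything after the `no_grad` block is what inference-only chips never had to support—which is why training-capable domestic silicon marks a qualitative step.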

The driving force behind this is Huawei's Ascend series. By the end of 2025, the Ascend ecosystem had grown to 4 million developers and 3,000+ partners, with 43 major models pre-trained on Ascend and over 200 open-source models adapted to it. On March 2, 2026, at MWC, Huawei introduced a new generation of SuperPoD compute infrastructure for overseas markets. The Ascend 910B's FP16 throughput is now comparable to NVIDIA's A100. Gaps remain, but the hardware has gone from unusable to usable—and the strategy is not to wait for perfect chips. Deploy widely while the hardware is good enough, and let real business demand drive the next round of chip and software iteration.

ByteDance, Tencent, and Baidu are all expected to double their domestic-server deployment targets in 2026 compared with last year. According to the Ministry of Industry and Information Technology, China's intelligent computing capacity has reached 1,590 EFLOPS. 2026 is shaping up as the year of mass domestic compute deployment.

But there's another, equally important constraint: energy. Virginia, which carries a massive share of the world's data center traffic, has paused new data center permits. Georgia has paused approvals until 2027; Illinois and Michigan have issued restrictions. According to the International Energy Agency, US data center electricity consumption reached 183 terawatt-hours in 2024, roughly 4% of national consumption. By 2030 it is expected to more than double to 426 TWh, possibly exceeding 12%. Arm's CEO has said that by 2030, AI data centers alone could consume 20-25% of US electricity.

The US grid is at its limits. The PJM grid covering 13 eastern states faces a 6 GW capacity shortfall. By 2033, the US as a whole faces a 175 GW shortage—equivalent to the electricity use of 130 million households. In regions with concentrated data centers, electricity prices are up 267% from five years ago. The boundary of computing power is energy.

On energy, the gap between China and the US is even larger than the chip gap—and it runs the other way. China generates 10.4 trillion kilowatt-hours a year versus the US's 4.2 trillion, roughly 2.5 times as much. More importantly, households account for only 15% of China's consumption versus 36% in the US, leaving far more industrial capacity available for building out compute.

Then there's price: regions hosting US AI companies pay $0.12 to $0.15 per kilowatt-hour, while industrial rates in western China run around $0.03—a quarter to a fifth of US prices. Stack the generation gap on top of the price gap, and China's overall electricity advantage works out to roughly sevenfold.
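The component ratios can be recomputed directly from the figures quoted above; the sevenfold figure is a composite claim whose exact derivation depends on how the generation and price gaps are combined. A sanity-check sketch:

```python
# Figures quoted above.
gen_cn, gen_us = 10.4e12, 4.2e12                # annual generation, kWh
household_cn, household_us = 0.15, 0.36         # household share of consumption
price_us = (0.12 + 0.15) / 2                    # $/kWh, US midpoint
price_cn = 0.03                                 # $/kWh, western China industrial

print(f"generation ratio:       {gen_cn / gen_us:.1f}x")           # ~2.5x
non_household = gen_cn * (1 - household_cn) / (gen_us * (1 - household_us))
print(f"non-household capacity: {non_household:.1f}x")              # ~3.3x
print(f"industrial price gap:   {price_us / price_cn:.1f}x cheaper")  # ~4.5x
```

These are the ratios the quoted numbers directly support; folding generation and price together is what pushes the composite advantage toward the sevenfold mark.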

While America worries about power, Chinese AI is quietly expanding abroad. This time, though, the export isn't products or factories—it's tokens. Tokens, the smallest units of text an AI model processes, have become a new digital commodity: produced in Chinese computing plants and shipped worldwide over undersea cables.

DeepSeek's user distribution tells the story: 30.7% from China, 13.6% from India, 6.9% from Indonesia, 4.3% from the US, 3.2% from France. The model supports 37 languages and is highly valued in emerging markets such as Brazil. An estimated 26,000 companies worldwide hold accounts, and 3,200 institutions use enterprise versions. In 2025, 58% of new AI startups integrated DeepSeek into their stacks. In China, DeepSeek holds an 89% market share; in other markets it has entered, its share runs from 40% to 60%.

The scene echoes an industrial war fought four decades ago. Tokyo, 1986: under intense US pressure, the Japanese government signed the US-Japan Semiconductor Agreement. Its main provisions: Japan would open its semiconductor market, foreign chips were to reach at least 20% market share, and below-cost exports were banned. When Washington judged Japan non-compliant the following year, it imposed 100% tariffs on $300 million of Japanese goods, and it blocked Fujitsu's acquisition of Fairchild. Japan's semiconductor industry was then at its peak: by 1988, Japan controlled 51% of the global market against the US's 36.8%, and six of the world's top ten semiconductor companies were Japanese—NEC second, Toshiba third, Hitachi fifth, Fujitsu seventh, Mitsubishi eighth, Panasonic ninth.

After the agreement, everything changed. The US wielded Section 301 investigations against Japanese semiconductor firms while backing Korea's Samsung and SK Hynix to undercut them on price. Japan's DRAM share plummeted from 80% to 10%; by 2017 its share of the IC market was just 7%. The former giants split up, were acquired, or exited amid endless losses.

Japan's semiconductor tragedy lay in its contentment with a global division of labor led by others—being the world's best manufacturer while never building an independent ecosystem. When the tide went out, it found it had nothing but manufacturing.

Today's Chinese AI industry faces similar pressure but is fighting a very different war. Yes, the external squeeze is severe: three waves of chip restrictions, continuously tightened, with the CUDA ecosystem barrier still high. The difference is that China has chosen the harder path—extreme algorithm optimization, domestic chips advancing from inference to training, 4 million developers gathered in the Ascend ecosystem, tokens spreading globally. Every step builds the independent industrial ecosystem Japan never had.

On February 27, 2026, three domestic AI chip companies reported results. Cambricon's revenue grew 453%, reaching full-year profitability for the first time. Moore Threads' revenue rose 243%, but with a net loss of 1 billion yuan. Muxi's revenue grew 121%, with a net loss of nearly 8 billion yuan. Half fire, half ice. The fire is market hunger: the space vacated as NVIDIA's China share collapsed from roughly 95%—by Jensen Huang's own account—is being filled by domestic companies' revenue growth, target after target. Whatever the chips' performance, however immature the ecosystem, the market needs a second option wherever NVIDIA can't sell. It's a rare structural opportunity pried open by geopolitics.

The ice is that building an ecosystem is expensive. Every loss is real money spent chasing the CUDA ecosystem: developer learning costs, software subsidies, engineers traveling to customer sites to debug compilation issues. These losses don't reflect poor operations—they're the war tax of building an independent ecosystem. The three earnings reports depict the compute-power war more truthfully than any industry report: not a victory celebration, but a brutal war of position, advancing while bleeding.

But the nature of the war has genuinely changed. Eight years ago, the question was "Can we survive?" Now it's "What price must we pay to survive?" That the question is now about cost is itself progress.