From Power Infrastructure to Token Economy: The "Seven-Layer Cake" of the AI Industry Chain

Author: Rhythm BlockBeats

Repost: Mars Finance

The driving force of the AI era has shifted from models to Tokens

Over the past two years, the story of the AI industry’s first half has revolved around the "Big Model War" waged by the tech giants. Parameter counts have grown from hundreds of billions to trillions, training costs have risen from tens of millions to hundreds of millions of dollars, and GPU clusters have expanded from thousands to tens of thousands of cards. Everyone has been debating whose model is stronger and who is closer to AGI, as if the outcome of the AI race depended solely on the performance of the large models themselves.

But by 2026, the driving logic of the AI industry has changed. JPMorgan’s latest report suggests that the true driver of sustained expansion in AI infrastructure will no longer be model training, but the massive demand for AI inference. In the future, the most resource-consuming activity will not be training large models, but the AI Agents distributed worldwide. Every call, every interaction, every task execution fundamentally consumes Tokens. The AI industry is transitioning from the "Model Era" to the "Token Industrial Era."

What will truly power the AI world is not just the models themselves but the systems for producing, distributing, scheduling, and consuming Tokens that are built around them. As AI Agents appear at scale, how Tokens are generated in real time, distributed across regions, dynamically scheduled, and efficiently consumed will become the most critical new problem for the entire AI industry.

As NVIDIA CEO Jensen Huang (Huang Renxun) recently argued, AI is not a simple software industry but a foundational infrastructure system similar to electricity and the internet. His "Five-Layer Cake" architecture divides the AI industry into five layers: Energy, Chips, Infrastructure, Models, and Applications. As the AI industry gradually shifts from the "Training Era" to the "Inference Era," GoodVision AI prefers to understand the entire AI economic chain as a "Seven-Layer Cake" structure centered on Tokens:

Layer 1: Power — The Energy Foundation of the AI Era
Layer 2: AIDC — Token Factory
Layer 3: GPU — Token Production Equipment
Layer 4: LLM — Token Production Engine
Layer 5: Token Distribution — The "Power Grid" of the AI Era
Layer 6: Token Optimization and Intelligent Scheduling — The Brain of the AI Era
Layer 7: AI Agent — The Token Consumption Terminal

From energy and GPUs to AIDC, edge nodes, and then to model inference and intelligent scheduling, the AI industry is forming an unprecedented "Token Industrial System."

However, at this stage, this system is still far from mature.

Some possess the most advanced GPUs but are limited by energy; some have built large-scale AIDC but lack efficient scheduling; some have developed powerful AI Agents but face high inference costs and latency; some control edge nodes but cannot form a unified, coordinated network. Although the entire industry chain is developing rapidly, considerable fragmentation, redundancy, and efficiency bottlenecks remain between the layers.

Only when these seven infrastructure layers are truly interconnected and coordinated will the AI industry move from today’s "Tool Era" into the "Large-Scale Adoption Era" belonging to the intelligent world.

Layer 1: Power — The Energy Foundation of the AI Era

The industrial revolution fought over coal and oil; the internet era fought over traffic and servers; but in the AI era, the fundamental battle is returning to energy.

AI ultimately runs on electricity. The power consumption of a large AI data center is already comparable to that of a medium-sized city. New AI data centers worldwide face the same problem: GPUs can be purchased and land can be acquired, but power supply cannot keep up and grid dispatch capacity is insufficient.

This is why more and more AI companies are refocusing on energy infrastructure. At GTC 2026, Jensen Huang (Huang Renxun) even defined future data centers as "Token Factories." Upstream of these factories, a super-scale energy industry will emerge.

In the Chinese market, companies like Yangtze River Power, China Nuclear Power, China General Nuclear, Three Gorges Energy, Longyuan Power, and Huadian New Energy represent core energy sectors such as hydropower, nuclear, wind, and solar. Among them, nuclear and hydropower, with their stable power supply, are becoming the most important foundational energy sources for AIDC; while wind and solar benefit from the increasing demand for green electricity and ESG in the AI industry. With the advancement of "East Data West Computing" and large AI data center construction, the synergy between renewable energy bases and computing centers is rapidly strengthening.

In the US market, traditional energy giants like NextEra Energy, Dominion Energy, Duke Energy, Southern Co., and Exelon are also benefiting from AI data center expansion. NextEra is North America’s leader in green power; Dominion controls key transmission resources in Northern Virginia’s "Data Center Corridor"; Exelon, with its stable nuclear power supply, is a major beneficiary of the demand for "all-weather, high-stability electricity" in the AI era. Overall, the global power industry is gradually upgrading from a traditional utility business to a core resource layer of AI infrastructure.

Competition in this layer is shifting: traditional energy companies once competed on electricity prices, whereas AI data centers, cloud providers, and energy companies now compete to lock in electricity supply. Whoever can secure long-term, stable, low-cost energy will hold the first Dragon Ball of Token production.

Layer 2: AIDC — Token Factory

A single GPU is meaningless; what matters is large-scale clusters. This is what gave rise to the AI data center (AIDC).

It resembles industrial-era steel mills, power plants, and assembly lines, concentrating thousands of GPUs to form stable Token production capacity. But problems also began to surface: traditional AIDC construction cycles often take 18 to 36 months, and grid expansion can take even longer. When AI demand grows exponentially, the speed of traditional IDC construction can no longer meet the new Token economy needs.

In the US stock market, Equinix is one of the world's leading data center operators, with over 240 data centers across more than 30 countries. Its core advantage is not just the number of data centers but its global interconnection capability and low-latency network resources, making it an important infrastructure node for AI compute deployment.

Digital Realty, through its PlatformDIGITAL, is entering the AI infrastructure space, serving large cloud providers and financial institutions.

In China, Runze Technology is one of the most typical AIDC operators listed on the A-share market. Its main business has gradually upgraded from traditional IDC to AI compute centers, with core competitiveness in large-scale data centers, power resources, and AIDC operations. Companies like Aofei Data and Capital Online are expanding in regional data centers, cloud infrastructure, and AI compute hosting. Sugon focuses on AIDC collaborations with government, enterprise, and scientific research customers.

Another category of players comes from "mining farm transformation." Companies like CoreWeave, IREN, Applied Digital, and Cipher Mining, originally involved in cryptocurrency mining, have rapidly shifted toward AI compute infrastructure as AI GPU demand surges. IREN emphasizes "green power + AI compute" by building high-density GPU data centers with renewable energy. Applied Digital and Cipher Mining are also transitioning from traditional mining farms to high-performance AI computing infrastructure.

Additionally, a new trend is emerging: edge, small-scale, modular AI factories. Just as the internet era moved from mainframes to cloud computing, AI compute is gradually spreading from ultra-large centers to regional edge nodes.

Therefore, GoodVision AI has chosen a different path: building more lightweight, modular, and quickly replicable AI factories. Compared to traditional large-scale AIDC, GoodVision AI emphasizes regional deployment, high-density GPU cluster efficiency, and integrated energy and compute coordination.

Its core logic is not to build a single mega data center but to rapidly deploy AI factory nodes in high-density population regions worldwide, typically small 2-4MW inference compute rooms. This model can access local energy resources more quickly and aligns with the future trend of AI inference expanding to the edge.
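As a rough illustration of what a 2-4MW inference room might hold, the following Python sketch converts a site power budget into an approximate GPU count. The function name, the 1.5 kW all-in power per GPU, and the PUE of 1.25 are illustrative assumptions, not figures from GoodVision AI or from this article.

```python
def gpus_supported(site_power_mw: float, gpu_power_kw: float, pue: float) -> int:
    """Estimate how many GPUs a site's power budget can support.

    site_power_mw: total facility power available, in megawatts.
    gpu_power_kw:  all-in IT power per GPU (the GPU plus its share of CPU,
                   memory, and networking), in kilowatts. Assumed input.
    pue:           power usage effectiveness (facility power / IT power).
    """
    if site_power_mw <= 0 or gpu_power_kw <= 0 or pue < 1.0:
        raise ValueError("power values must be positive and PUE must be >= 1.0")
    # Only the IT share of facility power reaches the servers;
    # the rest goes to cooling and power-conversion overhead.
    it_power_kw = site_power_mw * 1000 / pue
    return int(it_power_kw // gpu_power_kw)


# Hypothetical inputs for the 2 MW and 4 MW ends of the range.
for site_mw in (2.0, 4.0):
    count = gpus_supported(site_mw, gpu_power_kw=1.5, pue=1.25)
    print(f"{site_mw} MW site: about {count} GPUs")
```

Under these assumed inputs, a room of this size holds on the order of one to two thousand GPUs, which is consistent with the article's framing of such rooms as small regional nodes rather than mega data centers.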

If a traditional AIDC is like an industrial-era steel mill, then GoodVision AI’s approach is more like a "Regional Token Factory" of the AI era: lighter, more flexible, closer to users, and better suited to the future development of globally distributed inference networks.

Layer 3: GPU — The Production Equipment of Tokens

If power is the energy, then GPUs are the production equipment. In the early years of the AI boom, GPUs mainly served training, but in the future the larger demand will come from inference. Training is limited to a few leading companies, while inference will permeate every application, device, and terminal. Robots need inference, autonomous driving needs inference, AI glasses need inference, and future collaboration among AI Agents will consume Tokens constantly and in real time.

NVIDIA remains the absolute core of the global AI chip industry. Its GPUs, such as the H100 and the Blackwell-generation B200, almost define the current global standards for AI training and inference. More importantly, NVIDIA not only sells chips but also builds a complete ecosystem through CUDA, TensorRT, DGX, HGX, and other hardware and software systems. Its competitors must therefore challenge not only GPU performance but also the entire AI software ecosystem.

AMD is currently the main challenger, with core products including MI300X and other AI GPUs. Compared to NVIDIA, AMD emphasizes open ecosystems and the ROCm software platform, aiming to attract AI developers and enterprise clients through a more open approach.

Broadcom and Marvell represent another route—ASICs and high-speed interconnects. As AI inference scenarios become more complex, more companies are attempting to customize ASIC chips for higher efficiency and lower costs.

Intel is entering the AI market through server CPUs and Gaudi AI accelerators, hoping to leverage its CPU ecosystem to re-engage in AI infrastructure competition.

In China, Cambricon is one of the most representative domestic AI chip companies, promoting the Siyuan series of AI chips and developing its own AI framework, Neuware. Hygon holds AMD Zen architecture licenses and focuses on the DCU and inference markets.

Other domestic GPU companies like Moore Threads, Enflame (Suiyuan Technology), MetaX (Muxi), and Biren Technology represent China’s "domestic substitution" in AI chips. They generally emphasize compatibility with the CUDA ecosystem and are building domestic GPU clusters.

From CUDA ecosystem to HBM memory and Tensor Cores, the core of the entire AI industry is continuously improving the "Token generation efficiency per unit time." Meanwhile, GPUs and their supporting infrastructure—servers, optical modules, liquid cooling, switches—are closely related to Token production efficiency.

These components may not be as flashy as NVIDIA or OpenAI, but they determine whether the entire AI world can operate smoothly. Just as the industrial revolution required not only steam engines but also railways, power grids, and ports, the AI revolution is not just a software revolution but a global industrial upgrade covering energy, chips, networks, cloud computing, and infrastructure.

Vertiv is a global leader in data center UPS and power management, providing data center power supply, cabinet distribution, and precision cooling systems.

Envicool is a leading A-share provider of liquid cooling and temperature control systems, serving clients including BAT and other large internet companies. As GPU power draw increases, liquid cooling is becoming standard equipment for AIDC.

Zhongheng Electric, Kehua Data, and Kstar are important players in UPS, power systems, and IDC power supply fields.

In networking and optical modules, companies like Zhongji Innolight, Eoptolink, and Tianfu Communication (TFC) benefit from the surge in demand for high-speed communication inside AI clusters.

In server systems, Dell, HPE, Supermicro, Lenovo, and Inspur are responsible for large-scale AI server assembly and delivery.

Although this layer does not directly face end users, it determines whether AI infrastructure can operate stably. Liquid cooling, UPS, optical modules, switches, energy storage, and server systems are like the railways, power grids, and ports of the industrial age, and they are becoming the real "picks-and-shovels" sellers of the AI world.

Layer 4: LLM — The Token Production Engine

Large Language Models (LLMs) determine how Tokens are understood, generated, and organized. Over the past two years, companies like OpenAI, Anthropic, Google, Meta, xAI, and DeepSeek have waged a global "Big Model Race." Parameter counts have grown from hundreds of billions to trillions, and model capabilities have expanded from text generation to multimodality, reasoning, coding, Agent collaboration, and long-term memory.

But as the industry develops, the market is beginning to realize: the truly important factor in the future is no longer "who has the biggest model," but "who can run models continuously at lower cost and higher efficiency." Because models themselves do not directly create value; the real value is generated through the inference process after models are called repeatedly.

This also means that LLMs are evolving from "demonstrating model capabilities" to becoming the "Token Production Engines" of the AI world.

OpenAI, Anthropic, Google Gemini, Meta Llama, and other closed-source and open-source models are competing for the future AI ecosystem entry points; while emerging players like DeepSeek are reshaping industry competition through lower costs and higher inference efficiency. Now, the competition at the LLM layer is gradually shifting away from parameter count battles toward multi-dimensional comparisons:

Token cost, inference efficiency, context capacity, multi-agent collaboration, long-term memory, model and infrastructure synergy

What truly matters in the AI era is not just whether a large model is "smart," but whether it can be operated continuously, at massive scale, and cost-effectively worldwide. GoodVision AI has its own optimization plan at this layer: by collaborating with large model providers to deploy models in its AI Factory data centers, it is moving from traditional compute leasing to direct Token services, which improves gross margins and also enhances the user experience.
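The difference between leasing compute and selling Tokens directly can be made concrete with a rough unit-economics sketch. The Python below is only an illustration: the function names and every input figure (GPU cost per hour, throughput, Token price, utilization) are hypothetical assumptions, not data from GoodVision AI or any provider.

```python
def cost_per_million_tokens(gpu_hour_cost: float, tokens_per_second: float) -> float:
    """Cost of generating one million Tokens on a single GPU."""
    if gpu_hour_cost < 0 or tokens_per_second <= 0:
        raise ValueError("cost must be non-negative and throughput positive")
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hour_cost / tokens_per_hour * 1_000_000


def token_revenue_per_gpu_hour(price_per_million: float,
                               tokens_per_second: float,
                               utilization: float) -> float:
    """Hourly revenue of one GPU when Tokens are sold directly.

    utilization is the fraction of the hour spent serving paid requests.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    tokens_sold = tokens_per_second * 3600 * utilization
    return price_per_million * tokens_sold / 1_000_000


# Hypothetical inputs: a GPU costing 2.00 per hour to run, serving
# 2,500 Tokens per second, with Tokens priced at 0.50 per million.
cost = cost_per_million_tokens(gpu_hour_cost=2.0, tokens_per_second=2500)
revenue = token_revenue_per_gpu_hour(price_per_million=0.5,
                                     tokens_per_second=2500,
                                     utilization=0.6)
print(f"cost per million Tokens: {cost:.4f}")
print(f"Token revenue per GPU-hour: {revenue:.2f}")
```

The sketch shows why throughput and utilization, rather than GPU count alone, drive the margin of a Token service: the same GPU-hour cost is spread over more Tokens as inference efficiency improves.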

Layer 5: Token Distribution — The "Power Grid" of the AI Era

Once the AIDC is built, the next question arises: how can these compute resources be used worldwide?

Thus, compute leasing platforms emerge. They resemble the "power grid" system of the AI era, splitting and distributing the originally centralized GPU resources, then renting them out on demand to developers, enterprises, and AI applications.

AWS, Azure, Google Cloud, Alibaba Cloud, Tencent Cloud remain the most powerful players at this layer. They possess the largest global cloud infrastructure and are gradually integrating AI GPU resources into their IaaS systems.

Meanwhile, a wave of "AI-native clouds" is rising rapidly. Companies like CoreWeave, Nebius, Nscale are building GPU cloud platforms specifically for AI training and inference needs. Compared to traditional cloud providers, they are more flexible, focused on AI tasks, and better at GPU cluster optimization.

CoreWeave is one of the most representative "NeoCloud" companies. Originally focused on Ethereum mining, it has fully transitioned to AI GPU cloud services and is now a key AI infrastructure partner supported by NVIDIA.

Lightweight cloud platforms like DigitalOcean and Vultr target small and medium developers and startups, emphasizing rapid deployment and low-cost GPU services.

In China, besides the giants, companies like UCloud, Kingsoft Cloud, and Capital Online are main suppliers in the GPU cloud and AI compute leasing markets. The competition pattern here is very similar to early power grids: how to efficiently distribute dispersed compute resources.

Layer 6: Token Optimization and Intelligent Scheduling — The Brain of the AI Era

This may be the most underestimated yet most critical "cake" layer. As AI Agent usage explodes, it becomes clear that not all tasks are worth calling the most expensive large models. Many simple tasks can be handled locally; many real-time tasks are better suited for edge inference; some privacy-sensitive tasks cannot even be uploaded to the cloud. Beyond "whether there is compute," a new question arises: "how to use compute more intelligently."

With Token demand growing exponentially, "matching the right model and the right compute to the right task" is the key to using Tokens sensibly and efficiently. Beyond building AI Token factories, this is one of the directions GoodVision AI is working on.

Today’s power systems work the same way: some electricity comes from the main grid and some from rooftop solar, and the truly important part is the "intelligent scheduling system" in the middle.

Future AI will have a similar structure: simple tasks handled locally by small models, complex tasks calling large cloud models, high-privacy tasks processed at the edge, and high-concurrency tasks dynamically scheduled via hybrid cloud.
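The four routing rules above can be sketched as a small dispatcher. This Python example is a minimal illustration under stated assumptions: the `Task` fields, the thresholds, and the order in which the rules are checked (privacy first, then concurrency, then complexity) are hypothetical choices, not a description of any real scheduling product.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    LOCAL_SMALL_MODEL = "local small model"
    EDGE_NODE = "edge node"
    CLOUD_LARGE_MODEL = "cloud large model"
    HYBRID_CLOUD_POOL = "hybrid cloud pool"


@dataclass(frozen=True)
class Task:
    complexity: float         # 0.0 (trivial) to 1.0 (hard); hypothetical score
    privacy_sensitive: bool   # data should not be uploaded to the cloud
    concurrent_requests: int  # requests arriving at the same time


def route(task: Task,
          complexity_threshold: float = 0.5,
          concurrency_threshold: int = 1000) -> Target:
    """Pick where a task should run, following the four rules in the text."""
    # Assumed precedence: privacy overrides every other consideration.
    if task.privacy_sensitive:
        return Target.EDGE_NODE
    # High-concurrency workloads are spread across a hybrid cloud pool.
    if task.concurrent_requests >= concurrency_threshold:
        return Target.HYBRID_CLOUD_POOL
    # Complex tasks call a large cloud model.
    if task.complexity >= complexity_threshold:
        return Target.CLOUD_LARGE_MODEL
    # Everything else is handled locally by a small model.
    return Target.LOCAL_SMALL_MODEL


print(route(Task(complexity=0.2, privacy_sensitive=False, concurrent_requests=1)).value)
print(route(Task(complexity=0.9, privacy_sensitive=False, concurrent_requests=1)).value)
```

A production scheduler would also weigh latency, price, and current load; the point here is only that the decision can be expressed as explicit, ordered rules.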

Besides GoodVision AI, companies like QingCloud, Lambda, OpenRouter, and Fireworks AI are also leaders in Token optimization and intelligent scheduling.

This "cake" layer overlaps heavily with the previous layers—AIDC and compute leasing. As GPU resources, regional nodes, and inference tasks scale up, simply "owning compute" is no longer enough to establish long-term barriers. More and more AIDC operators and GPU cloud platforms realize that the real determinants of efficiency and profitability are not just GPU counts but how to dynamically schedule models, compute, and Token traffic.

Therefore, many platforms originally focused on AIDC and GPU cloud are now extending into "intelligent scheduling." For example, in China, companies like UCloud, Capital Online, and Sugon are trying to combine their GPU cloud facilities, multi-cloud resources, and inference scheduling capabilities, gradually shifting from "selling compute" to "optimizing compute."

Layer 7: AI Agent — The Token Consumption Terminal

This layer is closest to users and has the easiest access to traffic, but it is also the most fiercely competitive. At GTC 2026, Jensen Huang (Huang Renxun) stated: "In the future, every company will become a 'Token producer and consumer.'"

An AI Agent may call multiple models, tools, and APIs simultaneously while continuously reasoning, planning, and executing. This means that future AI Token consumption will far surpass that of today’s human-AI dialogues. Some heavy AI users already run multi-Agent systems with concurrent calls and inter-Agent communication that easily consume 1 billion Tokens a day.
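To see how a multi-Agent system can reach that scale, a back-of-the-envelope calculation helps. In the Python sketch below, the Agent count, calls per Agent, and Tokens per call are hypothetical inputs chosen only so that the product lands on 1 billion; they are not measurements from any real deployment.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_tokens(agents: int, calls_per_agent_per_day: int, tokens_per_call: int) -> int:
    """Total Tokens consumed per day by a fleet of Agents."""
    return agents * calls_per_agent_per_day * tokens_per_call


def average_tokens_per_second(tokens_per_day: int) -> float:
    """Sustained throughput needed to serve a daily Token volume."""
    return tokens_per_day / SECONDS_PER_DAY


# Hypothetical fleet: 50 Agents, each making 4,000 calls a day,
# with 5,000 Tokens of input and output per call.
total = daily_tokens(agents=50, calls_per_agent_per_day=4000, tokens_per_call=5000)
print(f"{total:,} Tokens per day")
print(f"{average_tokens_per_second(total):,.0f} Tokens per second on average")
```

Because the three factors multiply, doubling any one of them doubles total consumption, which is why Agent workloads scale so much faster than human chat sessions.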

The future is not just 1 billion people using AI, but 10 billion, even 100 billion AI Agents working simultaneously, calling each other. The bottleneck will shift from "model capability" to "Token scheduling efficiency."

Tech giants like Microsoft, Google, Meta, and Amazon are embedding AI capabilities into all their products—office suites, search, social networks, and cloud services.

Enterprise software companies like Adobe, Salesforce, ServiceNow, and Palantir are rapidly advancing in enterprise AI Agents and automation workflows. Meanwhile, Hugging Face is becoming the "GitHub" of the AI era—more than just a model community, it’s a crucial infrastructure for the global AI development ecosystem.

In China, companies like iFlytek, Kunlun Wanshi, 360, Kingsoft Office, and SenseTime are actively developing AI assistants, AI office tools, and AI Agents.

Once the "Seven-Layer Cake" is fully formed, the AI world will truly begin.

Today’s AI industry still rests on an infrastructure system that is not yet mature.

Some have the most advanced GPUs but are limited by energy; some have built large AIDC but lack efficient scheduling; some have developed powerful models and Agents but face high inference costs and latency; some control edge nodes but cannot form a unified, coordinated network.

From power, AIDC, and GPUs to LLMs, Token distribution, intelligent scheduling, and AI Agents, the entire AI industry chain is developing rapidly, yet considerable fragmentation, redundancy, and efficiency bottlenecks remain.

Only when this "Seven-Layer Cake" is fully constructed and begins to operate efficiently in coordination will the AI industry truly move from the "Tool Era" into the "Large-Scale Adoption Era" of intelligent infrastructure.

The future AI world will no longer be just a few tech giants training large models but billions of AI Agents continuously online, collaborating, and calling compute and Tokens. Every conversation, inference, tool call, and automated task will be backed by energy, GPUs, networks, scheduling systems, and inference nodes working in harmony.

This also means that the AI industry is evolving from a "software logic" into a super-industrial system covering energy, chips, cloud computing, edge networks, and intelligent scheduling.

Just as the industrial revolution required not only steam engines but also railways, power grids, and ports, the internet revolution needed not only PCs but also fiber optics, data centers, and cloud computing. The hallmark of a mature AI revolution will not be a single blockbuster application but the formation of a global "intelligent infrastructure network" capable of continuously producing, distributing, scheduling, and consuming Tokens.

When this seven-layer infrastructure is finally connected, the competitive logic of the AI industry will be fundamentally reshaped. The most important companies in the future may no longer be those with the largest models but those capable of connecting energy, compute, networks, models, and Token flows.
