NVIDIA has started selling the tools to make shovels.
Author: Ada, Deep Tide TechFlow
San Jose Convention Center, on the ground at GTC.
NVIDIA Chief Scientist Bill Dally sat on stage, facing Google’s Jeff Dean. During their conversation, Dally threw out a number: “Previously, porting a standard cell library containing about 2,500 to 3,000 cells took a team of 8 engineers about 10 months.”
He paused.
“Now, it only takes a single GPU card, running overnight.”
There was no gasp from the audience because everyone who understood this sentence knew what it implied. Eight engineers’ 10 months of work was eaten overnight by a GPU made by their own company. Dally also added: the results match or even surpass human design in terms of area, power consumption, and latency.
The next day, news interpreted this as “NVIDIA uses AI to design GPUs.”
But the truth of this matter is far more intriguing than the headlines suggest.
What is NVIDIA running internally?
NVIDIA isn’t running a black box; it’s several toolchains refined over years.
NVCell is a reinforcement-learning-based program dedicated to the arduous task of migrating standard cell libraries. PrefixRL tackles a long-standing research problem: the design of parallel prefix circuits, the carry-lookahead trees at the heart of adders. Dally said the layouts this system generates are "something humans could never conceive," and key metrics—area, power consumption, latency—improved by about 20% to 30% over human designs.
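To make "parallel prefix circuits" concrete: an adder's carries form a prefix scan over per-bit (generate, propagate) pairs, and the tree topology used to evaluate that scan trades circuit depth against wiring—the space PrefixRL searches. Below is a minimal Python sketch of one classic fixed topology (Kogge-Stone); it is purely illustrative and has no connection to NVIDIA's actual tool.

```python
# Toy illustration of the "prefix circuit" problem behind carry-lookahead
# adders: carries are a parallel prefix scan over (generate, propagate)
# pairs under an associative combine operator.

def gp(a_bit, b_bit):
    """Per-bit (generate, propagate) pair for one column of the adder."""
    return (a_bit & b_bit, a_bit ^ b_bit)

def combine(lower, upper):
    """Associative prefix operator: compose a lower-bit segment with an
    upper-bit segment into one (group-generate, group-propagate) pair."""
    g, p = lower
    g2, p2 = upper
    return (g2 | (p2 & g), p2 & p)

def kogge_stone_add(a, b, n=8):
    """Add two n-bit integers using a Kogge-Stone prefix tree:
    log2(n) combine levels, mirroring the hardware's tree depth."""
    pairs = [gp((a >> i) & 1, (b >> i) & 1) for i in range(n)]
    prefix = pairs[:]            # prefix[i] will cover bits 0..i
    d = 1
    while d < n:                 # one list pass per tree level
        prefix = [combine(prefix[i - d], prefix[i]) if i >= d else prefix[i]
                  for i in range(n)]
        d *= 2
    carries = [0] + [g for g, _ in prefix[:-1]]   # carry into each bit
    bits = [pairs[i][1] ^ carries[i] for i in range(n)]
    return sum(bit << i for i, bit in enumerate(bits))

# The prefix adder agrees with ordinary modular addition.
assert all(kogge_stone_add(a, b) == (a + b) % 256
           for a in range(0, 256, 17) for b in range(0, 256, 13))
```

Textbook topologies such as Sklansky or Brent-Kung evaluate the same scan with different depth and fan-out trade-offs; choosing among (and beyond) such tree structures is the optimization PrefixRL automates.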
There are also two internal large language models, ChipNeMo and BugNeMo. NVIDIA has fed them the RTL code, architecture documents, and design specifications of every GPU from the past twenty years. As Dally put it, this distills NVIDIA's muscle memory from G80 to Blackwell into an internal model, so newcomers can tap twenty years of expertise directly.
So, has “AI designed GPUs” become a reality?
Quite the opposite. Dally’s exact words were: “I really hope one day I can just say ‘Design me a new GPU,’ but we’re still far from that.”
NVIDIA hasn’t used AI to design GPUs. But what it’s doing in another area will make the entire industry unable to operate without it in the future.
Investing $2 billion into the EDA landscape
On December 1, 2025, NVIDIA invested $2 billion in Synopsys, one of the three giants in EDA. The two parties signed a joint development agreement to embed NVIDIA’s accelerated computing stack into Synopsys’s entire EDA workflow, with Blackwell and the next-generation Rubin GPU deeply integrated with Synopsys.ai.
To clarify Synopsys’s position: almost every advanced process chip in the world—Apple’s M series, AMD’s MI series, Google TPU—relies on Synopsys or Cadence tools during design. These two, along with Siemens EDA, monopolize the foundational tools for chip design. You can skip Qualcomm chips or TSMC’s manufacturing, but you can’t bypass these three companies’ software.
Within three months of investing in Synopsys, NVIDIA also brought Cadence, Siemens, and Dassault into its fold, announcing that they are all developing AI-driven chip design tools based on NVIDIA GPUs.
NVIDIA's benchmark data is quite startling: Synopsys PrimeSim runs nearly 30 times faster on Blackwell, Proteus 20 times faster, and Sentaurus on the B200 12 times faster than on CPUs. MediaTek used the H100 to speed up Cadence Spectre by a factor of 6; Astera Labs used Synopsys plus NVIDIA to accelerate chip verification by 3.5 times.
A detail worth highlighting: Cadence’s Millennium M2000 platform, marketed as “specifically built for the EDA market, exclusively based on NVIDIA Blackwell.”
The word “exclusive” is particularly noteworthy. It means that previously, EDA tools could run on CPUs from Intel or AMD. In the future, the fastest EDA tools will only be available on NVIDIA’s hardware.
The true shape of the flywheel
Most people understand NVIDIA’s flywheel as: selling GPUs to AI companies, training large models, proving that GPUs are irreplaceable, leading to more GPU sales.
This flywheel is already formidable. But there’s another layer beneath it.
NVIDIA uses its own tools to design its next-generation GPUs, compounding design-efficiency gains from generation to generation, while tying the entire industry's EDA toolchain to its hardware. Competitors want to catch up, but even the tools needed for the chase must be rented from NVIDIA's ecosystem.
Behind AMD’s disappointing earnings report lies this anxiety. Even though NVIDIA and Synopsys publicly state that “investment does not entail any obligation to purchase NVIDIA hardware,” the market is well aware: the initial release of accelerated EDA features is always on NVIDIA hardware, leaving AMD and Intel reliant on a “tuned for the largest competitor’s platform” pathway.
Imagine AMD engineers wanting to design a chip comparable to Blackwell in the future. If they open Synopsys tools that run fastest on NVIDIA GPUs, they face a choice: endure twice the design cycle or buy a bunch of NVIDIA cards to beat NVIDIA’s chips.
The shovels are still being sold. But the way they’re sold has changed.
The real situation of domestic GPUs
At this point, some sobering numbers are necessary.
While NVIDIA's net profit topped $70 billion in fiscal 2025, China's "Four Little Dragons" of domestic GPUs—Moore Threads, Muoxi (MetaX), Biren, and Suiyuan (Enflame)—were lining up at the IPO window.
Moore Threads' prospectus shows a cumulative net loss of 5 billion yuan from 2022 to 2024, plus a further 271 million yuan lost in the first half of 2025, for an accumulated uncovered deficit of 1.48 billion yuan as of June 30. Management estimates consolidated profitability no earlier than 2027. Muoxi is slightly better off, with total losses exceeding 3 billion yuan over three years. Worst is Biren, which lost over 6.3 billion yuan in three and a half years, with first-half 2025 revenue of just 58.9 million yuan—barely a twelfth of Moore Threads' 702 million yuan over the same period.
Look at R&D intensity: Moore Threads' R&D expenses ran to 2,422.51% of revenue in 2022 and were still 309.88% in 2024—R&D spending of roughly 24 times revenue in 2022, and still more than 3 times revenue in 2024. This isn't a business operating; it's life support, sustained by the primary market and a recently reopened STAR Market (Sci-Tech Innovation Board) listing window.
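For readers who want to verify, the ratios follow directly from the figures quoted above. A quick arithmetic sketch (variable names are mine; figures are as reported in the prospectuses, revenue in millions of yuan):

```python
# Sanity-check the ratios implied by the prospectus figures quoted above.
# All figures are as stated in the text; treat them as reported, not audited.

moore_h1_2025_rev = 702.0    # Moore Threads, H1 2025 revenue (million CNY)
biren_h1_2025_rev = 58.9     # Biren, H1 2025 revenue (million CNY)

# Biren's half-year revenue is roughly a twelfth of Moore Threads'.
revenue_gap = moore_h1_2025_rev / biren_h1_2025_rev     # ~11.9x

# Moore Threads R&D intensity: R&D expense as a multiple of revenue.
rd_over_rev_2022 = 2422.51 / 100                        # ~24.2x revenue
rd_over_rev_2024 = 309.88 / 100                         # ~3.1x revenue

print(round(revenue_gap, 1), round(rd_over_rev_2022, 1), round(rd_over_rev_2024, 1))
```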
Tooling is an even tighter chokepoint. Empyrean's (Huada Jiutian) 2022 IPO prospectus shows its tools only partially support the 5nm advanced process. Primarius can cover the 7nm/5nm/3nm nodes but offers only some of the tools, far from a full flow.
Empyrean founder Liu Weiping put it candidly: "Domestic EDA support for advanced processes still has obvious shortcomings, especially at 7nm, 5nm, and 3nm. At present, domestic EDA can only reach the 14nm level. Although we have mastered 7nm process technology, deep integration of 7nm with actual applications still requires full industry-chain collaboration."
In other words, domestic EDA cannot support the full flow at advanced nodes, and domestic GPU companies still rely on Synopsys and Cadence to design their chips. In 2025, Trump announced export controls on all critical software; although never fully implemented, EDA tools for nodes below 7nm remain strictly regulated. Whether the licenses stay on or get switched off is in someone else's hands.
The market's reaction is already fairly surreal. On its IPO day, Muoxi closed at 829.9 yuan, up 692.95% in a single session. Moore Threads at one point became the third-highest-priced stock in A-shares, behind only Kweichow Moutai and Cambricon; at that price, media estimated its market cap at about 359.5 billion yuan.
The real business behind these numbers is: a group of companies still burning money, losing money, and relying on restricted foreign toolchains to continue chip design, yet being valued in the secondary market as “domestic NVIDIA successors.”
And the tools they use to design chips are becoming part of NVIDIA’s ecosystem. The $2 billion binding between NVIDIA and Synopsys, and Cadence’s “exclusive based on Blackwell” label, turn the pursuit itself into a paradox.
A complete chain from design to manufacturing
Returning to the GTC conversation.
Dally stayed modest throughout. "AI is still far from designing chips on its own"—NVIDIA has been saying this for four or five years, but the phrasing shifts each year. Four years ago it was "AI can assist design"; three years ago, "AI can automate certain steps"; this year, "it can do overnight what took 8 people 10 months." Every year the line advances, and every year a caveat is left behind: "the ultimate goal is still far off." Look back three years and each earlier "still far" has since been achieved, while the new "still far" is drawn at a point no competitor can yet reach.
What NVIDIA has done in the past twelve months is essentially one thing: applying AI to the most valuable, most moat-protected segments of the chip industry chain, then selling these tools layer by layer to the entire industry.
The front end of chip design is covered by internal LLMs like ChipNeMo; mid-flow standard cell library migration and layout optimization are handled by NVCell and PrefixRL; the entire EDA toolchain is tied to NVIDIA GPUs through the $2 billion investment in Synopsys and Cadence's "exclusively based on Blackwell" platform; lithography computation for manufacturing is handled by cuLitho, which TSMC is already using.
From design to manufacturing, NVIDIA has used AI to redo every segment. All these segments ultimately lead to one endpoint: to use the fastest tools, you must buy NVIDIA’s cards.
For any competitor aiming to design a chip capable of surpassing Blackwell, the most awkward reality has already arrived. The fastest version of the EDA tools needed to design that chip runs on NVIDIA GPUs; the fastest lithography algorithms are provided by NVIDIA; the computing power to train AI for design still comes from NVIDIA cards.
The person you want to beat is renting you all the tools needed to beat him. The rent is paid annually, and the contract increases every year.