When will the U.S. stock market's "Long Live Chips" rally end?


Author: Sun Cheng; Source: Barrons

In April, the U.S. stock market was immersed in an AI-led structural rally: the Nasdaq soared 15.3% in a single month, the S&P 500 rose 10.4%, and the Philadelphia Semiconductor Index posted its largest monthly gain since 2000. In May the script quickly changed, with the three major U.S. indices oscillating at high levels. Beneath the surface calm ran turbulent currents: ARM’s stock price surged 42.58% in one week on strong earnings, and storage giants such as SanDisk and Seagate each gained over 15% in a week, while former AI leaders Nvidia and Microsoft saw their stock prices decline.

Is this the end of the AI mainline rally, or a profound internal shift? Capital is flowing from GPUs and cloud giants toward ARM architecture and storage chips, as the market seems to be trading a new logic of “training compute capacity peaking, inference and storage taking over.” The Dow hit new highs; does this mean style rotation has already begun? Is ARM’s rapid rise the start of a valuation re-rating, or the peak of a short-term bubble?

Key viewpoints:
-----

1. The AI boom is not over, but has entered a stage of finer differentiation: capital continues to flow into AI, but has moved from broad gains to selective bets. Within the theme, funds are shifting from GPUs and cloud giants toward ARM architecture and storage chips, and the trading logic is switching to “training capacity peaking, inference and storage taking over.”

2. The herd effect remains strong, but the targets are changing: quantitative trading, zero-days-to-expiry (0DTE) options, and leveraged ETFs have altered the market ecosystem, and market makers’ hedging amplifies both gains and declines. The herd effect has not faded but has shifted from Nvidia and similar stocks to new hotspots like ARM and SanDisk, leading to “mind-boggling” surges.

3. ARM’s surge is a typical short squeeze and is unsustainable: a gamma squeeze and short covering are compounding each other, causing short-term price spikes. Such moves cannot last forever; once buying momentum is exhausted or sentiment wavers, prices may quickly reverse, so caution is advised.

4. ARM architecture has inherent advantages in edge inference: as a RISC (reduced instruction set) architecture, ARM can run at nearly half the power consumption of x86; its IP licensing model supports customized heterogeneous designs; and its matrix computation optimizations are well suited to Transformer models. A long-term industry inflection point has formed, but short-term stock volatility remains intense.

5. Nvidia’s position in training remains solid, but inference faces challenges: the CUDA ecosystem’s moat is hard to breach, with no substantial challengers in training. In inference, however, AMD and self-developed chips from Google and Microsoft are gaining opportunities, as “the world has long suffered from Nvidia.” Nvidia’s market share will decline but remain dominant.

6. Inference compute load has already surpassed training and will account for 70-80% in the future: inference and training workloads are currently in a ratio of roughly 6:4, but capital expenditure still allocates 60% to training. With the spread of large models, the development of Agentic AI, and slower training iterations, the stringent demands that inference places on cost, power consumption, and latency create new opportunities for ARM and storage chips.

Market review: Is the AI-led celebration the start or a bubble?
--------------------

Sun Cheng: Has the AI-led rally ended? Capital is flowing from GPUs and cloud giants to ARM and storage. What are the core driving forces behind this?

Cat Sister: Since 2023, U.S. stocks have been focusing heavily on AI. The most prominent stock in 2023 was Nvidia, which is a semiconductor company but is driven mainly by AI.
This wave has persisted into 2024 and 2025, with sectors blooming in successive layers. So it is premature to say the AI boom has ended.

Based on our latest monitoring of U.S. stock data last Friday, capital inflow into AI remains undiminished. But there is a notable feature: “differentiation within differentiation.” Since October last year, the overall U.S. market has been in a stagnation phase, moving sideways from October to February. During this period, some software stocks and large tech stocks affected by AI showed signs of exhaustion and corrected. Meanwhile, some hot AI sectors such as storage and optical communications not only did not decline but rose, reflecting market segmentation and focused capital deployment.

From a capital monitoring perspective, therefore, both new and existing investors remain very active. The market continues to seek hotspots, differentiation, new themes, and companies across different sectors within the broader AI picture. The rally has not ended but has entered a more refined phase.

Sun Cheng: The Nasdaq and S&P 500 posted their best monthly performance in nearly six years, and the Philadelphia Semiconductor Index saw its largest single-month gain since 2000. Beyond AI, what specific aspects of recovering market sentiment and herd effects does this extreme move reflect? Since mid-May, the market has not continued its broad-based gains. Does this indicate a shift from “overall optimism” to “selective betting”? Does herd behavior still exist?

Cat Sister: This can be viewed from two angles. First, there is a typical technical pattern in U.S. stocks: after prolonged sideways or declining phases, stocks often rebound, as in the 2020 rally or the nearly half-year surge following the March to October 2023 correction. Since October last year, the U.S. market has been in a stagnation phase, with its focus points unchanged over several months: classic accumulation and consolidation.
The last sharp decline in March was a significant valuation correction, with March 30 marking the bottom. The April surge is therefore essentially a recovery from this sideways period. A key rule in the U.S. market is “stay in the game”: you must remain invested because the rebound can happen in a day or two. Data shows that over the past 50, 30, or 10 years, missing these few days of rebound and surge can lead to a substantial gap in annual performance.

Second, herd effects still exist, but the market now is fundamentally different from before. Quantitative trading, algorithmic trading, retail participation, and massive options and leveraged ETF activity have changed the ecosystem. Market makers hedge their risk by trading the underlying shares, which can amplify upward or downward movements. Especially in top AI stocks, retail investors are trading zero-days-to-expiry (0DTE) options while large institutional players chase the same hotspots. This accelerates both gains and declines, making herd effects in the U.S. market increasingly strong, and “surging beyond common sense” happens more often. So the market is not shifting from optimism to pessimism; rather, the herd’s targets are changing, and capital remains concentrated but on different stocks.

Sun Cheng: ARM’s stock surged over 42% in a week, completing most of that move in just two trading days. Is this technically a “short squeeze”? Is retail sentiment or a gamma squeeze from options a factor? Can such rapid growth be sustained?

Cat Sister: This is indeed a very typical short squeeze, with a gamma squeeze playing a significant role. In today’s U.S. stock market, once a hot stock emerges, institutions, large players, and retail investors all chase it. The widespread use of 0DTE options creates gamma squeeze problems. Market makers, many of whom profit from spreads rather than betting on direction, face large risks: if the stock rises, they must buy the underlying shares to stay delta-hedged.
When most market participants buy 0DTE or near-expiry out-of-the-money call options, market makers are forced to buy large amounts of stock to hedge, pushing prices higher in a feedback loop. Many recent late-day moves in U.S. stocks are caused by this squeeze. When prices are driven up repeatedly, excited investors may further increase their call option positions, which in turn forces market makers to buy more stock, fueling the rise.

There is also short covering: some investors shorted the stock on valuation concerns, but as prices keep rising through these mechanisms, they are forced to cover by buying stock, creating a short squeeze. If this pattern persists and the price curve steepens, it becomes a classic short squeeze.

As for sustainability, technically such a rally cannot last forever. Once all shorts have been forced to cover, or new buyers lose confidence amid the volatility, the market can quickly reverse, especially if profit-taking sets in or sentiment shifts. The same mechanisms that amplified the rise then amplify the decline. So extreme caution is necessary.

Deep Dive into AI Tech: ARM’s Rise, Nvidia Under Pressure, Storage Takes Over
-------------------------

Sun Cheng: ARM surged 42.58% last week, with the core trading logic being “AI shifting from training to edge inference, ARM architecture benefiting significantly.” From a technical perspective, what inherent advantages do ARM’s instruction set and licensing model have over x86 in AI inference? Is this valuation re-rating driven by short-term earnings catalysts or by a long-term industry inflection point?

Wang Huai: From a technical standpoint, the shift toward edge inference in AI compute is quite clear. ARM’s advantages over x86 in edge inference are mainly threefold:

First, instruction set advantage. ARM is RISC (Reduced Instruction Set Computing), while x86 is CISC (Complex Instruction Set Computing).
RISC simplifies pipeline design, allowing more cores, caches, or dedicated accelerators within the same chip area, which benefits near-memory computing. In AI inference, memory communication is a major bottleneck. ARM’s flexible physical IP and customization capabilities make it naturally suited to AI inference. Under typical edge workloads, ARM’s energy efficiency significantly outperforms x86: power consumption can be nearly halved, which is a large cost advantage.

Second, licensing model advantage. ARM’s IP licensing approach allows customers to customize designs, for example by integrating CPU, GPU, and NPU components and optimizing how they cooperate. In contrast, x86 is a “black box,” making heterogeneous, layered optimization difficult. This openness makes ARM more attractive for designs that integrate high-bandwidth, low-power memory such as LPDDR.

Third, matrix computation optimization. Matrix multiplication, heavily used in Transformer models, runs more efficiently on ARM at the edge. x86 can also perform matrix multiplication quickly, but matching that throughput at comparable energy and area efficiency is harder than with ARM’s vector solutions. If compatibility and maturity are the priorities, x86 is a reasonable choice; but if cost-effectiveness and power efficiency are paramount, especially for future large-scale deployments, ARM has clear advantages.

In terms of industry trends, AI is shifting from training to inference, especially toward Agentic AI (agent-based AI). This involves not just large-model conversations but also extensive API, network, and file calls, which rely heavily on CPUs. As a CPU architecture, ARM is well positioned for edge applications (PCs, smartphones, automotive) in Agentic AI. I believe the overall direction is correct, and the industry inflection point is recognized. However, short-term stock volatility will be intense; in the long term, value will settle along these directions.
In the long run the market is a weighing machine; in the short run it is a sentiment-driven voting machine.

Sun Cheng: Nvidia fell 2.58% last week amid concerns of slowing growth. AMD rose 3.60%. In AI chips, is Nvidia’s CUDA ecosystem moat still solid? Can AMD’s MI series GPUs narrow the gap in inference? How big a threat are customers’ self-developed chips to Nvidia?

Wang Huai: Currently, no one can challenge Nvidia’s dominance in training. But as large models and Agentic AI develop, inference compute demand surges, creating opportunities for others. Inference is more complex, requiring heterogeneous architectures in which CPU, GPU, and NPU work together, with high memory demands. This opens the door for AMD and others to gain a share.

AMD will definitely benefit in inference from the sentiment that “the world has long suffered from Nvidia.” Companies need a reliable second supplier to ensure supply chain security and avoid dependence on a single vendor. As long as a viable second supplier emerges, customers will favor it. AMD’s data center business is growing rapidly, but overall revenue growth still lags behind its stock price gains, which reflects both emotional factors and real opportunities in inference.

For Nvidia, sales in inference will be affected but not critically. The CUDA moat cannot be overcome overnight. Meanwhile, Google, Microsoft, Amazon, and others are developing their own inference chips to reduce reliance on Nvidia and to enable customized heterogeneous designs for their specific scenarios. This trend is clear. But the most versatile, most general-purpose chips remain Nvidia’s. In the next two to three years, Nvidia will still hold a major share in inference, though the concentration will be lower than in training. No substantial challengers exist for training in the short term.

Sun Cheng: Market opinion suggests AI investment is shifting from “training compute” to “inference applications.” What is the current global workload ratio of training to inference?
How will this change over the next two years? What new technical challenges does this transition pose for chip design, storage bandwidth, and power consumption?

Wang Huai: I’ve reviewed some reports, and currently the workload ratio of inference to training is roughly 6:4 or 5.5:4.5, with inference already surpassing training. But capital expenditure (Capex) is still skewed the other way: about 60% for training and 40% for inference, because training requires massive, high-cost supernodes.

In the future, inference’s share will increase significantly. First, large models are still underpenetrated globally: 86% of people on Earth have never used large models. Apart from China and the U.S., many countries have minimal large-model applications. Second, Agentic AI will greatly increase API calls and network and file interactions, boosting inference demand. Over the next three to five years, the inference workload could reach 70-80%.

Another reason is that training iteration speed has slowed considerably. Fewer companies can afford large-scale training, leading to a concentration of resources. As competition diminishes, companies prefer to monetize existing models rather than develop new ones rapidly. This reduces overall training compute demand. Unless breakthroughs in embodied intelligence or world models occur at a scale comparable to Nankai, this will remain a variable.

This shift from training to inference demands higher chip design standards. Inference is highly sensitive to cost, power, and latency, and must adapt to diverse scenarios from cloud to edge. This creates opportunities for different architectures (GPU, ASIC, CPU). Memory bandwidth and KV cache requirements are more complex than in training. These technical challenges open markets for architectures like ARM and for storage chips.
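To make the KV cache point concrete, here is a rough sizing sketch. It is not from the interview. It uses the standard estimate for a decoder-only Transformer (one key tensor and one value tensor per layer), and the model shape below is a hypothetical example chosen only for illustration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size, bytes_per_value=2):
    """Approximate KV cache size in bytes for a decoder-only Transformer.

    bytes_per_value=2 assumes 16-bit (FP16/BF16) cache entries.
    """
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Hypothetical model shape (an assumption, not a figure from the article).
size = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128,
                      seq_len=8192, batch_size=1)
print(f"{size / 2**30:.2f} GiB")  # prints "1.00 GiB"
```

Because the cache grows linearly with both context length and the number of concurrent requests, serving many long-context sessions quickly becomes a memory capacity and bandwidth problem, which is the pressure on memory and storage described above.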
