Stop blindly piling on computing power! Research shows that large models become more "rigid" the longer they are trained, and adding parameters alone won't fix it.
ME AI news, according to Beating monitoring: the longer an AI model is trained, the more it loses the ability to absorb new knowledge, a phenomenon known as loss of plasticity. In short, the more it trains, the more rigid it becomes. Unless loss of plasticity is overcome, large models cannot learn continually at low cost: every knowledge update means retraining on all historical data together with the new data, which consumes massive computing power.
AI startup Zyphra's latest research is the first to show that increasing model size may delay this degradation, but with diminishing marginal returns: simply stacking parameters cannot cure loss of plasticity. By extrapolation, a 1B-parameter model begins to degrade after training on 1.8 trillion tokens, while a 7B model shows signs after 9 trillion. More striking still, loss of plasticity occurs even without task switching, when the model is simply trained on a stable mixed dataset.
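The diminishing return can be read off the two extrapolated figures quoted above. The short Python sketch below compares the token budget per parameter at the point of degradation and, purely for illustration, fits a power law through the two points. The power-law form and the helper names (`tokens_per_parameter`, `scaling_exponent`) are assumptions made here, not something the study is reported to state, and two data points cannot validate any curve.

```python
import math

# Extrapolated onset of degradation as quoted in the article:
# model label -> (parameter count, training tokens at which degradation appears).
ONSET = {
    "1B": (1e9, 1.8e12),
    "7B": (7e9, 9e12),
}


def tokens_per_parameter(params: float, tokens: float) -> float:
    """Training tokens seen per parameter when degradation sets in."""
    return tokens / params


def scaling_exponent(small, large) -> float:
    """Exponent b in tokens ~ params**b, fitted through exactly two points.

    ASSUMPTION: the power-law form is illustrative only; the article
    gives no functional form for the extrapolation.
    """
    (p1, t1), (p2, t2) = small, large
    return math.log(t2 / t1) / math.log(p2 / p1)


ratio_1b = tokens_per_parameter(*ONSET["1B"])
ratio_7b = tokens_per_parameter(*ONSET["7B"])
exponent = scaling_exponent(ONSET["1B"], ONSET["7B"])

print(f"1B model: {ratio_1b:.0f} tokens per parameter before degradation")
print(f"7B model: {ratio_7b:.0f} tokens per parameter before degradation")
print(f"two-point power-law exponent: {exponent:.2f}")
```

Under these figures, seven times the parameters buys only five times the token budget, so the tokens tolerated per parameter fall as the model grows. An exponent below 1 in the illustrative fit expresses the same thing.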
The study identifies three direct causes of the degradation. First, parameter magnitudes keep growing during training, which under LayerNorm obstructs gradient propagation. Second, large numbers of neurons in the MLP layers go dormant; in some models as many as 95% of neurons are dormant. Third, attention heads break down, either collapsing onto a few tokens or spreading attention evenly across the whole context. Potential remedies include limiting parameter growth, periodically resetting dormant neurons to forcibly reactivate them, and injecting random noise into the attention mechanism to correct the deviation.
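The three symptoms described above can each be tracked with a simple statistic. The following is a minimal pure-Python sketch of such diagnostics, not the study's own method: the function names, the dormancy threshold, and the entropy cut-offs `low` and `high` are all hypothetical choices made for illustration.

```python
import math


def weight_norm(weights):
    """L2 norm of a flat list of weights; a steady rise over training
    is the growth in parameter magnitude described above."""
    return math.sqrt(sum(w * w for w in weights))


def dormant_fraction(activations, threshold=0.0):
    """Fraction of neurons whose mean absolute activation over a batch
    is at or below `threshold` (hypothetical cut-off).

    `activations` is a list of rows, one row per example, holding one
    post-activation value per neuron.
    """
    n_neurons = len(activations[0])
    dormant = 0
    for j in range(n_neurons):
        mean_abs = sum(abs(row[j]) for row in activations) / len(activations)
        if mean_abs <= threshold:
            dormant += 1
    return dormant / n_neurons


def normalized_entropy(attn_row):
    """Entropy of one attention distribution divided by log(n).

    Near 0: all weight on one token. Near 1: weight spread uniformly.
    Requires at least two positions.
    """
    h = -sum(p * math.log(p) for p in attn_row if p > 0)
    return h / math.log(len(attn_row))


def classify_head(attn_row, low=0.1, high=0.95):
    """Label a head from its attention entropy; `low` and `high` are
    illustrative thresholds, not values from the study."""
    e = normalized_entropy(attn_row)
    if e < low:
        return "collapsed"
    if e > high:
        return "uniform"
    return "healthy"


# Toy batch: 2 examples, 4 neurons; neurons 0 and 2 never fire.
acts = [[0.0, 1.2, 0.0, 0.0],
        [0.0, 0.4, 0.0, 0.3]]
print(dormant_fraction(acts))
print(classify_head([1.0, 0.0, 0.0, 0.0]))
print(classify_head([0.25, 0.25, 0.25, 0.25]))
print(classify_head([0.7, 0.2, 0.1, 0.0]))
print(weight_norm([3.0, 4.0]))
```

Logging these three numbers at intervals during training would show whether the weight norm is climbing, the dormant fraction is rising, or heads are drifting toward either entropy extreme. The thresholds would need tuning for a real model.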
(Source: BlockBeats)