Xiaomi discloses training details for its 1T MiMo-V2-Pro model: thousands of GPUs used, no ranks or grades, no deadlines
ME News, April 24 (UTC+8). According to monitoring by Dongcha Beating, Luo Fuli, head of Xiaomi's large-model team, disclosed in her first in-depth interview that the MiMo-V2-Pro base model has 1T total parameters and was trained on thousands of GPUs. She believes the 1T scale is currently the baseline for approaching the level of Claude Opus 4.6 and for earning an entry ticket to the next phase of Agent competition.
On the technical side, the Pro version pushes its hybrid of global attention and sliding-window attention to an extremely sparse 7:1 ratio, which keeps long-text inference costs in check even as the parameter count grows. It also continues to use the MTP (Multi-Token Prediction) architecture, putting surplus compute to work to accelerate inference.
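To make the cost argument concrete, here is a rough sketch of how a hybrid attention stack shrinks the key/value cache for long inputs. It assumes the 7:1 ratio means seven sliding-window layers for every global-attention layer, which the report does not spell out. The layer count, window size, and context length are hypothetical values chosen for illustration, not MiMo-V2-Pro's actual configuration.

```python
def layer_pattern(num_layers: int, swa_per_global: int = 7) -> list[str]:
    """Lay out layers as repeating blocks: `swa_per_global` sliding-window
    layers followed by one global-attention layer (assumed ordering)."""
    block = swa_per_global + 1
    return ["global" if (i + 1) % block == 0 else "swa" for i in range(num_layers)]


def cached_positions(pattern: list[str], context_len: int, window: int) -> int:
    """Count token positions held in the KV cache, summed over all layers.

    A global layer keeps every position in the context; a sliding-window
    layer keeps at most the last `window` positions.
    """
    total = 0
    for kind in pattern:
        total += context_len if kind == "global" else min(context_len, window)
    return total


# Hypothetical numbers: 16 layers, a 100,000-token context, a 1,000-token window.
hybrid = layer_pattern(16)
dense = ["global"] * 16
hybrid_cost = cached_positions(hybrid, 100_000, 1_000)
dense_cost = cached_positions(dense, 100_000, 1_000)
print(hybrid_cost, dense_cost, hybrid_cost / dense_cost)
```

Under these made-up numbers, the hybrid stack caches only a small fraction of the positions that an all-global stack would, and the gap widens as the context grows, since only the global layers scale with context length.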
On the management side, only 30 to 40 members of the 100-person MiMo team take part directly in core iterations. The team has no job ranks, no clear team divisions, and no delivery deadlines. When it runs into numerical instability such as training loss spikes, the team halts training outright to investigate, even if that means a shutdown of one to two weeks and millions in compute costs.
(Source: BlockBeats)