Kaiming He's ELF team: the language diffusion model finally works
According to BlockBeats, Kaiming He's team at MIT has released ELF (Embedded Language Flows), a language diffusion model. Instead of the GPT-style autoregressive approach of predicting the next token, it generates text entirely within a continuous embedding space and converts back to discrete tokens only at the final step.
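The report stops at that description, but the interface it implies is easy to picture. The sketch below is a minimal illustration under assumed shapes and names (none of them from ELF's actual code, which is not shown in the report): sampling starts from noise in embedding space, stays continuous through every denoising step, and discretizes exactly once at the end.

```python
import torch

# Hypothetical interface only: the report describes the sampling behavior
# but does not publish ELF's code, so every name below is illustrative.
B, L, D, STEPS = 1, 128, 768, 32        # batch, seq length, embed dim, steps

def sample(denoiser, discretize):
    x = torch.randn(B, L, D)            # start from pure noise in embedding space
    for t in reversed(range(STEPS)):
        x = denoiser(x, t)              # every step stays continuous: no token
                                        # lookups, no intermediate softmax
    return discretize(x)                # one conversion to token ids, at the end

# Stand-in components so the sketch runs; a real model replaces both.
toy_denoiser = lambda x, t: 0.9 * x
toy_discretize = lambda x: x.norm(dim=-1).long()   # placeholder "token ids"
print(sample(toy_denoiser, toy_discretize).shape)  # torch.Size([1, 128])
```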
Diffusion models are mature in image generation, but applying them to text has always been awkward: images are naturally continuous signals, while language is made of discrete tokens. Earlier continuous-diffusion text models either repeatedly injected token-level supervision during generation or required a separate, independent decoder. ELF's approach is cleaner: most steps denoise purely in the continuous vector space, and only the final step discretizes, using a shared-weight network.
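The report does not spell out how the weights are shared. One common weight-tying pattern that fits the description, shown here purely as an assumption rather than ELF's actual mechanism, is to reuse the token embedding matrix both to embed tokens into the continuous space and to project denoised vectors back to token logits:

```python
import torch
import torch.nn as nn

class TiedDiscretizer(nn.Module):
    """One plausible reading of 'shared-weight discretization' (hypothetical):
    the same embedding matrix (a) maps tokens into the continuous space and
    (b) maps denoised vectors back to token logits."""
    def __init__(self, vocab_size: int, dim: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)

    def to_continuous(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.embed(token_ids)            # (B, L) -> (B, L, D)

    def to_tokens(self, x: torch.Tensor) -> torch.Tensor:
        logits = x @ self.embed.weight.T        # same weights: (B, L, D) -> (B, L, V)
        return logits.argmax(dim=-1)            # (B, L) token ids

disc = TiedDiscretizer(50_000, 768)
denoised = torch.randn(1, 128, 768)             # output of the last diffusion step
print(disc.to_tokens(denoised).shape)           # torch.Size([1, 128])
```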
The experimental results are also striking. In unconditional generation on OpenWebText, the 105-million-parameter ELF-B reached a generative perplexity (Gen. PPL) of about 24.1 with 32 sampling steps, beating a range of discrete and continuous diffusion language model baselines. More importantly, ELF-B was trained on only about 45 billion tokens, while the comparison methods typically used over 500 billion, roughly an order of magnitude more. At minimum, this suggests the continuous-diffusion route is not blocked by the discreteness of language; the earlier failures are more likely down to the modeling interface and sampling design.
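For context on the headline number: Gen. PPL is conventionally computed by scoring generated samples with a separate pretrained language model. The report does not name the scorer, so the gpt2-large choice below is an assumption, and the helper is only a sketch of the standard recipe, not the paper's evaluation code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: gpt2-large as the external scorer; the report does not say
# which model was used for Gen. PPL.
tok = AutoTokenizer.from_pretrained("gpt2-large")
lm = AutoModelForCausalLM.from_pretrained("gpt2-large").eval()

@torch.no_grad()
def gen_ppl(samples: list[str]) -> float:
    """Token-weighted perplexity of generated text under the scorer LM."""
    nll, n_tokens = 0.0, 0
    for text in samples:
        ids = tok(text, return_tensors="pt").input_ids
        out = lm(ids, labels=ids)               # mean NLL over shifted tokens
        nll += out.loss.item() * (ids.numel() - 1)
        n_tokens += ids.numel() - 1
    return float(torch.exp(torch.tensor(nll / n_tokens)))
```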