Zyphra releases the first diffusion language model in the AMD ecosystem, achieving a maximum speedup of 7.7 times
AIMPACT News, May 15 (UTC+8): According to Beating Monitoring, Zyphra has released ZAYA1-8B-Diffusion-Preview, a mixture-of-experts (MoE) diffusion model converted from an autoregressive large language model. Although the official announcement bills it as the "first" model to make this architectural conversion, teams such as SDAR and LLaDA 2.0 had already pioneered the approach at the end of last year. What actually sets ZAYA1 apart is that it is the first diffusion language model trained within the AMD hardware ecosystem.
Setting aside the marketing rhetoric, the model still demonstrates the engineering efficiency gains of the diffusion architecture. Traditional autoregressive models generate tokens serially, one at a time, and the ever-growing KV cache pushes generation speed up against physical limits. As the industry trend highlighted by the He Kaiming team's recent pure diffusion model ELF suggests, parallel denoising is the key to breaking this bottleneck.
ZAYA1 adopts the TiDAR scheme, which avoids pretraining from scratch and denoises 16 candidate tokens simultaneously in a single forward pass, turning the VRAM bandwidth bottleneck entirely into a compute bottleneck.
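To make the bandwidth-versus-compute point concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified dense model in which every weight is read from memory once per forward pass and contributes about 2 FLOPs per token processed; it ignores KV-cache traffic, activations, and MoE routing. The function name and the 2-byte (bf16) weight size are illustrative assumptions, not figures from Zyphra.

```python
def arithmetic_intensity(tokens_per_pass: int, bytes_per_param: float = 2.0) -> float:
    """Approximate FLOPs performed per byte of weights read in one forward pass.

    Assumption: each weight is loaded once per pass and used in one
    multiply-add (2 FLOPs) for every token processed in that pass.
    """
    flops_per_param = 2.0 * tokens_per_pass
    return flops_per_param / bytes_per_param


# One token per pass (autoregressive decoding) vs. 16 candidates per pass.
serial = arithmetic_intensity(1)
parallel = arithmetic_intensity(16)
print(f"serial: {serial} FLOPs/byte, parallel: {parallel} FLOPs/byte")
```

Under these assumptions, the same weight traffic supports 16 times as much arithmetic, which is why processing more candidates per pass moves the workload away from the memory-bandwidth limit and toward the compute limit.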
Practical tests show that, combined with ZAYA1's dedicated CCA attention mechanism, a standard lossless sampler achieves a 4.6x decoding speedup without compromising generation quality. Switching to a hybrid logit sampler raises the speedup to 7.7x, substantially reducing costs for time-consuming, large-scale inference workloads. (Source: BlockBeats)
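The article does not describe how the lossless sampler works internally. As a hedged illustration of the general idea behind lossless draft-and-verify decoding, the sketch below uses generic greedy verification: tokens drafted in parallel are kept only while they match what a verifier would have produced, so the output is the same as decoding one token at a time. The function name `accept_prefix` and its inputs are hypothetical and are not Zyphra's or TiDAR's actual API.

```python
from typing import List


def accept_prefix(draft: List[int], verified: List[int]) -> List[int]:
    """Keep draft tokens until the first disagreement with the verifier.

    Assumption: verified[i] is the token the verifier would choose at
    position i given the draft tokens before it. At the first mismatch,
    the verifier's token is taken and the rest of the draft is discarded.
    """
    accepted: List[int] = []
    for drafted, checked in zip(draft, verified):
        if drafted != checked:
            accepted.append(checked)  # fall back to the verifier's choice
            break
        accepted.append(drafted)
    return accepted


# Two drafted tokens match; the third is corrected and the fourth is dropped.
print(accept_prefix([5, 7, 9, 2], [5, 7, 1, 2]))
```

In a scheme like this, the speedup depends on how many tokens are accepted per forward pass on average, which is consistent with a strict lossless sampler yielding a smaller gain than a more permissive one.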