MiniMax open-sources Blackwell-exclusive attention library; M3 weights expected this Friday
ME AI News: According to Beating monitoring, Ryan Lee, MiniMax's head of developer relations, announced that MiniMax Sparse Attention (MSA), a high-performance attention library for NVIDIA Blackwell (SM100) GPUs, has been open-sourced under the MIT license. Lee also said that the MiniMax-M3 weights are expected to be released this Friday.
MSA is used for million-token context inference in MiniMax-M3: it selects the most relevant KV blocks within each GQA group and computes attention only over the selected blocks. According to the paper, at a context length of 1 million tokens, MSA reduces attention computation by 28.4 times compared with dense GQA in the same configuration, and achieves a 14.2 times prefill speedup and a 7.6 times decoding speedup on H800 GPUs.
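To make the block-selection idea concrete, here is a minimal pure-Python sketch of top-k block-sparse attention for a single GQA group (several query heads sharing one set of keys and values). It is an illustration under stated assumptions, not MiniMax's implementation: the function names `select_blocks` and `sparse_group_attention` are hypothetical, and the scoring rule (mean-pooled block keys, summed over the group's query heads) is an assumption, since the article does not describe how MSA ranks blocks.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def select_blocks(queries, keys, block_size, top_k):
    """Pick the top_k KV blocks for one GQA group (hypothetical scoring rule)."""
    n_blocks = (len(keys) + block_size - 1) // block_size
    dim = len(keys[0])
    scores = []
    for b in range(n_blocks):
        block = keys[b * block_size:(b + 1) * block_size]
        # Assumption: summarize a block by the mean of its keys.
        mean_key = [sum(k[d] for k in block) / len(block) for d in range(dim)]
        # Assumption: one shared score per group, summed over its query heads.
        scores.append(sum(dot(q, mean_key) for q in queries))
    ranked = sorted(range(n_blocks), key=lambda b: scores[b], reverse=True)
    return sorted(ranked[:top_k])


def sparse_group_attention(queries, keys, values, block_size, top_k):
    """Softmax attention restricted to the tokens of the selected blocks."""
    blocks = select_blocks(queries, keys, block_size, top_k)
    token_ids = [
        t
        for b in blocks
        for t in range(b * block_size, min((b + 1) * block_size, len(keys)))
    ]
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim_v = len(values[0])
    outputs = []
    for q in queries:
        logits = [dot(q, keys[t]) * scale for t in token_ids]
        peak = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(x - peak) for x in logits]
        total = sum(weights)
        outputs.append([
            sum(w * values[t][d] for w, t in zip(weights, token_ids)) / total
            for d in range(dim_v)
        ])
    return outputs, blocks
```

In this sketch every query head in the group attends to the same selected blocks, so the per-step cost scales with `top_k * block_size` rather than with the full context length. Setting `top_k` to the total number of blocks recovers ordinary dense attention.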
The open-source release bundles two implementations, C++ JIT and CuTe-DSL, in a single Python package and provides both Dense FlashAttention and Sparse Top-k Attention kernels, with support for multiple precision formats, including BF16, FP8, NVFP4, and FP4. For now, it is deployed mainly on NVIDIA Blackwell (SM100) GPUs.
(Source: BlockBeats)