Letting AI modify its own training code: Recursive breaks three algorithm optimization records
ME AI News: According to Beating Monitoring, AI startup Recursive has announced the first batch of experimental results from its research system. The system automatically generates ideas, writes code, runs experiments, and verifies results, and it beat the best publicly available results on three benchmarks: fixed-budget training, ultra-fast NanoGPT training, and GPU kernel optimization. The experiments show that on tasks with clear goals and rapid feedback, the system can already find optimization opportunities that humans missed.
In the 5-minute NanoChat Autoresearch training benchmark, the system reduced the validation loss, measured in bits per byte (BPB), to 0.9109 and cut the training time needed to reach the same loss by about 23% (a speedup of roughly 1.3x). The key change strengthens short-context memory: bigram and trigram tokens are hashed into a fixed-size embedding table, and the resulting embeddings are mixed into the attention value path through a learnable gate, letting the model use local information directly at very low overhead.
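The hashed n-gram idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Recursive's implementation: the hash function, the table size, the scalar gate, and all function names (`ngram_bucket`, `make_table`, `mix_into_values`) are hypothetical, and a real model would do this with tensors on the GPU.

```python
import math
import random

def ngram_bucket(tokens, table_size):
    """Hash a tuple of token ids into a bucket index (hypothetical polynomial hash)."""
    h = 0
    for t in tokens:
        h = (h * 1000003 + t + 1) % table_size
    return h

def make_table(table_size, dim, seed=0):
    """Create a fixed-size embedding table with small random entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(table_size)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mix_into_values(values, token_ids, table, gate_logit):
    """Add gated bigram/trigram embeddings to each position's value vector.

    gate_logit stands in for a learnable parameter; a sigmoid maps it to (0, 1).
    Only tokens at or before the current position are used, so the mix is causal.
    """
    g = sigmoid(gate_logit)
    out = []
    for pos, v in enumerate(values):
        mixed = list(v)
        for n in (2, 3):
            if pos + 1 >= n:  # enough history for an n-gram ending at pos
                gram = tuple(token_ids[pos - n + 1 : pos + 1])
                emb = table[ngram_bucket(gram, len(table))]
                mixed = [m + g * e for m, e in zip(mixed, emb)]
        out.append(mixed)
    return out
```

Because the table has a fixed size, distinct n-grams can collide in the same bucket; the sketch accepts that trade-off in exchange for a constant memory cost and a single table lookup per n-gram.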
In the NanoGPT Speedrun, which the community has been optimizing for more than two years, the system cut the time to reach the target loss from 79.7 seconds to 77.5 seconds. The optimizations include using FP8 forward computation in the attention path to raise throughput, and rewriting the fused MLP kernel so that it stores only the squared-ReLU activations and recomputes intermediate values during backpropagation, which reduces memory reads and writes.
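One way to see why storing only the squared-ReLU activation can be enough is that its derivative can be recovered from the activation itself: if a = relu(x)^2, then relu(x) = sqrt(a) and da/dx = 2 * relu(x). The scalar sketch below illustrates that identity only; it is an assumption about the general idea, not the fused kernel from the experiment, and the function names are hypothetical.

```python
import math

def sq_relu_forward(pre_acts):
    """Return squared-ReLU activations; only these are kept for the backward pass."""
    return [max(x, 0.0) ** 2 for x in pre_acts]

def sq_relu_backward(saved_acts, grad_out):
    """Recompute relu(x) = sqrt(a) from the saved a = relu(x)^2,
    then apply d/dx relu(x)^2 = 2 * relu(x) to the incoming gradient."""
    return [g * 2.0 * math.sqrt(a) for a, g in zip(saved_acts, grad_out)]

pre = [-1.0, 0.0, 1.5, 3.0]
acts = sq_relu_forward(pre)                    # pre-activations can now be discarded
grads = sq_relu_backward(acts, [1.0] * len(acts))
```

The trade is a small amount of extra arithmetic in the backward pass in exchange for not writing the pre-activations to memory and reading them back.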
On the GPU kernel optimization benchmark SOL-ExecBench, the system raised the average SOL score (a measure of how close a kernel comes to the theoretical limit) on the NVIDIA B200 from 0.699 to 0.754, narrowing the gap to the physical limit by 18%. The generated solutions include folding GRN scaling into the weights of the subsequent linear layer, packing expert routing scores and indices into key-value pairs for intra-warp reduction, and using low-level PTX instructions to pack FP4 values in NVFP4 MoE kernels while keeping intermediate calculations in FP32 to limit error accumulation.
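The source does not give the exact GRN formulation used in the benchmark kernels, so the sketch below only illustrates the general algebraic identity behind folding a per-channel multiplicative scale into the linear layer that follows it: W (s * x) equals (W diag(s)) x. The function names and the example values are hypothetical.

```python
def linear(weights, x):
    """Compute y = W x for a row-major weight matrix and an input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def fold_scale(weights, scale):
    """Return W' with W'[i][j] = W[i][j] * scale[j], so that W' x == W (scale * x)."""
    return [[w * s for w, s in zip(row, scale)] for row in weights]

W = [[1.0, 2.0, 3.0], [0.5, -1.0, 4.0]]
scale = [2.0, 0.5, -1.0]
x = [1.0, 4.0, 2.0]

# Unfused: scale the input first, then apply the linear layer (two passes over the data).
unfused = linear(W, [s * xi for s, xi in zip(scale, x)])

# Fused: fold the scale into the weights once, then apply a single linear layer.
fused = linear(fold_scale(W, scale), x)
```

When the scale is fixed at inference time, the folding is done once ahead of time, so the separate scaling pass disappears from the hot path.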
To keep the AI from exploiting loopholes to inflate its scores, the system uses multi-level correctness auditing to filter out invalid speedups. (Source: BlockBeats)
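The source does not describe how the auditing is implemented. As a rough illustration of the simplest possible level of such a check, the sketch below accepts a candidate's speedup only if its outputs match a trusted reference on every test input; the function name, the tolerance rule, and the result format are all assumptions.

```python
def audit_candidate(reference_fn, candidate_fn, test_inputs, ref_time, cand_time, tol=1e-6):
    """Report a speedup only if the candidate matches the reference on every input."""
    for x in test_inputs:
        expected = reference_fn(x)
        actual = candidate_fn(x)
        if len(expected) != len(actual):
            return {"valid": False, "speedup": None}
        # "not all(... <= ...)" also rejects NaN outputs, since NaN comparisons are False.
        if not all(abs(e - a) <= tol * max(1.0, abs(e)) for e, a in zip(expected, actual)):
            return {"valid": False, "speedup": None}
    return {"valid": True, "speedup": ref_time / cand_time}

def reference(xs):
    return [v * v for v in xs]

def honest(xs):
    return [v ** 2 for v in xs]

def cheat(xs):
    return [0.0 for _ in xs]  # fast but wrong: skips the computation entirely

inputs = [[1.0, 2.0, 3.0], [0.5, -4.0]]
honest_result = audit_candidate(reference, honest, inputs, ref_time=2.0, cand_time=1.0)
cheat_result = audit_candidate(reference, cheat, inputs, ref_time=2.0, cand_time=0.1)
```

A check like this catches only output mismatches on the inputs it is given; a multi-level audit would presumably add further layers on top of it.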