Disaster: AI traders collectively fail, losing a third of their capital in two weeks. Would retail investors still dare to entrust their money to machines?
Artificial intelligence is knocking on Wall Street's door, but the first report card it has handed in is as ugly as a car crash.
Early data from a series of public trading competitions shows that mainstream large language models generally perform poorly at autonomous trading: most systems lose money, trade at extreme frequencies, and produce completely different decisions from identical instructions.
The most telling case comes from the Alpha Arena competition run by tech startup Nof1. It pitted eight cutting-edge AI systems against one another (Anthropic's Claude, Google's Gemini, OpenAI's ChatGPT, Musk's Grok, and others) across four rounds. Before each round, every model was given $10,000 to trade US tech stocks autonomously for two weeks.
And the result? The overall portfolio lost about one-third. Out of 32 trading outcomes, only 6 were profitable. Nof1 founder Jay Azhang candidly said, “Now, handing money directly to large models to trade on their own just doesn’t work.”
Data reveals multiple flaws of current AI in trading scenarios. Using the same prompt, Alibaba’s Qwen executed 1,418 trades in one round, while the best-performing Grok only made 158 trades. Grok’s best result came during the round when it could observe the performance of its competitors.
The AI blog Flat Circle tracked 11 market-related arenas and found that every arena had at least one profitable model, but in only two of them did the median model post a positive return; most models cannot beat the market.
Decision differences among models are even more headache-inducing. Azhang explained that in the latest Alpha Arena test, Claude leaned toward long positions, Gemini was completely comfortable with shorting, and Qwen liked to take high-leverage bets.
Doug Clinton, head of Intelligent Alpha, which manages LLM-driven funds, said, "They each have their own 'personality'; managing them is almost like managing a human analyst." He added that informing the models of their known biases can improve results to some extent.
Azhang pointed out that large models have advantages in research and tool invocation, but their trade execution is clearly lacking: they cannot properly weigh variables such as analyst ratings, insider transactions, or sentiment shifts, so they tend to buy high and sell low, and they manage positions poorly.
Intelligent Alpha’s benchmark tests offer a relatively positive reference. They provided 10 AI models with financial documents, analyst forecasts, earnings call transcripts, macroeconomic data, and web search capabilities, only judging the direction of profit forecasts. In Q4 2025, ChatGPT’s prediction accuracy reached 68%, setting a record. Clinton said that each new version release generally improves the models’ performance.
There is a fundamental methodological obstacle to evaluating AI trading ability: traditional quantitative strategies rely on backtesting, but backtesting largely fails for large models. An AI asked in 2026 how to trade the March 2020 market has already "seen" the historical outcome in its training data. This "lookahead bias" forces researchers to rely on live trading assessments, which has driven the rapid emergence of such competitions.
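The lookahead problem can be made concrete with a toy backtest. A conventional quantitative backtest guards against it with point-in-time filtering, which keeps future data out of the strategy's inputs. The sketch below (all dates, prices, and names are purely illustrative, not from any real dataset) shows why that safeguard does nothing for a model whose training corpus already contains the outcome.

```python
from datetime import date

# Hypothetical price history: trading day -> closing price of some asset.
prices = {
    date(2020, 3, 2): 100.0,
    date(2020, 3, 9): 80.0,
    date(2020, 3, 16): 60.0,
    date(2020, 3, 23): 55.0,   # the bottom, unknowable on the day itself
    date(2020, 3, 30): 70.0,
}

def visible_history(as_of):
    """Return only the data a strategy could have seen on `as_of`.

    This point-in-time filter is the standard defense against lookahead
    bias in a backtest. But a model pretrained on post-2020 text has
    effectively already read the full `prices` series, so restricting
    its *inputs* does not remove what its weights memorized.
    """
    return {d: p for d, p in prices.items() if d <= as_of}

# A traditional backtest can enforce point-in-time data at each step:
seen = visible_history(date(2020, 3, 16))
assert date(2020, 3, 23) not in seen  # the future is hidden from the inputs

# For an LLM, however, "knowledge of March 2020" lives in its weights,
# not in the prompt, so no input filter can recreate genuine uncertainty.
# Hence the shift to live, forward-only evaluations like Alpha Arena.
```

This is why the competitions described above run in real time: forward-only evaluation is the only setup in which the model's training data cannot contain the answer.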
Jim Moran, author of the Flat Circle blog and co-founder of alternative data firm YipitData, believes that most current public experiments are too short-term and noisy to support definitive conclusions. These arenas also have inherent disadvantages, such as lack of proprietary stock research resources and low execution quality. He said, “If you transplant one of these AI agents from the arena directly into a top hedge fund, its performance should be better.”
Alexander Izydorczyk, former head of data science at Coatue Management and now at NX1 Capital, recently wrote that among the AI trading bots he tracks, none show sustained alpha-generating ability. He believes the limitations of these arenas lie in the lack of practical quantitative techniques used by secretive trading institutions in their training data.
But he left a thought-provoking judgment: “Beginners sometimes see things that veterans miss.” He wrote on his personal blog, “When large model trading strategies really start to work, you won’t hear about it right away.”
Nof1 is preparing for Season 2 of Alpha Arena, planning to give each AI model web search capabilities, longer thinking time, more data sources, and multi-step execution abilities. But the company’s core business model is providing retail traders with system tools to build AI trading agents—not directly deploying AI into trading seats.
This positioning itself may already be the most pragmatic footnote to the current AI trading capabilities.