DeepSeek launches Vision Mode for image understanding, with visual CoT reasoning built on its withdrawn visual primitives framework
According to Beating Monitoring, DeepSeek has officially launched Vision Mode (image understanding) in its web and app clients. The mode sits above the chat input box, alongside Quick Mode and Expert Mode. The new capability goes beyond optical character recognition (OCR): it is designed for in-depth scene analysis, spatial logical reasoning, and converting UI screenshots directly into structured HTML code. For difficult geometric derivations or complex chart analysis, the system automatically activates a deep thinking model to provide a complete chain of reasoning.
Vision Mode’s underlying technology is based on the research framework “Thinking with Visual Primitives” published by the DeepSeek team. The paper, co-authored by multimodal researcher Xiaokang Chen with collaborators from Peking University and Tsinghua University, points out that current visual language models suffer from a “Reference Gap” in fine-grained localization and spatial reasoning: vague natural language is poorly suited to describing precise visual coordinates. To address this, the research team treats coordinate points and bounding boxes as minimal units of thought and inserts these spatial primitives directly into the model’s chain of thought (CoT), so that spatial referencing happens alongside the reasoning itself rather than as a separate step.
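To make the idea of spatial primitives inside a reasoning chain more concrete, here is a minimal Python sketch. It is not DeepSeek's implementation: the `<point>` and `<box>` tag syntax, the `Primitive` class, and the `parse_trace` and `box_contains` helpers are all hypothetical, invented only to show how text fragments and coordinates could be interleaved in one ordered trace.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple, Union

# HYPOTHETICAL tag syntax, invented for illustration only. The article does not
# say how DeepSeek actually serializes spatial primitives.
PRIMITIVE = re.compile(r"<(point|box)>([^<]*)</\1>")


@dataclass(frozen=True)
class Primitive:
    kind: str                  # "point" or "box"
    coords: Tuple[float, ...]  # (x, y) for a point, (x1, y1, x2, y2) for a box


Step = Union[str, Primitive]


def parse_trace(trace: str) -> List[Step]:
    """Split a reasoning trace into text fragments and primitives, in order."""
    steps: List[Step] = []
    cursor = 0
    for match in PRIMITIVE.finditer(trace):
        text = trace[cursor:match.start()].strip()
        if text:
            steps.append(text)
        kind = match.group(1)
        coords = tuple(float(v) for v in match.group(2).split(","))
        expected = 2 if kind == "point" else 4
        if len(coords) != expected:
            raise ValueError(f"{kind} needs {expected} values, got {len(coords)}")
        steps.append(Primitive(kind, coords))
        cursor = match.end()
    tail = trace[cursor:].strip()
    if tail:
        steps.append(tail)
    return steps


def box_contains(box: Primitive, point: Primitive) -> bool:
    """Check whether a point primitive lies inside a box primitive."""
    x1, y1, x2, y2 = box.coords
    x, y = point.coords
    return x1 <= x <= x2 and y1 <= y <= y2


# A made-up trace: the coordinates are part of the reasoning, not an afterthought.
trace = (
    "The submit button is at <box>40,300,160,340</box>. "
    "Its label is centered near <point>100,320</point>, "
    "so the label sits inside the button."
)
steps = parse_trace(trace)
```

In this toy trace the coordinates appear at the exact step where the reasoning refers to them, which is the property the framework is described as aiming for; a downstream check such as `box_contains` can then operate on exact numbers instead of a verbal description.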
The foundational paper and open-source project behind the visual capability were briefly released on April 30, but DeepSeek withdrew them without notice on May 1. The withdrawal triggered widespread industry speculation about excessive disclosure of technical details and about the model’s subsequent optimization. The officially launched Vision Mode currently supports only image input; it does not yet support other modalities such as video or audio, and the model has no image generation capability.