Anthropic Research: Domain expertise, more than coding ability, determines Claude Code performance
Anthropic analyzed approximately 400k Claude Code sessions from about 235k users and found that what decides whether AI-assisted coding succeeds or fails is not whether the user can write code, but how deeply the user understands the problem domain.
In its latest research report, Anthropic analyzed a sample of about 235k users and found that what truly determines the AI's effectiveness is how well the person giving the instructions understands the problem being solved.
How an accountant can become a "specialist" in Claude's eyes
This study by Anthropic covers roughly 400k Claude Code sessions from October 2025 to April 2026.
The report establishes a five-level, task-specific expertise scale running from novice to expert. The key is how this "expertise" is defined, which differs from common assumptions. Simply put, it measures how well you understand the problem you are trying to solve, not how good you are at coding.
The example given is straightforward: a senior engineer writing Rust for the first time is considered a novice for that task; conversely, an accountant who has never used Python, but can precisely tell Claude the accounting rules that must be satisfied and identify logical errors at month-end closing, is an expert in that task.
The numbers show how large the gap is. A novice session triggers about 5 Claude actions per prompt on average and produces around 600 words; an expert session triggers about 12 actions and produces roughly 3,200 words. That is more than double the actions and over five times the output.
Anthropic's regression analysis shows that each one-level increase in expertise is associated with approximately 9% more Claude actions and about 13% more output. The relationship remains significant even after controlling for work type, task value, month, profession, and model version.
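To see how the per-level regression figures relate to the raw novice-versus-expert averages, a minimal Python sketch can compound the roughly 9% and 13% per-level effects over the four steps of the five-level scale. It assumes the effect is multiplicative and identical at every step, which the article does not state; all function and variable names are illustrative only.

```python
# Figures quoted in the article (averages per prompt).
NOVICE_ACTIONS, EXPERT_ACTIONS = 5, 12
NOVICE_WORDS, EXPERT_WORDS = 600, 3200

# Per-level effects reported from the regression (approximate).
PER_LEVEL_ACTIONS = 0.09
PER_LEVEL_OUTPUT = 0.13
LEVELS = 5  # five-level scale, so four steps from novice to expert


def compounded_ratio(per_level_increase, levels=LEVELS):
    """Expert-to-novice ratio implied by a per-level percentage effect.

    Assumption (not from the article): the effect compounds
    multiplicatively and is the same at every step of the scale.
    """
    return (1 + per_level_increase) ** (levels - 1)


# Raw ratios taken directly from the quoted averages.
raw_actions_ratio = EXPERT_ACTIONS / NOVICE_ACTIONS
raw_output_ratio = EXPERT_WORDS / NOVICE_WORDS

print(f"raw actions ratio:       {raw_actions_ratio:.2f}x")
print(f"raw output ratio:        {raw_output_ratio:.2f}x")
print(f"compounded actions:      {compounded_ratio(PER_LEVEL_ACTIONS):.2f}x")
print(f"compounded output:       {compounded_ratio(PER_LEVEL_OUTPUT):.2f}x")
```

Under that assumption, the controlled effect compounds to about 1.4x for actions and 1.6x for output, well below the raw 2.4x and 5.3x gaps. That would be consistent with the control variables absorbing part of the raw difference, though the article does not break this down.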
After mistakes, who can steer the agent back on track
The success-rate figures make the point clearer still. Anthropic defined two success criteria: "judgment success" (a classifier reads the conversation and determines whether the goal was met) and "verification success" (which requires hard, verifiable evidence such as passing tests, git commits, or explicit user confirmation).
Overall, the higher the user's expertise, the more likely the session is to succeed. Most of the improvement is concentrated at the lower end of the scale: the gap from novice to intermediate is larger than the gap from intermediate to expert. Anthropic found that the verification success rate in expert-level sessions is more than twice that of novice sessions.
Even more interesting is the "post-error recovery rate." Anthropic tracked sessions that ran into trouble, meaning conversations that contained failure signals. In these sessions, the verification success rate rose from 4% for novices to 15% for experts, and the share achieving at least partial success was 60% for novices versus 80-81% for intermediate through expert users.
The gap in abandonment rates is also significant. When a session hits difficulties, novices abandon it outright 19% of the time (the session is judged a failure and produces zero code), compared with only 5-7% at the other levels. Anthropic's interpretation is that domain expertise is valuable because it lets the user guide the agent back on track when it goes astray.
This finding points to a counterintuitive conclusion: "understanding the problem" matters more than "knowing the tools." Understanding the problem lets you spot errors when Claude gives an incorrect answer, specify boundary conditions precisely, and immediately correct odd decisions made by the agent.
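The recovery and abandonment figures can be read either as a relative ratio or as a percentage-point gap, and the two give quite different impressions. The Python sketch below computes both from the percentages quoted above; the helper names are illustrative, and the 5-7% range is treated as two endpoints because the article gives no per-level breakdown.

```python
def relative_ratio(low_pct, high_pct):
    """How many times larger high_pct is than low_pct."""
    return high_pct / low_pct


def point_gap(low_pct, high_pct):
    """Difference in percentage points."""
    return high_pct - low_pct


# Verification success in sessions with failure signals:
# 4% for novices, 15% for experts (figures from the article).
recovery_ratio = relative_ratio(4, 15)
recovery_gap = point_gap(4, 15)

# Immediate abandonment: 19% for novices, 5-7% for other levels.
# The range is evaluated at both endpoints.
abandon_ratio_range = (relative_ratio(7, 19), relative_ratio(5, 19))

print(f"recovery: {recovery_ratio:.2f}x, {recovery_gap} points")
print(
    "abandonment: novices quit "
    f"{abandon_ratio_range[0]:.1f}x to {abandon_ratio_range[1]:.1f}x as often"
)
```

Read this way, experts verifiably recover from trouble a little under four times as often as novices, even though the absolute gap is only 11 percentage points.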
Managers outperform software engineers; occupational differences nearly disappear
Anthropic’s data challenges another expectation: professional background is not as important as one might think.
The overall success rate for software-related professions is about 30%, versus roughly 26% for other professions. Looking only at sessions that produce actual code, the gap widens slightly to 34% vs. 29%. But if the success criterion is relaxed to "at least partial success," the two groups are nearly equal: 89% vs. 88%.
More notably, each of the top ten professions falls within 7 percentage points of the software engineers' verification success rate. Management roles even slightly outperform software engineers. Anthropic speculates that managers' habit of delegating tasks and setting specifications translates well into directing the agent.
Work patterns also shifted rapidly over the seven months. The share of bug-fixing sessions fell from 33% to 19%, nearly halving; operations work such as deployment, configuration, and pipeline execution rose from 14% to 21%; and writing and data analysis roughly doubled, from 10% to 20%.
In other words, users are applying Claude Code to more tasks on the periphery of programming, not just to coding itself.
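Because these shifts are quoted as shares of sessions, each can be expressed both in percentage points and relative to its starting share. A small Python sketch, using only the figures quoted above (the function and dictionary names are illustrative), makes the two readings explicit:

```python
def relative_change(start_pct, end_pct):
    """Percent change relative to the starting share."""
    return (end_pct - start_pct) / start_pct * 100


# (start share %, end share %) over the seven-month window, per the article.
shifts = {
    "bug fixing": (33, 19),
    "operations": (14, 21),
    "writing and data analysis": (10, 20),
}

# Map each category to (percentage-point change, relative change in %).
changes = {
    name: (end - start, relative_change(start, end))
    for name, (start, end) in shifts.items()
}

for name, (points, rel) in changes.items():
    print(f"{name}: {points:+d} points, {rel:+.1f}% relative")
```

The relative view shows why a 14-point drop in bug fixing is described as "nearly halving" (about a 42% decline), while a 10-point gain in writing and data analysis is a full doubling.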
The economic value of tasks has risen in tandem. Anthropic estimates the market value of each session from freelance project rates and finds an average increase of about 27% over the seven months, with build tasks up about 43%, operations tasks up about 34%, and repair tasks up about 32%.
A basic to intermediate understanding of a domain is enough to capture most of the benefit; between intermediate and expert, the success-rate curve flattens markedly.
As AI tools continue to expand, what they amplify is not coding skill but the depth of your understanding of the problem. Those who do not understand what they are trying to solve will only get lost faster, even with more powerful models.