Does AI make experts dumber? Nature reports: doctors' adenoma detection rate drops 6 percentage points, engineers score 17 points lower on tests

Polish colonoscopy research shows that once AI assistance was introduced, the adenoma detection rate in procedures performed without AI fell from 28.4% to 22.4%. An Anthropic randomized controlled trial of 52 junior engineers found that the AI group averaged 50% on the post-test versus 67% for the manual group, a gap equivalent to roughly two letter grades.

Doctors who had grown used to AI missed more adenomas when the assistance was unavailable. Engineers who learned with AI scored 17 points lower on a post-test. Both figures come from two peer-reviewed studies published in 2026 and summarized by Nature on June 21.

The takeaway is that while AI tools improve short-term efficiency, they are systematically eroding the core abilities of human practitioners. Yuichi Mori, a medical researcher at the University of Oslo, puts it bluntly: "There is currently no established solution to counteract skill degradation; this should be the hottest research topic over the next decade."

After doctors stop using AI, detection rate drops by 6 percentage points

The Polish ACCEPT trial used a strict study group: every participating doctor had completed at least 2,000 colonoscopies, making them experienced specialists rather than trainees. Under the study design, doctors used AI assistance on some clinical days, with the system analyzing intestinal images in real time and automatically flagging suspected adenomas; on other days, AI was unavailable. The results were published in The Lancet Gastroenterology & Hepatology.

Before AI was introduced, this group of doctors had an adenoma detection rate of 28.4%. After AI was introduced, the detection rate in their procedures without AI fell to 22.4%, a drop of 6 percentage points.
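A drop of 6 percentage points is not the same as a drop of 6%. The short sketch below works through the arithmetic using the two rates quoted above. The function name `drop_summary` is purely illustrative, and the relative figure is derived here from the quoted rates rather than taken from the study.

```python
def drop_summary(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = before_pct - after_pct        # difference between the two rates
    relative = absolute / before_pct * 100   # the same drop as a share of the baseline
    return absolute, relative


# Adenoma detection rates quoted in the article: 28.4% before, 22.4% after.
absolute, relative = drop_summary(28.4, 22.4)
print(f"absolute drop: {absolute:.1f} percentage points")
print(f"relative drop: {relative:.1f}% of the baseline rate")
```

Measured against the 28.4% baseline, the fall works out to roughly a fifth of the original detection rate, which is why "6 percentage points" and "6%" should not be used interchangeably.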

The study points out that continuous use of AI tools can cause clinicians "to become less proactive, less focused, and less responsible for outcomes when making cognitive decisions without AI." Robert Wachter, a physician based in San Francisco, was more direct: even highly skilled professionals may gradually lose core skills through reliance on AI tools.

The mechanism is not hard to understand. When AI handles the task of "finding anomalies" over the long term, doctors' attention patterns are retrained; once the scaffolding is removed, a brain accustomed to "waiting for AI to tell me" struggles to switch back to a state of high alert.

Anthropic's own experiment also shows poor results

Anthropic researchers Judy Hanwen Shen and Alex Tamkin published a randomized controlled trial on January 29, 2026. The subjects were 52 junior software engineers, all asked to learn the same Python package that was new to them, Trio. Everyone could search online and consult the official documentation; half were also given an AI assistant.

The AI group scored an average of 50%, while the manual coding group scored 67%, a gap of 17 percentage points, roughly equivalent to two letter grades in academic evaluation. What about time efficiency? The AI group completed tasks about 2 minutes faster on average, a difference that was not statistically significant. In other words, engineers gave up 17 points of understanding for a speed gain of about 2 minutes that was not statistically significant.

The largest gap was in debugging ability. Shen and Tamkin emphasized this danger, noting that catching AI-generated errors remains one of the most critical human supervisory functions. If engineers' debugging skills atrophy after long-term outsourcing to AI, AI's errors become even harder to detect, creating a vicious cycle.

The trial also revealed a detail: engineers who used AI for "concept exploration" scored over 65%, while those who fully outsourced "code generation" to AI scored below 40%. Whether AI is treated as an exploration tool or as a production substitute corresponds to a gap of more than 25 percentage points.

Skill degradation is not a science-fiction warning but an ongoing reality

These two studies ask not "Is AI useful?" but "How much ability do long-term AI users retain when AI is unavailable?" The answer to this question is beginning to appear in quantifiable data.

Academia currently has almost no consensus on the "optimal frequency of AI assistance" or on "how to maintain core skills in an AI environment." Mori's claim that this will be the hottest research topic of the next decade does not seem exaggerated, since skills may degrade faster than research can keep up.
