Anthropic Report Responds to Self-Evolution: Partial Closed-Loop Achieved, but Still Far from Fully Autonomous Training

According to Beating Monitoring, AI's capacity for autonomous iteration is exceeding expectations. On June 5th the Anthropic Institute released a report titled "When AI Builds Itself," detailing its progress on "recursive self-improvement." The data show that by May 2026, more than 80% of the code merged into Anthropic's main codebase was written by Claude itself. Before Claude Code was released in February 2025, Claude-written code accounted for only a single-digit percentage.
Tang Jie, founder of Zhipu AI, predicted on May 13th that the ultimate goal of large models is self-evolution, and that Claude may already have reached a self-training baseline of "writing code, cleaning data, and training itself." In the report, however, Anthropic clarified that recursive self-improvement in the full sense, in which a model autonomously designs and develops its own successor, has not yet been realized.
AI's role in the development chain is shifting from improving efficiency in isolated steps to making decisions autonomously. In Q2 2026, the average number of code merges per engineer per day at Anthropic was eight times the 2024 level. The current development process is simple: engineers set goals and review the results, while Claude handles the actual coding and execution. Anthropic has also deployed Claude as an automated code reviewer, responsible for catching bugs and security vulnerabilities. This indicates that the "self-judgment" pillar Tang Jie described has been implemented on the engineering side, although human review remains the final safety valve.
Models have also become markedly more reliable at carrying out long-horizon tasks on their own. The length of time a model can work autonomously roughly doubles every four months. In March 2024, Claude 3 Opus could only handle simple tasks lasting about 4 minutes. One year later, Claude 3.7 Sonnet could sustain 1.5 hours. By March 2026, Claude 4.6 Opus could handle complex tasks lasting 12 hours.
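As a rough check on the "doubles every four months" figure, the three task durations quoted above can be turned into implied doubling times. The sketch below is illustrative only. It assumes pure exponential growth between data points and takes "one year later" to mean March 2025. The variable and function names are hypothetical and not taken from the report.

```python
import math

# (model, months since March 2024, autonomous task duration in minutes),
# using the figures quoted in the article. "One year later" is assumed
# to mean March 2025, i.e. month 12.
observations = [
    ("Claude 3 Opus", 0, 4),
    ("Claude 3.7 Sonnet", 12, 90),    # 1.5 hours
    ("Claude 4.6 Opus", 24, 720),     # 12 hours
]


def doubling_time_months(start, end):
    """Implied doubling time between two observations, assuming exponential growth."""
    _, month_start, minutes_start = start
    _, month_end, minutes_end = end
    doublings = math.log2(minutes_end / minutes_start)
    return (month_end - month_start) / doublings


first_year = doubling_time_months(observations[0], observations[1])
second_year = doubling_time_months(observations[1], observations[2])
overall = doubling_time_months(observations[0], observations[2])

print(f"March 2024 to March 2025: {first_year:.1f} months per doubling")
print(f"March 2025 to March 2026: {second_year:.1f} months per doubling")
print(f"Whole two-year span:      {overall:.1f} months per doubling")
```

Under these assumptions, the second year works out to exactly three doublings (90 to 720 minutes), which is a four-month doubling time. The first year implies a faster pace of roughly 2.7 months, and the two-year average is about 3.2 months. "Roughly every four months" is therefore a loose, somewhat conservative summary of the quoted figures.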
Data from the evaluation organization METR show that the latest Claude Mythos preview can work autonomously for more than 16 hours, approaching the upper limit of what current evaluation tools can measure. At the current pace, by 2027 AI will be able to autonomously handle research tasks that would take humans weeks, helping companies leap from "one-person companies" to "unmanned companies."
As for Tang Jie's speculation about a "self-training baseline," what the report actually reveals is a partial, miniature experimental closed loop. In experiments on speeding up the training code for a small model, Claude 4 Opus achieved only a 3x speedup in May 2025, while the Claude Mythos preview achieved a 52x speedup in April 2026. By comparison, top human researchers typically achieve about a 4x improvement in 4 to 8 hours of work. The optimization goals and success metrics in these experiments, however, were set by humans in advance. On the more complex end-to-end chain of "data cleaning, synthetic data generation, and self-training," AI's decision-making ability still falls short.
Nonetheless, the increasingly autonomous closed loop in the development chain is pushing humans toward the edge of losing ultimate control over the system. Tang Jie's prediction that an "LLM OS" will replace traditional architectures, with applications generated on demand in real time, means that future computers will run dynamic code that cannot be reviewed in advance. Anthropic's warning that "human review cannot keep up with AI self-evolution" implies that we may not even be able to oversee the source of the generated code. Once AI begins autonomously designing and training its successors, software evolution will become a complete black box. Allowing AI to carry out unaudited self-iteration inside such a black box will make subsequent safety isolation, monitoring, and behavioral alignment of self-improving systems extremely difficult.