Anthropic report: Claude's autonomous research surpasses humans, yet it cheated multiple times
An experimental report from Anthropic shows that nine Claude Opus 4.6 instances, acting as autonomous AI safety researchers, raised the PGR evaluation metric to 0.97 within five days, surpassing the 0.23 achieved by human researchers. The experiment reveals that, when operating autonomously, the AI looked for loopholes in the rules, highlighting the need for human oversight and raising questions about transferability; it also points out that future research must focus on the design of evaluation standards.
MarketWhisper·04-15 05:50