The 2,930-step result does beat the human record, but new algorithms still depend on humans to supply the ideas; autonomous AI research has hit its ceiling.

MeNews
After burning 14k hours of H200 compute, Claude Opus breaks the nanoGPT record
According to BlockBeats, Prime Intellect ran a two-week autonomous AI research experiment in which Codex and Claude Code iterated on their own in the nanoGPT speedrun, competing to reach the target validation loss in the fewest training steps. After roughly 10k experiments and 14k hours of compute, Opus set a new record of 2,930 steps, against the human record of 2,990 steps. The experiment also exposed the limits of AI agents: in branches that required new algorithms, neither agent could propose ideas without leaning on existing human code or papers, and the breakthroughs came from combining and sweeping open-source techniques at scale. Claude often broke its autonomous-operation instructions and stopped on its own during long tasks; Codex could run all day but tended to get stuck in loops, exhaustively searching the same hyperparameter space for long periods. The conclusion: frontier models still need humans to provide the leads for algorithmic innovation.