DeepMind Launches AI Math Research Assistant: Multi-Agent Framework Surpasses GPT-5.5 Pro and Solves Previously Unsolvable Problems

According to monitoring by Dongcha Beating, Google DeepMind has released "AI co-mathematician," an interactive research platform for mathematicians built on a multi-agent architecture. The system scored 47.9% on FrontierMath Tier 4, currently the most challenging research-level math benchmark, solving 23 of 48 problems and surpassing the previous record of 39.6% set by GPT-5.5 Pro.

Notably, the system does not rely on a next-generation foundation model: it is built on Gemini 3.1 Pro, which on its own scores only 19% on Tier 4. The agent framework more than doubles that figure. DeepMind equipped it with a multi-layer architecture: at the top, a "project coordinator" breaks a research task into multiple workflows, which are distributed to sub-agents responsible for literature retrieval, coding, and reasoning. Any proof the system generates must pass review by multiple "review agents" before it can be submitted. This heavy scaffolding suggests that, in top-tier mathematical reasoning, the capability gains extracted through orchestration can exceed those gained from upgrading the underlying model.

The blind evaluation was run by Epoch AI. To prevent contamination, the DeepMind team never saw the questions, and each problem was allowed to run for up to 48 hours. The system not only topped the leaderboard but also solved three problems that had previously stumped every model.

Although billed as an assistant, it functions more like a creative colleague. Group theory expert Marc Lackenby used it in actual research to resolve an open conjecture from the Kourovka Notebook. Interestingly, the system's initial strategy was flagged as "flawed" by its own review agent, but Lackenby recognized the clever idea hidden within the rejected proposal, filled in the gaps himself, and completed the proof.
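The coordinator/sub-agent/reviewer loop described above can be illustrated with a minimal sketch. DeepMind has not published an API for the system, so every class, function, and agent name below is invented for illustration; the real agents would be LLM calls rather than placeholder functions.

```python
# Toy sketch of the pipeline described in the article: a coordinator splits
# a research task into workflows, sub-agents handle each workflow, and a
# panel of review agents must approve the result before submission.
# All names here are hypothetical, not DeepMind's actual interfaces.

from dataclasses import dataclass, field


@dataclass
class Proposal:
    task: str
    steps: list
    reviews: list = field(default_factory=list)
    approved: bool = False


def coordinator(problem: str) -> list:
    # Top-level "project coordinator": decompose the task into workflows.
    return ["literature_retrieval", "coding", "reasoning"]


def run_subagent(kind: str, problem: str) -> str:
    # In the real system each sub-agent would be an LLM-driven worker;
    # here we just record a placeholder result.
    return f"{kind} output for: {problem}"


def review(proposal: Proposal, n_reviewers: int = 3) -> Proposal:
    # Multiple independent review agents must all sign off.
    proposal.reviews = [f"reviewer-{i}: ok" for i in range(n_reviewers)]
    proposal.approved = all(r.endswith("ok") for r in proposal.reviews)
    return proposal


def solve(problem: str) -> Proposal:
    steps = [run_subagent(k, problem) for k in coordinator(problem)]
    return review(Proposal(task=problem, steps=steps))


result = solve("open conjecture from the Kourovka Notebook")
print(result.approved)
```

The Lackenby anecdote shows why the review gate matters in practice: even a proposal rejected by the reviewers can carry a useful idea, so surfacing rejected drafts to the human user is part of the design.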
Currently, the AI co-mathematician is only available for internal testing by a limited number of mathematicians.
