Google releases ReasoningBank, letting agents extract reasoning strategies from both successes and failures

ME News, April 22 (UTC+8): According to Beating Monitoring, Google Research has released ReasoningBank, a reasoning memory framework that lets agents driven by large language models keep learning after deployment. The core approach is to distill both successful and failed past tasks into general reasoning strategies stored in a memory bank, so that when the agent meets a similar task it first retrieves the relevant strategies and then acts. The related paper was published at ICLR, and the code has been open-sourced on GitHub.

The two mainstream earlier approaches each had drawbacks: Synapse records complete action trajectories, which are too fine-grained to transfer to new tasks; Agent Workflow Memory extracts workflows only from successful cases. ReasoningBank makes two changes. First, it stores "reasoning patterns" rather than "action sequences," with each memory item structured as three fields: title, description, and content. Second, it also learns from failed trajectories.
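To make the three-field memory item and the retrieve-before-acting step concrete, here is a minimal Python sketch. It is an illustration only, not the released code: the names MemoryItem and retrieve are hypothetical, word overlap stands in for whatever similarity search the framework actually uses, and the second memory item is invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryItem:
    title: str        # short name for the strategy
    description: str  # one-line summary of when it applies
    content: str      # the distilled reasoning itself


def _tokens(text: str) -> set[str]:
    return set(text.lower().split())


def retrieve(bank: list[MemoryItem], task: str, top_k: int = 1) -> list[MemoryItem]:
    """Rank items by Jaccard word overlap with the task (assumed stand-in for real retrieval)."""
    query = _tokens(task)

    def score(item: MemoryItem) -> float:
        words = _tokens(item.title + " " + item.description)
        union = query | words
        return len(query & words) / len(union) if union else 0.0

    return sorted(bank, key=score, reverse=True)[:top_k]


bank = [
    MemoryItem(
        title="Check pagination before loading more",
        description="Listing pages with a Load More button",
        content="First verify the current page indicator to avoid infinite scrolling, then click load more.",
    ),
    # Hypothetical second item, included only so retrieval has something to rank against.
    MemoryItem(
        title="Confirm filters before reading results",
        description="Search pages with filter controls",
        content="Check that the intended filters are active before extracting results.",
    ),
]

hits = retrieve(bank, "find all orders on a listing page with a load more button")
print(hits[0].title)
```

The retrieved content would then be placed in the agent's prompt before it starts the new task.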

A large model is called to judge each execution trajectory, and failed experiences are broken down into pitfall-avoidance rules, for example upgrading "click the Load More button when seen" to "first verify the current page indicator to avoid infinite scrolling, then click load more." The paper also proposes Memory-aware Test-time Scaling (MaTTS), which spends more compute at inference time on repeated attempts and stores what the exploration yields in memory.
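The judge-then-distill step described above can be sketched as follows. This is a toy under stated assumptions: distill, toy_judge, and toy_extract are hypothetical names, and the two toy functions are simple stand-ins for what would be large-model calls in the real framework.

```python
from typing import Callable

Trajectory = list[str]  # one plain-text entry per agent step


def distill(
    task: str,
    trajectory: Trajectory,
    judge: Callable[[str, Trajectory], bool],
    extract: Callable[[str, Trajectory, bool], dict[str, str]],
) -> dict[str, str]:
    """Label the trajectory with the judge, then extract a memory item whether it succeeded or failed."""
    succeeded = judge(task, trajectory)
    item = extract(task, trajectory, succeeded)
    item["source"] = "success" if succeeded else "failure"
    return item


def toy_judge(task: str, trajectory: Trajectory) -> bool:
    # Assumption: a run counts as successful if its last step reports completion.
    return bool(trajectory) and trajectory[-1] == "task complete"


def toy_extract(task: str, trajectory: Trajectory, succeeded: bool) -> dict[str, str]:
    # Assumption: hard-coded outputs replace the model-written summaries.
    if succeeded:
        return {
            "title": "Check the page indicator first",
            "description": f"Worked for: {task}",
            "content": "Checking the page indicator before loading more let the task finish.",
        }
    return {
        "title": "Avoid infinite scrolling",
        "description": "Load More was clicked repeatedly without finishing",
        "content": "First verify the current page indicator to avoid infinite scrolling, then click load more.",
    }


failed_run = ["click load more", "click load more", "click load more"]
item = distill("list all orders", failed_run, toy_judge, toy_extract)
print(item["source"], "-", item["title"])
```

The point of the sketch is that the failure branch still produces a memory item, rather than being discarded.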

Parallel scaling has the agent run several different trajectories for the same task and extract more robust strategies by contrasting them against one another; sequential scaling repeatedly refines a single trajectory and records the intermediate reasoning in memory.
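As a rough illustration of the parallel variant (running several trajectories for the same task and contrasting them), the sketch below keeps only the steps that appear in successful runs and never in failed ones. Everything here is an assumption made for illustration: toy_rollout is a hypothetical deterministic stand-in for an agent run, and the set-difference contrast is far simpler than a model-written comparison.

```python
from collections import Counter


def toy_rollout(task: str, seed: int) -> tuple[list[str], bool]:
    """Hypothetical agent run: odd seeds check the page indicator first, even seeds do not."""
    checks_first = seed % 2 == 1
    steps = ["open listing"]
    if checks_first:
        steps.append("verify page indicator")
    steps.append("click load more")
    # In this toy, checking first is exactly what makes a run succeed.
    return steps, checks_first


def parallel_contrast(task: str, k: int = 5) -> list[str]:
    """Run k rollouts and return steps seen in successes but never in failures."""
    rollouts = [toy_rollout(task, seed) for seed in range(k)]
    in_success = Counter(s for steps, ok in rollouts if ok for s in set(steps))
    in_failure = Counter(s for steps, ok in rollouts if not ok for s in set(steps))
    return sorted(s for s in in_success if s not in in_failure)


print(parallel_contrast("list all orders", k=5))  # ['verify page indicator']
```

With a single rollout there is nothing to contrast against, which is one way to see why spending more inference-time compute on the same task can yield a sharper memory item.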

On the WebArena (browser tasks) and SWE-Bench-Verified (coding tasks) benchmarks, with Gemini 2.5 Flash as the ReAct agent, ReasoningBank beats the memory-free baseline with an 8.3% higher success rate on WebArena and 4.6% higher on SWE-Bench-Verified, while taking about 3 fewer steps per task on average. Adding MaTTS parallel scaling (k=5) raises the WebArena success rate by a further 3 percentage points and cuts another 0.4 steps.

(Source: BlockBeats)
