Google releases ReasoningBank, enabling intelligent agents to extract reasoning strategies from success and failure experiences


CryptoWorld News reports, citing Beating Monitoring, that Google Research has released ReasoningBank, an agent memory framework that lets large-model-driven agents keep learning after deployment. The core idea is to distill both successful and failed task experiences into general reasoning strategies stored in a memory bank; when a similar task comes up later, the agent first retrieves the relevant strategies and then executes. The paper was published at ICLR, and the code has been open-sourced on GitHub.
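The retrieve-then-execute loop described above can be sketched as follows. This is a minimal illustration, not the released code: the `MemoryItem` fields mirror the three-part schema the article describes, but the word-overlap retrieval is a hypothetical stand-in for whatever similarity search the real system uses.

```python
from dataclasses import dataclass

@dataclass
class MemoryItem:
    # Structured three-part schema described in the article.
    title: str
    description: str
    content: str

def retrieve(bank, query, k=2):
    """Rank memory items by naive word overlap with the task query.
    (A real system would likely use embedding similarity; this is a stand-in.)"""
    q = set(query.lower().split())
    scored = sorted(
        bank,
        key=lambda m: len(q & set((m.title + " " + m.description).lower().split())),
        reverse=True,
    )
    return scored[:k]

# Hypothetical memory bank contents for a web-navigation agent.
bank = [
    MemoryItem("Verify page state before pagination",
               "web navigation pagination pitfall",
               "Check the page identifier before clicking load-more to avoid infinite scrolling."),
    MemoryItem("Prefer site search over manual browsing",
               "web navigation search strategy",
               "Use the site's search box when looking for a specific item."),
]

# Before executing a new task, the agent retrieves relevant strategies first.
hits = retrieve(bank, "navigate pagination on a web page", k=1)
```

The retrieved items would then be injected into the agent's prompt before it acts.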

Two earlier mainstream approaches each had drawbacks: Synapse records complete action trajectories, which are too fine-grained to transfer, while Agent Workflow Memory extracts workflows only from successful cases. ReasoningBank makes two changes. First, it stores "reasoning patterns" instead of "action sequences," with each memory item structured as three fields: a title, a description, and content. Second, failed trajectories are also used for learning: after execution, another large model judges the trajectory, and failure experiences are distilled into pitfall-avoidance rules, for example upgrading from "click the Load More button when you see it" to "first verify the current page identifier to avoid getting stuck in infinite scrolling, then click Load More." The paper also proposes Memory-aware Test-time Scaling (MaTTS), which spends more compute at inference time to try a task repeatedly and stores the exploration process in the memory bank.
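The judge-then-distill step can be sketched like this. Everything here is illustrative: the `judge` function is a placeholder for the LLM self-judgment call the article mentions, and the trajectory dictionary fields (`task`, `goal_reached`, `lesson`) are assumptions, not the framework's actual data format.

```python
from dataclasses import dataclass

@dataclass
class MemoryItem:
    title: str
    description: str
    content: str

def judge(trajectory):
    """Stand-in for the LLM self-judgment step: the real framework asks a
    model whether the trajectory achieved the task; here we just read a flag."""
    return trajectory.get("goal_reached", False)

def distill(trajectory):
    """Turn a judged trajectory into a reasoning-pattern memory item.
    Successes become strategies; failures become pitfall-avoidance rules."""
    if judge(trajectory):
        return MemoryItem(
            title=f"Strategy: {trajectory['task']}",
            description="distilled from a successful trajectory",
            content=trajectory["lesson"],
        )
    return MemoryItem(
        title=f"Pitfall: {trajectory['task']}",
        description="distilled from a failed trajectory",
        content="Avoid: " + trajectory["lesson"],
    )

# A failed run still yields a usable memory item, per the article's example.
failed = {"task": "load more results", "goal_reached": False,
          "lesson": "clicking load-more without checking the page id loops forever"}
item = distill(failed)
```

The key design point the article highlights is that failures are not discarded; they produce negative rules that steer future runs away from the same trap.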

MaTTS comes in two variants. Parallel scaling runs multiple different trajectories for the same task and extracts more robust strategies through self-contrast; sequential scaling refines a single trajectory repeatedly, recording the intermediate reasoning into the memory bank. On the WebArena browser benchmark and the SWE-Bench-Verified coding benchmark, with a ReAct agent built on Gemini 2.5 Flash, ReasoningBank achieves an 8.3% higher success rate on WebArena and a 4.6% higher success rate on SWE-Bench-Verified than a memoryless baseline, while taking about 3 fewer steps per task on average. Adding MaTTS parallel scaling (k=5) raises the WebArena success rate by another 3 percentage points and cuts another 0.4 steps.
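The parallel variant can be sketched as a best-of-k loop. This is a toy sketch under stated assumptions: `rollout` is a hypothetical seeded stand-in for an agent trajectory, and majority voting here stands in for the paper's self-contrast step, which would compare trajectories with an LLM and distill all of them into memory.

```python
import random

def rollout(task, seed):
    """Hypothetical single agent trajectory; returns (answer, step_count).
    Seeded so this sketch is reproducible."""
    rng = random.Random(seed)
    steps = rng.randint(3, 10)
    answer = "A" if rng.random() < 0.7 else "B"
    return answer, steps

def matts_parallel(task, k=5):
    """Memory-aware test-time scaling, parallel variant: run k trajectories
    for the same task, then contrast them (majority vote as a stand-in) to
    pick an answer; all k trajectories would feed the memory bank."""
    trajectories = [rollout(task, seed) for seed in range(k)]
    answers = [a for a, _ in trajectories]
    best = max(set(answers), key=answers.count)
    return best, trajectories

answer, trajs = matts_parallel("demo task", k=5)
```

The extra compute buys diversity: strategies that hold across several independent trajectories are more likely to transfer than ones observed only once.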
