Cursor discloses its "bootstrap" training method: the previous Composer sets up environments for the new model, and the Terminal-Bench score rises by nearly 14 points


According to monitoring by Beating, Cursor has revealed a training technique for its Composer series of models: using the previous-generation model to automatically build runnable environments for the reinforcement learning (RL) of the next generation. When training Composer 2, Cursor had Composer 1.5 perform this task, a process it calls autoinstall.

RL training requires a runnable code environment. If the environment is not set up properly, the model wastes tokens fixing environment bugs and cannot learn effectively; in the extreme case where the environment fails completely, all the compute spent training on it is wasted. autoinstall addresses this in two steps. First, an agent reads the codebase's documentation and configuration and proposes 10 verification commands together with their expected outputs. Second, another agent takes 3 of these commands and sets up the environment from scratch until the commands run successfully. The second step is retried up to 5 times; if every attempt fails, the environment is discarded.

During environment setup, the agent actively fills in missing dependencies: mocking database tables, creating MinIO configurations as a substitute for S3, launching Docker containers as sidecar services, and even generating placeholder images. The blog post walks through the entire process using the blockchain project celo-org/celo-monorepo as an example: after the first setup attempt fails, the second attempt creates mock users to bypass authentication and ultimately passes the tests.

Composer 2 scored 61.7% on Terminal-Bench (a benchmark that tests a model's ability to build and work in development environments), nearly 14 percentage points higher than Composer 1.5's 47.9%. Cursor says it plans to involve older Composer models in more stages of training, including data preprocessing, runtime management, and architecture tuning.
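The two-step autoinstall procedure described above can be sketched roughly as follows. This is a minimal illustration, not Cursor's implementation: the names `Check`, `autoinstall`, `propose`, and `setup` are hypothetical, the two agents are passed in as plain callables, and choosing the 3 commands by random sampling is an assumption, since the article only says that 3 of the 10 are taken.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Check:
    """A verification command and the output it is expected to produce."""
    command: str
    expected_output: str


# Hypothetical agent interfaces (assumptions, not a real API):
#   propose(repo) -> candidate checks read from docs and configuration
#   setup(repo, checks, attempt) -> True if a from-scratch setup made all checks pass
ProposeFn = Callable[[str], List[Check]]
SetupFn = Callable[[str, List[Check], int], bool]


def autoinstall(
    repo: str,
    propose: ProposeFn,
    setup: SetupFn,
    n_selected: int = 3,
    max_attempts: int = 5,
    rng: Optional[random.Random] = None,
) -> Optional[int]:
    """Return the attempt number that succeeded, or None if the environment is discarded."""
    rng = rng or random.Random(0)

    # Step 1: the first agent proposes verification commands (10 in the article).
    checks = propose(repo)

    # Step 2: a second agent works against a subset of those commands.
    # Random sampling is an assumption; the selection rule is not described.
    selected = rng.sample(checks, k=min(n_selected, len(checks)))

    # Retry the setup a bounded number of times.
    for attempt in range(1, max_attempts + 1):
        if setup(repo, selected, attempt):
            return attempt

    # Every attempt failed: the environment is discarded.
    return None
```

Capping the retries bounds how much setup compute a single repository can consume, and returning `None` lets the caller drop the environment rather than train on a broken one.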
