First AI programmer index released: Cursor narrowly beats Codex to the top spot with Opus 4.7


CoinWorld News reports that an artificial intelligence analysis platform has released the first comprehensive benchmark index for coding agents. The index combines three test areas (code generation, terminal operations, and technical Q&A) to evaluate the real-world engineering performance of AI programmers. In the first round of evaluation, Cursor CLI paired with the Opus 4.7 model scored 61 points to take the top spot, beating OpenAI's Codex (paired with GPT-5.5) and Anthropic's Claude Code (also paired with Opus 4.7) by one point. Running the same Opus 4.7 model, Cursor CLI edged out the official Claude Code, but at the cost of a longer average task time (7.8 minutes vs. 5.8 minutes) and higher API costs per task ($1.47 vs. $1.24). The most cost-effective option was Cursor's built-in Composer 2, at just $0.07 per task. DeepSeek V4 Pro and Kimi K2.6 followed closely, though these Chinese models took noticeably longer to run.
