Grok 4.2 just hit 60% on the ARC-AGI-2 benchmark. That's a solid result, and it looks like a new state-of-the-art moment for AI capabilities. Progress on these standardized benchmarks keeps pushing the boundaries of what these models can handle.

LiquidationHunter · 1h ago
60%? That's just the beginning, still have to keep pushing forward.
SnapshotLaborer · 10h ago
60% huh, this number looks pretty good but not that outrageous... Anyway, these benchmarks don't really mean much; what's important is how it performs in actual use.
ForkInTheRoad · 10h ago
60%? Not as explosive as I'd imagined... I thought it could break 70.
MEV_Whisperer · 10h ago
NGL, the ARC benchmark record has been broken again, but does this 60% really mean anything? It feels like these rankings are still worlds apart from actual applications...
NeonCollector · 10h ago
60%? How much of that benchmark is just fluff... True AGI is still a long way off.