Without touching the weights, relying only on external tooling, Kimi soars from 50.0% to 79.9%. This approach is too reckless.

MeNews
No weight changes, API only: Poetiq's "plugin" lifts Kimi by 29.9 percentage points, and a lightweight Gemini overtakes Claude Opus
The Meta-System built by Poetiq's six-person team has set a new top score on LiveCodeBench Pro. It is a pure API plugin: it improves itself recursively and extracts experience from tasks, without touching model weights or fine-tuning, and it lifts weaker models substantially. With the plugin, Kimi K2.6 rose from 50.0% to 79.9%, and Gemini 3.0 Flash gained 10 points, enough to surpass Gemini 3.1 Pro, Claude Opus 4.7, and GPT 5.2 High. GPT 5.5 High reached 93.9% with the plugin, and Gemini 3.1 Pro reached 90.9%, surpassing Gemini 3 Deep Think. Enterprises can improve reasoning capabilities without costly fine-tuning.