🔥 Alibaba PAI Open-Sources AgenticQwen: Dual Data Flywheel Training, 8B Performance Approaching 235B


The Alibaba PAI team has released and open-sourced the AgenticQwen model series (8B and 30B-A3B), designed specifically for industrial tool calling. Trained with the "Dual Data Flywheel" reinforcement learning framework, the 8B version achieves an average score of 47.4 across the TAU-2 and BFCL-V4 benchmarks, close to Qwen3-235B's 52.0, while the 30B-A3B version scores 50.2. The models have been deployed in internal production systems, but a 40K context length limit still constrains deep search tasks.