Luo Fuli: Large models enter the post-training era, with top teams' pre-training-to-post-training compute ratio reaching 1:1

According to Beating Monitoring, Luo Fuli, head of Xiaomi's large-model team, said that competition among large models has shifted from the pre-training-dominated Chat era to the post-training-dominated Agent era. The key question now is "how to effectively scale reinforcement learning (RL) on Agents."

This paradigm shift is directly reshaping how compute is allocated. Luo Fuli revealed that in the Chat era, the ratio of compute devoted to research, pre-training, and post-training was roughly 3:5:1; in the current Agent era, a reasonable allocation is 3:1:1, meaning top teams now invest as much compute in post-training as in pre-training.
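To make the two ratios concrete, the split can be computed for any fixed compute budget (the 1,000,000 GPU-hour figure below is purely hypothetical, chosen for illustration; the article gives only the ratios):

```python
def split_budget(total_gpu_hours, ratio):
    """Split a compute budget across (research, pre-training, post-training)."""
    total_parts = sum(ratio)
    return tuple(total_gpu_hours * part / total_parts for part in ratio)

# Hypothetical 1,000,000 GPU-hour budget
chat_era = split_budget(1_000_000, (3, 5, 1))   # Chat era: pre-training dominates
agent_era = split_budget(1_000_000, (3, 1, 1))  # Agent era: pre- and post-training equal
```

Under 3:5:1, post-training gets roughly a ninth of the budget; under 3:1:1, pre-training and post-training each receive an equal fifth, which is the 1:1 parity Luo Fuli describes.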

At the same time, the requirements on system architecture have changed significantly. RL infrastructure was previously built around the model inference engine and handled pure-text computation; it must now be built around the Agent, supporting heterogeneous cluster scheduling and tolerating the interruptions that uncontrollable external factors can cause during complex agent workflows.
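The article does not describe a concrete implementation of this tolerance. As a loose illustration only, an agent-centric collector might retry interrupted episodes and keep partial trajectories rather than discard them; every name and the failure model below are hypothetical:

```python
import random

def run_agent_episode(seed, max_steps=8, fail_rate=0.3):
    """Simulate one agent rollout; external steps (tool calls, sandboxes) may fail."""
    rng = random.Random(seed)
    trajectory = []
    for step in range(max_steps):
        if rng.random() < fail_rate:
            return trajectory, False  # interrupted mid-workflow (timeout, crash, ...)
        trajectory.append(f"step-{step}")
    return trajectory, True

def collect_rollouts(num_episodes, max_retries=2):
    """Retry interrupted episodes; keep surviving partial trajectories for RL."""
    complete, partial = [], []
    for i in range(num_episodes):
        for attempt in range(max_retries + 1):
            traj, finished = run_agent_episode(seed=i * 10 + attempt)
            if finished:
                complete.append(traj)
                break
        else:
            partial.append(traj)  # tolerate the interruption instead of failing the run
    return complete, partial
```

The design point this sketches is the shift the article describes: the scheduling unit is the agent episode, not an inference request, so the system must absorb mid-trajectory failures as a normal case.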
