According to Beating Monitoring, Zhang Chi, formerly of ByteDance's Seed team, said that ByteDance takes about half a year to train a large model while Google is rumored to need only three months, and that iteration speed is a key factor in the struggle to catch up. His math team at Seed was research-oriented and positioned more for publicity, distinct from the teams that actually deliver models. Internally, Seed has a benchmaxxing culture aimed at high scores: on paper, Chinese models match leading global models, but the user experience falls short. Seed aims to be the best in the world but has not achieved that; at the end of 2024 it believed it was on track to match GPT-4o, then found that a gap remained and shifted to reinforcement learning to narrow it.

BlockBeatNews

2026-04-24 08:51:03


According to Beating Monitoring, Zhang Chi, a former ByteDance Seed team engineer who is now an assistant professor at Peking University, said on the podcast “Into Asia” that ByteDance needs about half a year to complete one large-model training cycle (pre-training plus post-training), while Google is rumored to need only three months. He believes that iteration speed is one of the core reasons Chinese companies are struggling to catch up. Zhang Chi was with ByteDance for about a year; his math team was more research-oriented, and he describes its positioning as “more for publicity,” unlike the pre-training and post-training teams responsible for model delivery.

Zhang Chi described Seed's internal culture as benchmaxxing: performance is evaluated by the benchmarks each team leader is responsible for, so everyone races for scores, “but this can’t translate into a good experience in actual use.” He said that on paper, models from major Chinese companies can match cutting-edge U.S. models, but in practice they “are not good enough.” Seed’s goal is to be world-class, “but unfortunately, I don’t think we have caught up,” and even the goal of being the top domestically “has not been achieved.” At the end of 2024, Seed believed it was on track to catch up with GPT-4o, but then DeepSeek was released. The team realized the gap still existed, and when he joined, the entire team was urgently shifting towards reinforcement learning.


Former ByteDance Seed Engineer: ByteDance's iteration cycle takes half a year, while rumors suggest Google only needs three months
