Former ByteDance Seed Engineer: ByteDance's iteration cycle takes half a year, while rumors suggest Google only needs three months

According to Beating Monitoring, Zhang Chi, a former engineer on ByteDance's Seed team and now an assistant professor at Peking University, said on the podcast “Into Asia” that ByteDance takes about half a year to complete one large-model training cycle (pre-training plus post-training), while Google is rumored to need only three months. He believes iteration speed is one of the core reasons Chinese companies are struggling to catch up. Zhang Chi spent about a year at ByteDance. His math team was more research-oriented, and he describes its positioning as “more for publicity,” in contrast to the pre-training and post-training teams responsible for model delivery.

Zhang Chi described Seed's internal culture as benchmaxxing: performance is evaluated against the benchmark each team leader is responsible for, and everyone is racing for scores, “but this can’t translate into a good experience in actual use.” He said that on paper, models from major Chinese companies can match frontier models from the U.S., but in practice they “are not good enough.” Seed’s goal is to be world-class, “but unfortunately, I don’t think we have caught up,” and even the goal of being the best domestically “has not been achieved.” At the end of 2024, Seed believed it had caught up with GPT-4o; then DeepSeek was released and the team realized the gap still existed. By the time he joined, the entire team was urgently shifting toward reinforcement learning.
