Stanford's conclusion is quite sobering: opening weights is just the starting point; data barriers are the real moat.

MeNews
Stanford NLP: Most publicly available agent training data still concentrates on the post-training phase
The Stanford NLP team stated on Twitter that most publicly available agent training data is aimed at the fine-tuning stage, typically applied on top of open-weight models such as Qwen, which may already have been trained on large amounts of agent data. The team believes that training an excellent open-source model from scratch requires far more agent data than fine-tuning on open weights alone, which highlights the shortage of agent data for the pretraining phase. Source: InFoQ