Altman all but admits to having trained models on purely AI-generated data, and says human data is no longer needed for teaching models mathematics.

According to Beating Monitoring, Altman discussed synthetic data on a podcast hosted by Nicholas Thompson, CEO of The Atlantic. Thompson said that AI-generated content is now everywhere online, and that people are even picking up writing styles from AI, so future models will not be able to avoid consuming AI-generated data. Thompson said, “GPT-4 is the last model that hasn’t really used AI data,” and Altman nodded in agreement.

Thompson then asked directly whether any model has been trained entirely on synthetic data, that is, on the outputs of one AI used to train the next generation of AI. Altman paused and said, “I’m not sure whether I should say.” That remark effectively amounted to a tacit admission. He went on to say that the core of a model is learning to reason, and that this can be done entirely with synthetic data. He used mathematics as an example: can a model that has never seen any human data do math better than humans? “I think it can.” Conversely, can a model that has never been exposed to human culture understand human values? “Probably not.”

Synthetic data has long been shadowed by the “mad cow disease” concern: if AI repeatedly feeds on its own outputs, will information degrade and become corrupted from one generation to the next? According to Altman, teaching AI to do math doesn’t require humans, but teaching AI to understand humans does.
