pmarca shares information about a 3.3 billion-parameter model trained on historical text


ME News Report, April 3rd (UTC+8) — Well-known figure pmarca recently shared information on social media about a model pretrained on historical text. According to his post, the pretraining corpus consists of American and British books and newspapers published before January 1, 1900, sourced from Huggingface and the Internet Archive. After extensive filtering, approximately 22B tokens were compiled into the training dataset. The article notes that the model's best checkpoint has 3.3 billion parameters. pmarca said he had been looking forward to such a development since December 1, 2022. (Source: InfoQ)
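For illustration, a date-cutoff filter of the kind described above might look like the minimal sketch below. The field names ("text", "pub_date"), the whitespace-based token estimate, and the sample documents are assumptions for demonstration only, not the actual pipeline behind the model.

```python
from datetime import date

# Illustrative sketch: keep only documents published before January 1, 1900
# and roughly estimate how many tokens survive the filter.
# Field names and the whitespace tokenizer are assumptions, not the real pipeline.

CUTOFF = date(1900, 1, 1)

def filter_pre_1900(docs):
    """Keep only documents whose publication date falls before the cutoff."""
    return [d for d in docs if d["pub_date"] < CUTOFF]

def estimate_tokens(docs):
    """Rough token count via whitespace splitting; a real pipeline would use
    the model's own tokenizer."""
    return sum(len(d["text"].split()) for d in docs)

if __name__ == "__main__":
    corpus = [
        {"text": "It was the best of times, it was the worst of times...",
         "pub_date": date(1859, 4, 30)},
        {"text": "The stock market closed higher today on heavy volume.",
         "pub_date": date(1987, 10, 20)},
    ]
    kept = filter_pre_1900(corpus)
    print(f"{len(kept)} of {len(corpus)} documents kept, "
          f"~{estimate_tokens(kept)} tokens")
```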
