One-seventh of the parameters outperform the previous generation, pretraining can achieve cross-domain generalization using only monitoring metrics and synthetic data—data efficiency surprises me more than model size.

View Original
MeNews
Time series forecasting finally runs through the Scaling Law, Datadog open-sources the largest 2.5B parameter model Toto 2
Datadog announces the open-source time series forecasting model Toto family, with five versions: 4m, 22m, 313m, 1B, 2.5B, all under Apache 2.0. Toto 2 is the first to verify the scaling law in the time series domain, the larger the scale, the stronger the prediction, with 2.5B still unsaturated; it wins on BOOM, GIFT-Eval, and TIME benchmarks. Introducing continuous graph block masking, converting autoregression to unidirectional feedforward, significantly speeds up, with 313m latency approaching Chronos-2's 120m. Pretraining only uses system monitoring metrics and synthetic data, yet still demonstrates cross-domain generalization; the 22m version beats Toto 1.0 with only one-seventh of the parameters.
This page may contain third-party content, which is provided for information purposes only (not representations/warranties) and should not be considered as an endorsement of its views by Gate, nor as financial or professional advice. See Disclaimer for details.
  • Reward
  • Comment
  • Repost
  • Share
Comment
Add a comment
Add a comment
No comments
  • Pinned