According to Beating Monitoring, Sand.ai, a video generation model company founded in January 2024, has announced the completion of two financing rounds totaling more than 100 million USD. Investors include Look Capital, Lollapalooza Capital (Wang Huiwen’s family office), Jiukun Venture Capital, Matrix Partners China, MSA Capital, Innovation Works, Source Code Capital, IDG, Baidu Venture Capital, and several other leading institutions. Xinghan Capital served as financial advisor for this round.

Sand.ai founder Cao Yue said in an interview that the team has long held to a non-consensus autoregressive approach to video generation rather than the mainstream diffusion route. Its previously released Magi-1 model has held the #1 position on Google DeepMind’s Physics-IQ leaderboard, which tests the physical realism of generated video.

To break the “impossible triangle” of video generation (cost, speed, and quality), Sand.ai began exploring the Mixture of Experts (MoE) architecture last year. It plans to release a next-generation MoE-based video generation model in July 2026 (Q3), aiming to combine efficient inference with the largest parameter scale currently available among open-source models, and it will open-source the model.

On the commercial side, Sand.ai is pursuing a dual-engine strategy of models and products. Its music agent product VidMuse, launched in January this year, reached 10 million USD in annual recurring revenue (ARR) within two months. In addition, its open-source MagiAttention operator library has been used by nearly all multimodal model teams in China and has been officially recommended by NVIDIA.

On the widely discussed concept of “world models”, Cao Yue believes the field is still in a pre-GPT era (before GPT-1 appeared), with neither the data nor the approaches having converged. He said video is the most important data modality on the path toward world models, and that models should learn physical laws on their own by predicting raw video observations (pixels and frames) rather than by explicitly modeling state variables through human-introduced priors.


Sand.ai secures funding of over 100 million USD: sticking to an autoregressive video route, planning to release an open-source MoE large model in July
