Meta AI releases JEPA-WMs, joint embedding predictive world models for physical planning

ME News update: On April 3 (UTC+8), the Meta AI Research team released JEPA-WMs, joint embedding predictive world models for physical planning, along with an accompanying study. The research examines the key factors behind the models' success and provides a complete PyTorch implementation, datasets, and pretrained models. The release includes the core JEPA-WM as well as baseline models such as DINO-WM and V-JEPA-2-AC (fixed), covering multiple robot manipulation and navigation environments including DROID & RoboCasa, Metaworld, Push-T, PointMaze, and Wall. The models use visual encoders such as DINOv3 ViT-L/16, DINOv2 ViT-S/14, and V-JEPA-2 ViT-G/16, with input image resolutions of mainly 224×224 or 256×256. The project also provides an optional VM2M decoder head for visualization and trajectory decoding, but emphasizes that the decoder is not required for training a world model or for running planning evaluations. All resources are publicly available on GitHub, Hugging Face, and arXiv. (Source: InfoQ)
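
The core idea the article describes, prediction and planning carried out entirely in latent space, can be illustrated with a short sketch. Below is a minimal, hypothetical PyTorch example of a joint-embedding-predictive world-model loop: a frozen visual encoder produces latents, a small predictor rolls them forward conditioned on actions, and a random-shooting planner picks the action whose predicted rollout lands closest to the goal latent. The module names, dimensions, and the planner choice here are illustrative assumptions for exposition, not the released JEPA-WMs code.

```python
# Minimal sketch of the joint-embedding-predictive world-model idea.
# All names and sizes are illustrative assumptions, not the released code.
import torch
import torch.nn as nn


class LatentPredictor(nn.Module):
    """Predicts the next latent state from (current latent, action)."""

    def __init__(self, latent_dim: int = 384, action_dim: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + action_dim, 512),
            nn.GELU(),
            nn.Linear(512, latent_dim),
        )

    def forward(self, z: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([z, a], dim=-1))


@torch.no_grad()
def plan_random_shooting(predictor, z0, z_goal, horizon=5, n_samples=256, action_dim=4):
    """Sample candidate action sequences, roll each forward through the
    predictor, and return the first action of the sequence whose final
    predicted latent is closest (L2) to the goal latent."""
    actions = torch.randn(n_samples, horizon, action_dim)
    z = z0.expand(n_samples, -1)  # broadcast the start latent to all candidates
    for t in range(horizon):
        z = predictor(z, actions[:, t])
    costs = (z - z_goal).pow(2).sum(dim=-1)
    return actions[costs.argmin(), 0]


# In the released models, z0 and z_goal would come from a frozen encoder
# such as DINOv2 ViT-S/14 applied to 224x224 frames; random latents stand
# in for them here.
predictor = LatentPredictor()
z0, z_goal = torch.randn(1, 384), torch.randn(1, 384)
first_action = plan_random_shooting(predictor, z0, z_goal)
print(first_action.shape)  # torch.Size([4])
```

Note that no image decoder appears anywhere in this loop, which matches the article's point that the optional VM2M decoder head is useful for visualization but unnecessary for training the world model or evaluating planning.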
