Meta AI releases JEPA-WMs, joint embedding predictive world models for physical planning

ME News update: On April 3 (UTC+8), the Meta AI Research team released JEPA-WMs, joint embedding predictive world models for physical planning, along with an accompanying study. The research examines the key factors behind the models' success and provides a complete PyTorch implementation, datasets, and pretrained models. The release includes the core JEPA-WM as well as baseline models such as DINO-WM and V-JEPA-2-AC (fixed), covering multiple robot manipulation and navigation environments including DROID & RoboCasa, Metaworld, Push-T, PointMaze, and Wall. The models use visual encoders such as DINOv3 ViT-L/16, DINOv2 ViT-S/14, and V-JEPA-2 ViT-G/16, with input image resolutions of mainly 224×224 or 256×256. The project also provides an optional VM2M decoder head for visualization and trajectory decoding, but emphasizes that the decoder is not required for training a world model or for running planning evaluations. All resources are publicly available on GitHub, Hugging Face, and arXiv. (Source: InfoQ)
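
The core idea the article describes, prediction and planning carried out entirely in latent space, can be illustrated with a short sketch. Below is a minimal, hypothetical PyTorch example of a joint-embedding-predictive world-model loop: a frozen visual encoder produces latents, a small predictor rolls them forward conditioned on actions, and a random-shooting planner picks the action whose predicted rollout lands closest to the goal latent. The module names, dimensions, and the planner choice here are illustrative assumptions for exposition, not the released JEPA-WMs code.

```python
# Minimal sketch of the joint-embedding-predictive world-model idea.
# All names and sizes are illustrative assumptions, not the released code.
import torch
import torch.nn as nn


class LatentPredictor(nn.Module):
    """Predicts the next latent state from (current latent, action)."""

    def __init__(self, latent_dim: int = 384, action_dim: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + action_dim, 512),
            nn.GELU(),
            nn.Linear(512, latent_dim),
        )

    def forward(self, z: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([z, a], dim=-1))


@torch.no_grad()
def plan_random_shooting(predictor, z0, z_goal, horizon=5, n_samples=256, action_dim=4):
    """Sample candidate action sequences, roll each forward through the
    predictor, and return the first action of the sequence whose final
    predicted latent is closest (L2) to the goal latent."""
    actions = torch.randn(n_samples, horizon, action_dim)
    z = z0.expand(n_samples, -1)  # broadcast the start latent to all candidates
    for t in range(horizon):
        z = predictor(z, actions[:, t])
    costs = (z - z_goal).pow(2).sum(dim=-1)
    return actions[costs.argmin(), 0]


# In the released models, z0 and z_goal would come from a frozen encoder
# such as DINOv2 ViT-S/14 applied to 224x224 frames; random latents stand
# in for them here.
predictor = LatentPredictor()
z0, z_goal = torch.randn(1, 384), torch.randn(1, 384)
first_action = plan_random_shooting(predictor, z0, z_goal)
print(first_action.shape)  # torch.Size([4])
```

Note that no image decoder appears anywhere in this loop, which matches the article's point that the optional VM2M decoder head is useful for visualization but unnecessary for training the world model or evaluating planning.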
