NVIDIA Open-Sources 32B Flagship Weights for Its Physical AI Reasoning Model Cosmos-Reason2

According to Beating Monitoring, NVIDIA has released the weights for the Cosmos-Reason2-32B model. Cosmos Reason 2 is a physical AI reasoning vision-language model (VLM) that NVIDIA released at the end of last year. It processes images, video, and text together, and is designed to help robots and autonomous driving systems understand space, time, and fundamental physical laws. At launch, only two smaller versions with 2 billion and 8 billion parameters were available; the 32-billion-parameter flagship is now publicly available for the first time. The model is built on Qwen3-VL-32B-Instruct from Tongyi Qianwen and is released under the NVIDIA Open Model License, which permits commercial use.

Given a driving video, it can watch and reason in real time to determine whether a right turn is safe; given a warehouse photo, it can mark the 2D/3D coordinates and bounding boxes of each item. It has three main applications: analyzing video streams of city and industrial scenes, batch-labeling sensor data, and serving as the planning brain for humanoid robots and autonomous vehicles. Compared with the previous generation, it adds object detection and precise timestamp localization, and its context window has been expanded to 256K tokens.
