Tencent open-sources Hunyuan World Model 2.0: a one-sentence prompt can generate an explorable 3D world that imports directly into Unity and UE

ME News Report, April 16 (UTC+8): according to monitoring by Dongcha Beating, Tencent has officially released and open-sourced the Hunyuan 3D World Model 2.0 (HY-World 2.0).
It is a multimodal world model framework that accepts text, single images, multi-view images, and video as input. Its outputs are not videos but editable 3D assets (meshes, 3D Gaussian splats, point clouds) that can be imported directly into Unity, Unreal Engine, and NVIDIA Isaac Sim.
Model weights and code are open-sourced on GitHub and Hugging Face.
The fundamental difference from video world models such as Genie 3 and Cosmos is that those models generate pixel-level video that is gone after playback and cannot be edited, whereas HY-World 2.0 generates persistent 3D assets that support free navigation, physical collisions, and further editing.
Tencent summarizes this difference in the technical report as “watch a video and it disappears” versus “build a world that is permanently preserved.”
The generated scenes can be rendered in real time on consumer-grade GPUs, and inference requires only a single pass, unlike video world models, which must generate every frame anew.
Technically, it involves four stages: first, use HY-Pano 2.0 to generate a 360-degree panoramic image from input; then, plan the trajectory with WorldNav; next, expand the world along the trajectory with WorldStereo 2.0; finally, reconstruct all generated segments into a unified 3D scene with WorldMirror 2.0.
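The four stages form a sequential pipeline in which each stage consumes the output of the previous one. The following is only a minimal sketch of that ordering: every function, class, and label in it is a hypothetical placeholder, not the HY-World 2.0 API, and each stage is a stub that merely records that it ran.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# NOTE: all names here are hypothetical placeholders for illustration;
# they are NOT taken from the HY-World 2.0 code base.


@dataclass
class WorldState:
    """Carries the original prompt plus the artifacts produced so far."""
    prompt: str
    artifacts: List[str] = field(default_factory=list)


Stage = Callable[[WorldState], WorldState]


def make_stage(label: str) -> Stage:
    """Build a stub stage that only appends its label to the artifact list."""
    def run(state: WorldState) -> WorldState:
        return WorldState(state.prompt, state.artifacts + [label])
    return run


# Stage order as described in the article.
PIPELINE: List[Stage] = [
    make_stage("panorama"),        # HY-Pano 2.0: input -> 360-degree panorama
    make_stage("trajectory"),      # WorldNav: plan a trajectory
    make_stage("expanded_views"),  # WorldStereo 2.0: expand along the trajectory
    make_stage("3d_scene"),        # WorldMirror 2.0: reconstruct a unified 3D scene
]


def generate_world(prompt: str) -> WorldState:
    """Run every stage in order, feeding each the previous stage's state."""
    state = WorldState(prompt)
    for stage in PIPELINE:
        state = stage(state)
    return state


if __name__ == "__main__":
    print(generate_world("a quiet street at dusk").artifacts)
```

In a real system each stub would be replaced by a call into the corresponding module; the sketch only shows how the stages chain together.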
Among open-source approaches, HY-World 2.0 is claimed to be the first 3D world model to reach state-of-the-art (SOTA) level, with results comparable to the closed-source commercial product Marble.
However, currently only the code and weights for WorldMirror 2.0 (the 3D reconstruction module, about 1.2 billion parameters) are open-sourced; the code and weights for panoramic generation, trajectory planning, and world expansion modules are marked as “coming soon.”
For game developers, this means they can quickly generate level prototypes and maps with a single command, saving a lot of manual modeling time.
For embodied intelligence researchers, it greatly reduces the cost of generating simulation training environments from photos in bulk.
Tencent also launched an online experience portal where users can freely explore the generated streets and buildings by controlling a character.
(Source: BlockBeats)