WBench, recently open-sourced by Fudan University and Meituan LongCat, is quite hardcore: it puts interactive world models through 289 test cases, and its automatic metrics correlate with human blind tests at 0.94 or higher. Data speaks louder than hype.

CoinNetwork
Fudan University partners with Meituan LongCat to open-source the interactive world model benchmark WBench
Fudan University and Meituan LongCat have jointly open-sourced WBench, a benchmark for interactive world models. It includes 289 test cases and 1,058 rounds of interaction, covering first- and second-person perspectives, navigation control, subject actions, event editing, and viewpoint switching. Its 22 automatic metrics reach a correlation coefficient of ≥0.94 with human blind tests. The results show that interactive control is almost decoupled from a model's rendering quality and its physics/consistency: hy-world1.5 leads in navigation control, lingbot-world leads in consistency, and matrix-game3.0 ranks first in action navigation.
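For readers wondering what a "correlation coefficient with human blind tests" measures, here is a minimal sketch. The report does not say which coefficient WBench uses or how it is computed, so Pearson correlation is an assumption here, and the scores below are made-up illustrative numbers, not WBench data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two lists around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-model scores (NOT from WBench): one automatic metric
# and the matching average rating from human blind tests.
metric_scores = [0.31, 0.47, 0.52, 0.68, 0.83]
human_scores = [1.9, 2.6, 3.1, 3.4, 4.5]

r = pearson(metric_scores, human_scores)
print(f"correlation = {r:.3f}")
```

A value close to 1 means the automatic metric ranks models nearly the same way human raters do, which is the sense in which a threshold like 0.94 signals that the metric can stand in for human judgment.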