NVIDIA open-sources dual-tower AI model, reporting 2.42x faster text generation while retaining 98.7% of output quality.

ME AI News: NVIDIA has released Nemotron-Labs-TwoTower, a discrete diffusion language model aimed at the slow token-by-token generation of large models. The weights are open-sourced on Hugging Face. The model reuses pre-trained weights from an existing backbone network, so it does not need to be trained from scratch, which significantly reduces cost. It uses a 60B dual-tower architecture in which two 30B networks work in parallel and in coordination; each tower activates 3B parameters and has 128 routable expert modules to improve generation efficiency. (Source: MLion)
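As a rough illustration of the parameter figures quoted above, the short Python sketch below works out the total and active parameter counts. It uses only the numbers in the brief; the assumption that both towers are active on the same step is ours and is not stated in the source.

```python
# Figures quoted in the brief, in billions of parameters.
TOWERS = 2
PARAMS_PER_TOWER_B = 30
ACTIVE_PER_TOWER_B = 3

# Total size: two 30B towers give the quoted 60B.
total_b = TOWERS * PARAMS_PER_TOWER_B

# Assumption (not from the source): both towers run on a given step,
# so the active parameters of each tower are simply added together.
active_b = TOWERS * ACTIVE_PER_TOWER_B

active_fraction = active_b / total_b

print(f"total: {total_b}B, active: {active_b}B ({active_fraction:.0%})")
```

Under that assumption, roughly a tenth of the weights take part in any single step, which is consistent with the brief's description of routable experts as an efficiency measure.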