ME News reports, May 16 (UTC+8): according to Beating Monitoring, ByteDance's Seed team has open-sourced Cola DLM.
Cola DLM is a family of continuous latent diffusion language models that tries to move beyond the fixed token-by-token generation path of large language models: it first organizes high-level semantics in a latent space and then decodes them into specific words.
The core of Cola DLM is Text VAE + block-causal DiT.
Text VAE first maps discrete text into a continuous latent space, and block-causal DiT then learns the latent prior through Flow Matching.
Finally, a conditional decoder restores the latent variables back into text.
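As a rough illustration of what "block-causal" usually means, the sketch below builds an attention mask in which positions attend freely within their own block and only backward across blocks. The block size, the function name `block_causal_mask`, and the masking convention are assumptions made for illustration; the report does not describe Cola DLM's actual attention layout.

```python
import numpy as np


def block_causal_mask(seq_len: int, block_size: int) -> np.ndarray:
    """Return a boolean (seq_len, seq_len) mask.

    mask[i, j] is True when query position i may attend to key position j.
    Assumed convention: full attention inside a block, causal across blocks.
    """
    if seq_len <= 0 or block_size <= 0:
        raise ValueError("seq_len and block_size must be positive")
    # Block index of every position, e.g. [0, 0, 1, 1, 2, 2] for block_size=2.
    block_ids = np.arange(seq_len) // block_size
    # A query may see any key whose block is the same or earlier.
    return block_ids[:, None] >= block_ids[None, :]


# Hypothetical sizes, chosen only to keep the printout small.
mask = block_causal_mask(seq_len=6, block_size=2)
print(mask.astype(int))
```

With these toy sizes, positions 0 and 1 see only the first block, positions 2 and 3 see the first two blocks, and positions 4 and 5 see everything.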
The diffusion process therefore operates on latent semantic representations rather than repeatedly denoising at the token level.
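To make the idea of learning a latent prior with Flow Matching more concrete, here is a minimal NumPy sketch of one common variant: a straight-line path between Gaussian noise and a clean latent, with the velocity along that path as the regression target. The linear path, the latent shapes, and all function names are assumptions for illustration; the report only says that Flow Matching is used and does not specify the formulation.

```python
import numpy as np


def flow_matching_targets(z1: np.ndarray, rng: np.random.Generator):
    """Build training inputs and targets for linear-path flow matching.

    z1: (batch, dim) clean latents, standing in for Text VAE outputs.
    Returns the interpolated point z_t, the time t, and the target velocity.
    """
    z0 = rng.standard_normal(z1.shape)       # noise endpoint of the path
    t = rng.uniform(size=(z1.shape[0], 1))   # one time in [0, 1) per example
    z_t = (1.0 - t) * z0 + t * z1            # point on the straight path
    target_v = z1 - z0                       # constant velocity along the path
    return z_t, t, target_v


def fm_loss(pred_v: np.ndarray, target_v: np.ndarray) -> float:
    """Mean squared error between predicted and target velocities."""
    return float(np.mean((pred_v - target_v) ** 2))


def euler_sample(velocity_fn, z0: np.ndarray, steps: int = 10) -> np.ndarray:
    """Integrate dz/dt = velocity_fn(z, t) from t=0 to t=1 with Euler steps."""
    z = z0.copy()
    dt = 1.0 / steps
    for k in range(steps):
        t = np.full((z.shape[0], 1), k * dt)
        z = z + dt * velocity_fn(z, t)
    return z


rng = np.random.default_rng(0)
z1 = rng.standard_normal((4, 8))             # toy "clean latents"
z_t, t, target_v = flow_matching_targets(z1, rng)

# A trained network would predict target_v from (z_t, t); zeros stand in here.
print("loss of an all-zero predictor:", fm_loss(np.zeros_like(target_v), target_v))

# Sanity check of the sampler: with the exact velocity, Euler lands on z1.
z0 = rng.standard_normal(z1.shape)
z_end = euler_sample(lambda z, t: z1 - z0, z0, steps=10)
```

In a full system of the kind described above, the sampled latent would then be passed to the conditional decoder to produce text; that step is omitted here.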
The open-sourced release is a 2B-class model with approximately 2.3 billion parameters in total: a core DiT with 1.8 billion parameters plus an additional 500 million parameters for the VAE.
According to the paper, on LAMBADA, MMLU, OBQA, HellaSwag, RACE, SIQA, SQuAD, and Story Cloze, evaluated under a unified generative protocol, Cola DLM shows scaling performance competitive with same-size autoregressive (AR) and LLaDA baselines and achieves the best final average score.
However, it is currently still a research checkpoint, not a directly usable dialogue model.
The official note states that this model has not undergone instruction fine-tuning or RLHF, and its main purpose is to study how continuous latent diffusion can be used for text generation.
The paper also shows preliminary experiments extending to unified modeling of text and images, but this open-source repository only includes the text pipeline.
(Source: BlockBeats)


Comment


BreadthHunter

· 7h ago

On par with AR across eight benchmarks, but without RLHF it may still fall a bit short in actual use.


VineGeometry

· 7h ago

Is the block-causal design intended for long texts or for efficiency? Hoping the paper elaborates.


GateUser-a4680931

· 7h ago

Does diffusion at the latent semantic layer produce higher quality results than AR? Waiting for actual measurements.



ByteDance open-sources Cola DLM: Redefining text generation with diffusion models
