ME News, April 23 (UTC+8): Mila announced that its researchers will present 70 papers at ICLR 2026 in Brazil. Highlights from the first day are grouped by area below.

Model merging and fine-tuning: DisTaC achieves robust model merging by distilling conditional task vectors; one study uses epsilon scheduling to mitigate suboptimal transfer when fine-tuning non-robust pretrained models; and an oral presentation shows that a single global merging step is effective in decentralized learning.

Graph learning: GraphOmni proposes a benchmark framework for evaluating large language models on graph theory tasks, and another work clarifies misconceptions about oversmoothing in Transformers.

Reinforcement learning: SHAPO introduces sharpness-aware optimization for safe exploration; ARM-FM uses foundation models to generate reward machines automatically; a hierarchical value-decomposition method for offline reinforcement learning is applied to whole-body control; and Asymmetric Proximal Policy Optimization improves the reasoning ability of large language models by using a small critic.

Generative models: Efficient Regression-based Training of Normalizing Flows for Boltzmann Generators proposes an efficient regression-based training method; FALCON achieves few-step exact likelihood computation for continuous flows; and Contractive Diffusion Policies make action diffusion more robust through contractive score sampling.

Large language models: Landscape of Thoughts visualizes the reasoning process; one paper reframes model collapse as a feature of machine unlearning rather than a defect; Beyond Multi-Token Prediction uses future-summary pretraining; and Visual symbolic mechanisms explores symbolic processing in vision-language models.

Other highlights include SelvaBox, a high-resolution dataset for tropical tree canopy detection; µLO, on compute-efficient meta-generalization of learned optimizers; TGM, an efficient modular library for temporal graphs; and Robust Reward Modeling, which makes reward models more robust through causal rules. (Source: InfoQ)

Mila presents 70 papers at ICLR 2026, covering frontiers such as model merging and graph learning.
