ME News report, April 22 (UTC+8). According to Dongcha Beating Monitoring, Hugging Face has open-sourced ml-intern, an ML research agent that can autonomously run the full workflow of reading papers, organizing datasets, launching GPU training, evaluating results, and iterating on improvements. The project is built on Hugging Face's own smolagents framework, offers both a CLI and a web interface, and has its code published on GitHub.

ml-intern’s toolchain is built around the Hugging Face ecosystem. It searches for papers on arXiv and HF Papers and follows citation chains for deeper reading; it browses datasets on the HF Hub, checks their quality, and reformats them for training; and when no GPU is available locally, it can call HF Jobs to start cloud training jobs. After training finishes, it automatically reads the evaluation outputs, diagnoses the causes of any failures, and reruns the job.

By default, it uses Claude Sonnet 4.5 to drive the decision-making loop, with up to 300 iterations per run. Contexts exceeding 170k tokens are automatically compressed.
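
The iteration cap and context compression described above can be pictured with a small control loop. This is a minimal sketch, not ml-intern's actual code: the names `run_agent`, `compress`, and `count_tokens` are hypothetical, the token counter is a crude word count rather than a real tokenizer, and the "compression" simply replaces older messages with a stub summary. Only the two numbers (300 iterations, 170k tokens) come from the report.

```python
from typing import Callable, List, Tuple

MAX_ITERATIONS = 300            # per-run cap reported in the article
CONTEXT_LIMIT_TOKENS = 170_000  # compression threshold reported in the article


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def compress(history: List[str], keep_last: int = 4) -> List[str]:
    """Hypothetical compaction: swap older messages for a one-line summary stub."""
    if len(history) <= keep_last:
        return history
    dropped = history[:-keep_last]
    summary = f"[summary of {len(dropped)} earlier messages]"
    return [summary] + history[-keep_last:]


def run_agent(
    step: Callable[[List[str]], Tuple[str, bool]],
    task: str,
    max_iterations: int = MAX_ITERATIONS,
    context_limit: int = CONTEXT_LIMIT_TOKENS,
) -> List[str]:
    """Call `step` until it reports completion or the iteration cap is reached."""
    history = [task]
    for _ in range(max_iterations):
        # Compact the context before the next model call if it has grown too large.
        if sum(count_tokens(m) for m in history) > context_limit:
            history = compress(history)
        message, done = step(history)  # `step` stands in for one model/tool turn
        history.append(message)
        if done:
            break
    return history
```

Here `step` is whatever performs one decision turn (a model call plus any tool execution) and returns the new message together with a completion flag. A real agent would summarize with the model itself rather than a stub, and would count tokens with the model's tokenizer.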

In its release post, Hugging Face provided three case studies. For scientific reasoning tasks, the agent found the OpenScience and Nemotron-CrossThink datasets by following the citation chains of benchmark papers, filtered ARC, SciQ, and MMLU by difficulty into 7 variants, and ran 12 rounds of SFT on Qwen3-1.7B. The GPQA score rose from 10% to 32% in under 10 hours.

In the medical case, the agent judged the existing datasets to be of insufficient quality, wrote scripts to generate 1,100 synthetic data points, expanded the dataset 50-fold for training, and exceeded Codex by 60% on HealthBench.

In the competitive math case, the agent wrote its own GRPO training scripts and launched training on A100 GPUs via HF Spaces. After observing reward collapse, it ran ablation experiments to identify the causes.
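
For readers unfamiliar with why a reward collapse matters in GRPO, the sketch below shows the group-relative advantage that GRPO uses as its learning signal: each sampled answer's reward is normalized against the mean and standard deviation of its group. When every answer in a group receives the same reward, all advantages become zero and the group contributes no gradient. This is only an illustration of that one failure pattern; the report does not say what caused the collapse the agent observed, and the helper names here (`group_relative_advantages`, `has_collapsed`) are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward against its own sampling group (GRPO-style)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


def has_collapsed(rewards: List[float], tol: float = 1e-8) -> bool:
    """Flag a group whose rewards are all (nearly) identical."""
    return pstdev(rewards) < tol


# A healthy group: some answers are right (1.0), some wrong (0.0).
healthy = [0.0, 1.0, 0.0, 1.0]
# A collapsed group: every answer scores the same, so there is nothing to prefer.
collapsed = [1.0, 1.0, 1.0, 1.0]

print(group_relative_advantages(healthy))
print(group_relative_advantages(collapsed), has_collapsed(collapsed))
```

Tracking the fraction of collapsed groups per training step is one simple diagnostic an ablation could compare across reward functions or sampling temperatures.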

(Source: BlockBeats)
