ME News reports that on April 23 (UTC+8), according to monitoring by Beating, Yu Feng's team at UCSB, together with fuzz.land and other institutions, proposed AgentFlow, a system that automatically synthesizes multi-agent harnesses for vulnerability discovery. A harness here is the program that orchestrates how agents divide roles, pass information, receive tools, and retry.
The paper notes that, with the model held fixed, changing the harness alone can shift the success rate severalfold, yet existing harnesses are mostly hand-written or found by searching only a local region of the design space.
AgentFlow uses a typed graph DSL to unify the five dimensions of a harness (roles, topology, message patterns, tool binding, and coordination protocol) into a single editable graph program, so that one edit step can add or modify agents, topology, prompts, and tool sets at the same time.
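To make the idea of a harness as an editable graph program concrete, here is a minimal sketch. It is not AgentFlow's actual DSL, which the report does not show; every name in it (`Agent`, `Edge`, `Harness`, `apply_edit`, the tool names) is a hypothetical chosen for illustration. It only shows how one edit step could touch agents, topology, prompts, and tool sets together while a validation pass keeps the graph well formed.

```python
from dataclasses import dataclass, replace
from typing import Literal

# Hypothetical types: the report does not describe AgentFlow's real DSL.
MessagePattern = Literal["direct", "broadcast", "blackboard"]


@dataclass(frozen=True)
class Agent:
    role: str
    prompt: str
    tools: frozenset = frozenset()


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    pattern: MessagePattern = "direct"


@dataclass(frozen=True)
class Harness:
    agents: dict          # agent name -> Agent
    edges: frozenset      # set of Edge (topology + message pattern)
    protocol: str = "sequential"  # coordination protocol

    def validate(self) -> None:
        # Type check of the graph: every edge must connect known agents.
        for e in self.edges:
            if e.src not in self.agents or e.dst not in self.agents:
                raise ValueError(f"edge {e.src}->{e.dst} references unknown agent")


def apply_edit(h, *, add_agents=None, update_prompts=None,
               grant_tools=None, add_edges=()):
    """Apply one edit step that may change several dimensions at once."""
    agents = dict(h.agents)
    agents.update(add_agents or {})
    for name, prompt in (update_prompts or {}).items():
        agents[name] = replace(agents[name], prompt=prompt)
    for name, tools in (grant_tools or {}).items():
        agents[name] = replace(agents[name],
                               tools=agents[name].tools | frozenset(tools))
    edited = Harness(agents, h.edges | frozenset(add_edges), h.protocol)
    edited.validate()
    return edited


base = Harness(agents={"fuzzer": Agent("fuzzer", "Generate inputs.")},
               edges=frozenset())
edited = apply_edit(
    base,
    add_agents={"triager": Agent("triager", "Classify crashes.")},
    update_prompts={"fuzzer": "Generate inputs for the media decoder."},
    grant_tools={"fuzzer": {"mutator"}},
    add_edges=[Edge("fuzzer", "triager")],
)
```

Keeping the harness immutable and returning a new value from each edit is a design choice of this sketch: it lets a search procedure keep earlier candidates around and compare them.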
The outer loop localizes failures using runtime signals from the target program, such as coverage and sanitizer reports, instead of relying on binary pass/fail feedback.
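The difference between binary feedback and runtime-signal feedback can be sketched as follows. This is an illustration under assumptions, not the paper's method: `RunSignals` and `diagnose` are hypothetical names, and coverage is reduced to a set of reached function names. The only external fact relied on is that AddressSanitizer reports contain a line of the form `ERROR: AddressSanitizer: <bug-kind>`.

```python
import re
from dataclasses import dataclass


@dataclass
class RunSignals:
    """Hypothetical summary of one harness run against the target."""
    covered: set          # functions actually reached
    targets: set          # functions the harness was supposed to reach
    sanitizer_log: str    # raw sanitizer output, empty if none


# Matches the bug kind in an AddressSanitizer report header.
ASAN_RE = re.compile(r"ERROR: AddressSanitizer: ([\w-]+)")


def diagnose(sig: RunSignals) -> str:
    """Return a diagnosis string instead of a bare pass/fail bit."""
    match = ASAN_RE.search(sig.sanitizer_log)
    if match:
        return f"crash:{match.group(1)}"
    missed = sorted(sig.targets - sig.covered)
    if missed:
        # Tells the outer loop where the harness fell short.
        return "uncovered:" + ",".join(missed)
    return "covered-no-crash"
```

A diagnosis such as `uncovered:DecodeFrame` gives an outer loop something to act on (for example, which agent or prompt to revise), whereas a bare failure bit does not.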
On TerminalBench-2, AgentFlow with Claude Opus 4.6 scores 84.3% (75/89), reported as the highest score of its kind on that leaderboard.
On the Chrome codebase (35 million lines of C/C++), the system synthesizes a harness of more than 300 agents. The automatically evolved agent instructions target C++ memory safety vulnerabilities, require every crash to be verified with ASAN/UBSAN, and deduplicate findings across agents through shared documents and file locks.
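Deduplication through a shared document guarded by a file lock might look like the sketch below. The report gives no implementation details, so the file layout (one JSON record per line), the signature scheme (bug kind plus the top three stack frames), and the names `crash_signature` and `report_if_new` are all assumptions. It uses `fcntl.flock`, so it applies to POSIX systems only.

```python
import fcntl
import hashlib
import json


def crash_signature(bug_kind: str, top_frames: list) -> str:
    """Hypothetical signature: bug kind plus the top three stack frames."""
    key = bug_kind + "|" + "|".join(top_frames[:3])
    return hashlib.sha256(key.encode()).hexdigest()[:16]


def report_if_new(path: str, agent_id: str, bug_kind: str,
                  top_frames: list) -> bool:
    """Append the finding to the shared file unless it is already listed.

    Returns True if this agent was the first to report the crash.
    """
    sig = crash_signature(bug_kind, top_frames)
    with open(path, "a+", encoding="utf-8") as f:
        # Exclusive lock: only one agent reads and appends at a time.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            f.seek(0)
            lines = f.read().splitlines()
            seen = {json.loads(line)["sig"] for line in lines if line.strip()}
            if sig in seen:
                return False
            f.write(json.dumps({"sig": sig, "agent": agent_id}) + "\n")
            f.flush()
            return True
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Holding the lock across both the read and the append is what prevents two agents from each concluding that the same crash is new.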
Running the open-source model Kimi K2.5 on 192 H100 GPUs for 7 days, it discovered 10 zero-day vulnerabilities, all of which were confirmed by the Chrome VRP (Vulnerability Reward Program).
Six have been assigned CVE IDs. They affect the WebCodecs, Proxy, Network, Codecs, and Rendering components, and the bug types include use-after-free (UAF), integer overflow, and heap buffer overflow. Two of them, CVE-2026-5280 and CVE-2026-6297, are Critical-severity sandbox escapes.
fuzz.land co-founder Shou Chaofan said that some of the vulnerabilities were first discovered using MiniMax M2.5, and that most of them can also be found with MiniMax M2.5 and Opus 4.6.
AgentFlow is open-sourced.
(Source: BlockBeats)

AgentFlow synthesizes 300 agents to uncover 10 Chrome zero-day vulnerabilities, including sandbox escape.
