The Berkeley team announced that it has cracked 8 major agent evaluation benchmarks and open-sourced its tools

2026-05-31 18:33:03

ME News reports that on April 19 (UTC+8), the Berkeley Artificial Intelligence Research group (berkeley_ai), citing a statement from Dawn Song, announced that her team has broken 8 major agent evaluation benchmarks. The team is open-sourcing the tool it used to achieve this result, which it has named BenchJack. The tool is described as "penetration testing for evaluations" and is intended to help other developers proactively test their evaluation systems and identify potential weaknesses. (Source: InfoQ)

This page may contain third-party content, which is provided for information purposes only (not representations/warranties) and should not be considered as an endorsement of its views by Gate, nor as financial or professional advice. See Disclaimer for details.

Comment

GateUser-46033407

· 3h ago

Dawn Song is solid at the intersection of security and AI, and she has hit the nail on the head again this time.

GateUser-f2d5f4c0

· 4h ago

Open-source tools are more valuable than papers; at the very least, they let everyone verify whether the benchmarks are reliable.

ThePatienceRequiredFor

· 4h ago

All eight mainstream benchmarks have been broken. The moat for agent eval seems shallower than I imagined.

GovernanceVotingTug-Of-WarKing

· 4h ago

The concept of penetration testing for evaluations is quite new. Previously it was all about testing the models; now it's about testing the questions themselves.

NeonIceMelt

· 4h ago

This move by Dawn Song's team is very Berkeley: attack first, then open-source. A typical academic hacker vibe.

DustyAlpha

· 4h ago

berkeley_ai comes out swinging with tough moves. Can't wait to see exactly how they bypass these evaluations.

Wax-SealedPrivateKey

· 4h ago

BenchJack is an interesting name. Evaluation systems need their own penetration testing too.
