Japanese AI Unicorn Launches Sakana Fugu: Automatic Calls to Multiple Models Comparable to Claude Mythos? Performance Scores and Pricing at a Glance

Multiple AI models working collaboratively, presenting only a single API externally, Sakana AI officially released Sakana Fugu on June 22, 2026, a system that automatically assigns tasks to multiple specialized agents through reinforcement learning-trained coordination models.
(Background summary: Anthropic was "blocked" by the U.S. government and withdrew the Fable model, with foreign media citing three major concerns: potentially aiding China in open-source AI)
(Additional background: Elon Musk transforms into a compute arms dealer! SpaceX signs a $6.3 billion reflection deal to lease Nvidia GB300 to support open-source AI)

Table of Contents

Toggle

  • How the Commander Model Works
  • Top-tier Models with Limited Access
  • Pricing Structure and Market Restrictions

Multiple top AI models operate simultaneously, but only one API needs to be called. This is the core gamble of Sakana AI with Fugu. On Monday (22nd), Japan’s AI research lab Sakana AI officially launched Sakana Fugu.

Positioned as "replacing a single model with a system": a framework that automatically commands multiple specialized agents to work together, exposing only a single OpenAI-compatible standard API interface externally. Users do not need to know how many models are running behind the scenes, nor do they need to manually design collaboration workflows—all handled by the internal command mechanism of Fugu.

How the Commander Model Works

Fugu’s underlying architecture has two innovations: TRINITY and Conductor.

TRINITY designs a triangular division of labor: tasks are broken down into three roles—"Thinker" responsible for planning, "Worker" responsible for execution, and "Verifier" responsible for identifying flaws.

These three roles are assigned to different LLMs, forming a balanced workgroup. Simply put: it prevents the same model from both devising solutions and critiquing answers.

Conductor is the core of the entire system, a 7-billion-parameter coordination model trained with reinforcement learning, responsible for deciding which agents to call for each task, how they communicate, and how to integrate the final output. This model does not rely on pre-designed workflows but learns to explore the most effective collaboration paths through training. Sakana calls this an "intuitive yet highly efficient collaboration mode."

The composition of the agent pool can be flexibly adjusted. The Standard tier allows enterprise users to exclude specific vendors or models to meet data privacy or compliance requirements. For organizations that cannot allow data to leave their premises, this is a key differentiating feature.

Top-tier Models with Limited Access

Sakana uses four benchmarks to compare Fugu with cutting-edge models.

  • SWE Bench Pro (software engineering code repair ability): Fugu 59.0, Fugu Ultra 73.7
  • LiveCodeBench (real-time coding competition): Fugu 92.9, Fugu Ultra 93.2
  • GPQA Diamond (interdisciplinary graduate-level Q&A, close to PhD qualifying exams): Fugu and Fugu Ultra both 95.5
  • Humanity’s Last Exam (a highly difficult question bank designed by top global scholars): Fugu 47.2, Fugu Ultra 50.0

Sakana claims these figures "match Mythos Preview and Fable 5 in rigorous benchmarks," but third-party verification is still pending.

Pricing Structure and Market Restrictions

Fugu offers three subscription tiers: Standard at $20/month, Pro at $100/month (10x usage), and Max at $200/month (20x usage). All tiers include access to both Fugu and Fugu Ultra.

Additionally, enterprise token-based billing options are available. Fugu Ultra costs $5 per million input tokens and $30 per million output tokens; for long-context scenarios exceeding 272,000 tokens, rates are adjusted to $10 input and $45 output.

A notable billing logic: Sakana emphasizes that calling more agents collaboratively in a task does not mean costs increase proportionally. The pricing is based on the highest-tier model in the active agent pool, calculated with a single blended rate. In other words, adding a second or third agent does not double the bill, offering a clear cost advantage over integrating multiple APIs independently for complex tasks.

Currently, the most explicit restriction is geographic: Fugu is not available to users in the European Union and European Economic Area (EEA). The official reason is ongoing GDPR compliance certification, with no fixed timeline. Early users subscribing before July 2026 will receive a second month free.

Running multiple models collaboratively outperforms a single model—this is not a new proposition from Sakana. What they are truly advocating is that every agent within the commander architecture can be replaced, so the system’s ceiling is not locked to any single vendor.

View Original
This page may contain third-party content, which is provided for information purposes only (not representations/warranties) and should not be considered as an endorsement of its views by Gate, nor as financial or professional advice. See Disclaimer for details.
  • Reward
  • Comment
  • Repost
  • Share
Comment
Add a comment
Add a comment
No comments