Anthropic and 11 partner organizations launch Project Glasswing to patch vulnerabilities in the world’s critical software using Mythos, its most powerful and still unreleased model.

Author: Anthropic

Translation: Deep Tide TechFlow

Deep Tide Brief: Anthropic has unveiled Claude Mythos Preview, a frontier model that is not publicly available. Its code-auditing capabilities exceed those of the vast majority of human security experts, and it can independently discover zero-day vulnerabilities that have gone unnoticed for decades.

Building on this capability, Anthropic has launched Project Glasswing with 11 partners, including AWS, Apple, Google, Microsoft, and NVIDIA. The effort includes up to $100 million in model usage credits and aims to patch vulnerabilities in the world’s critical software before attackers gain equivalent capabilities.

Introduction

Today we are announcing Project Glasswing, a new initiative bringing together Amazon Web Services (AWS), Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. The goal is to protect the security of the world’s most critical software.

We are launching Project Glasswing because a new frontier model trained by Anthropic has demonstrated capabilities we believe could reshape the cybersecurity landscape. Claude Mythos Preview is a general-purpose, as-yet-unreleased frontier model. It reveals a harsh reality: AI models’ coding abilities have already reached a level where—when it comes to finding and exploiting software vulnerabilities—they can surpass everyone except the very top experts.

Mythos Preview has already discovered thousands of high-severity vulnerabilities, spanning every major operating system and every major browser. Given the pace of AI progress, such capabilities will soon spread, possibly to actors who will not use them responsibly. The impact on the economy, public safety, and national security could be severe. Project Glasswing is an urgent effort to put these capabilities to work for defense first.

As part of Project Glasswing, the partners above will use Mythos Preview in their defensive security work; Anthropic will share what it has learned so the entire industry benefits. We are also opening access to more than 40 additional organizations that build or maintain critical software infrastructure so they can scan and harden their own systems and open-source systems. Anthropic has committed up to $100 million in Mythos Preview usage credits for this purpose, as well as $4 million in direct donations to open-source security organizations.

Project Glasswing is only a starting point. No single organization can solve cybersecurity on its own: frontier AI developers, other software companies, security researchers, open-source maintainers, and governments around the world all have irreplaceable roles. Defending global cyber infrastructure may take years, but frontier AI capabilities could make a major leap in the coming months. To stay ahead, cyber defenders must act now.

Cybersecurity in the AI Age

The software we rely on every day (running banking systems, storing medical records, connecting logistics networks, and keeping the power grid operating) has always had bugs. Most are minor and inconsequential, but some are serious security flaws that, once discovered, let attackers hijack systems, cripple operations, or steal data.

The destructive consequences of cyberattacks on enterprise networks, medical systems, energy infrastructure, transportation hubs, and government agencies are well known. At the global level, state-sponsored attacks from China, Iran, North Korea, and Russia have threatened infrastructure that supports civilian life and military readiness. Even small-scale attacks targeting a single hospital or school can cause enormous economic losses, expose sensitive data, and put lives at risk. The annual economic losses from global cybercrime are difficult to estimate precisely, but they may be on the order of $500 billion.

In the past, many software flaws went undiscovered for years because finding and exploiting them required specialized knowledge held by only a tiny number of security experts. With the emergence of the latest frontier AI models, the cost, effort, and level of expertise required to find and exploit software vulnerabilities have dropped dramatically. Over the past year, AI models have become increasingly capable at reading and reasoning about code, and in particular at discovering vulnerabilities and constructing exploitation paths. Claude Mythos Preview represents a step change across these cybersecurity skills. Some of the vulnerabilities it found had survived decades of human review and millions of automated security tests, and the exploits it develops are increasingly sophisticated.

A decade after the first DARPA Cyber Grand Challenge, frontier AI models are approaching, and in some respects matching, the best human capabilities for vulnerability discovery and exploitation. Without the necessary safeguards, these powerful cyber capabilities could be used to exploit the many existing defects in the world’s most important software. Cyberattacks would become more frequent and more destructive, and would strengthen adversaries of the United States and its allies. This is a cybersecurity priority that democracies must take seriously.

The good news is that the same capabilities that make AI dangerous in the wrong hands are also extremely valuable for discovering and fixing critical software flaws, and for producing new software with fewer security bugs. Project Glasswing is an important step toward giving defenders a lasting advantage in the coming AI-driven cybersecurity era.

Finding vulnerabilities and exploits with Claude Mythos Preview

Over the past few weeks, using Claude Mythos Preview we discovered thousands of zero-day vulnerabilities—flaws that the software developers previously did not know existed—in every major operating system, every major browser, and a range of other important software. Many of these were high-severity.

On the Frontier Red Team blog, we disclosed technical details of some vulnerabilities that have been fixed, along with the exploitation methods found by Mythos Preview. The model found nearly all of these vulnerabilities, and developed many of the related exploits, entirely autonomously, with no human guidance. Here are three examples:

Mythos Preview found a 27-year-old vulnerability in OpenBSD. OpenBSD is known for extremely high security hardening and is widely used for firewalls and other critical infrastructure. The vulnerability allows attackers to remotely crash the target machine simply by connecting to it.

It also found a 16-year-old vulnerability in FFmpeg, which countless applications use to encode and decode video. The flaw sat in a single line of code that automated testing tools had exercised 5 million times without ever detecting the problem.

The model autonomously discovered and chained together several vulnerabilities in the Linux kernel, which runs most of the world’s servers, enabling an attacker to escalate from ordinary user permissions to full control of the machine.
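The FFmpeg example shows why executing a line many times is not the same as testing it: a line can run on every test input and still fail only for one specific value. The toy sketch below illustrates that gap. It is purely hypothetical; `decode_sample`, the trigger value, and the corpus are invented for illustration and have nothing to do with FFmpeg’s actual code or the actual flaw.

```python
def decode_sample(value: int) -> int:
    """Toy decoder step (hypothetical; unrelated to FFmpeg's real code)."""
    divisor = value - 2**31      # zero for exactly one input value
    return 1_000_000 // divisor  # this line runs on every call


# A test corpus that executes the risky line 100,000 times without a failure.
executions = 0
for value in range(100_000):
    decode_sample(value)
    executions += 1
print(executions)  # 100000

# The one input that exposes the flaw was never in the corpus.
try:
    decode_sample(2**31)
except ZeroDivisionError:
    print("crash on the untested input")
```

Line coverage here is complete, yet the defect stays hidden until something reasons about which input values reach the line rather than how often the line runs.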

We have reported all of the vulnerabilities above to the relevant software maintainers, and all have been fixed. For many other vulnerabilities, we are publishing cryptographic hashes of the details today (see the Red Team blog) and will release the specifics once fixes are in place.
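Publishing a hash now and the details later is a form of cryptographic commitment: the digest proves afterwards that a report existed in its current form, without revealing its contents in the meantime. A minimal sketch in Python follows. The source does not say which hash function or format is used, so SHA-256, the random salt, and the names `commit` and `verify` are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets


def commit(report: bytes) -> tuple[str, bytes]:
    """Return (digest, salt). Publish the digest; keep the salt and report private."""
    salt = secrets.token_bytes(32)  # random salt makes short reports hard to guess
    digest = hashlib.sha256(salt + report).hexdigest()
    return digest, salt


def verify(digest: str, salt: bytes, report: bytes) -> bool:
    """Check that a revealed (salt, report) pair matches the published digest."""
    candidate = hashlib.sha256(salt + report).hexdigest()
    return hmac.compare_digest(candidate, digest)


# Hypothetical placeholder text, not a real vulnerability report.
report = b"placeholder: details of an unpatched vulnerability"
digest, salt = commit(report)
print(digest)                                      # safe to publish today
print(verify(digest, salt, report))                # True once details are revealed
print(verify(digest, salt, report + b" (edited)")) # False if the report was altered
```

Anyone holding the published digest can run `verify` after disclosure to confirm that the revealed details are the ones committed to on the announcement date.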

Benchmarks such as CyberGym also confirm a significant gap between Mythos Preview and our next most capable model, Claude Opus 4.6:

Vulnerability Reproduction - CyberGym

In addition to our own work, many partners have used Claude Mythos Preview for weeks. Here is their feedback:

“AI capabilities have crossed a threshold, fundamentally changing the urgency required to protect critical infrastructure from cyber threats—and it’s irreversible. Our foundational work with these models shows that it is possible to identify and fix security vulnerabilities in hardware and software at unprecedented speed and scale. This is a profound shift and a clear signal: the old ways of hardening systems are no longer sufficient. Technology providers must immediately and proactively adopt new methods, and customers must be ready for deployment. This is why Cisco joined Project Glasswing—this work is too important and too urgent to do alone.”

— Anthony Grieco, Senior Vice President and Chief Security and Trust Officer at Cisco

“At AWS, we build defenses before threats emerge—from custom chips to the entire technology stack. Security isn’t something you do at one stage; it’s continuous and embedded in everything we do. Our teams analyze more than 4 quadrillion network traffic events every day to detect threats, and AI is at the core of our ability to defend at massive scale. We have been testing Claude Mythos Preview in our own security operations, applying it to key codebases, and it is already helping us harden code. We are bringing deep security expertise into our collaboration with Anthropic, and helping strengthen Claude Mythos Preview so more organizations can advance their work to the highest security standards.”

— Amy Herzog, Vice President and Chief Information Security Officer at AWS

“When cybersecurity is no longer limited to purely human capabilities, the opportunity to responsibly use AI to improve security at scale and reduce risk is unprecedented. Joining Project Glasswing and getting access to Claude Mythos Preview allows us to identify and mitigate risks early, strengthening our security and development solutions to better protect customers and Microsoft. When we tested on our open-source security benchmark CTI-REALM, Claude Mythos Preview showed a substantial improvement compared with prior models. We look forward to collaborating with Anthropic and the broader industry to improve security outcomes for everyone.”

— Igor Tsyganskiy, Executive Vice President for Cybersecurity and Microsoft Research at Microsoft

“The window between a vulnerability being discovered and an attacker exploiting it has collapsed: what once took months now takes minutes with AI. Claude Mythos Preview demonstrates what large-scale action by defenders could look like, while adversaries will inevitably seek to exploit the same capabilities. This is not a reason to slow down; it is a reason to accelerate together. Deploying AI requires security assurances. That’s why CrowdStrike has been involved since Day 1.”

— Elia Zaitsev, Chief Technology Officer at CrowdStrike

“In the past, security expertise was a luxury that only organizations with large security teams could afford. Open-source maintainers—whose software underpins most of the world’s critical infrastructure—have historically had to figure out security problems for themselves. Open-source software makes up the overwhelming majority of code in modern systems, including the very systems that AI agents use to write new software. By giving maintainers of these critical open-source codebases access to a new generation of AI models—capable of proactively identifying and fixing vulnerabilities at scale—Project Glasswing provides a realistic path to change the situation. This is how AI-enhanced security can shift from being an exclusive tool for large teams to being a reliable assistant for every maintainer.”

— Jim Zemlin, CEO of the Linux Foundation

“Improving the cybersecurity and resilience of the financial system is central to JPMorgan Chase’s mission, and we believe that when leading institutions come together to challenge problems collaboratively, the industry is at its strongest. Project Glasswing provides a unique early opportunity for us to evaluate the capabilities of next-generation AI tools in defensive cybersecurity for critical infrastructure against our own standards—while fighting alongside respected technology leaders. We will take a rigorous, independent approach to determine how to proceed and how we can help. Anthropic’s initiative reflects the foresight and collaborative approach that this moment requires.”

— Pat Opet, Chief Information Security Officer at JPMorgan Chase

“Google is excited to see this cross-industry cybersecurity initiative come together, and to provide Mythos Preview to participants via Vertex AI. Collaboration across the industry on emerging security challenges has always been crucial—whether it’s post-quantum cryptography, responsible zero-day vulnerability disclosure, open-source software security, or AI-based defense against attacks. We have long believed that AI brings both new challenges and new opportunities in cyber defense—which is why we built AI-driven tools like Big Sleep and CodeMender to discover and fix critical software flaws. We will continue investing in leading cybersecurity platforms and a culture centered on protecting users, customers, ecosystems, and national security.”

— Heather Adkins, Vice President of Security Engineering at Google

“In the past few weeks, we have been using the Claude Mythos Preview model to identify complex vulnerabilities that the previous generation of models completely missed. This not only changes the game for finding hidden vulnerabilities, but also means attackers will soon be able to discover more zero-day vulnerabilities and develop exploit code faster than ever before. Clearly, these models need to be put into the hands of owners of open-source projects and all defenders to find and fix vulnerabilities before attackers gain access. Perhaps even more importantly: everyone needs to be ready for AI-assisted attackers. Attacks will be more frequent, faster, and more complex. Now is the time to comprehensively upgrade the cybersecurity ecosystem. We appreciate Anthropic’s collaboration with the industry to ensure these powerful capabilities are prioritized for defense.”

— Lee Klarich, Chief Product and Technology Officer at Palo Alto Networks

The powerful cybersecurity capabilities of Claude Mythos Preview stem from its outstanding agentic coding and reasoning abilities. The evaluation results below show that the model achieves the highest scores among all known models across multiple software coding tasks.

Agentic coding

Reasoning

Agentic search and computer use

Notes:

SWE-bench Verified, Pro, and Multilingual: A memorization filter flags some of the tasks. Even after excluding the flagged tasks, Mythos Preview’s advantage over Opus 4.6 holds.

SWE-bench Multimodal: We used an internal implementation, so the scores are not directly comparable to the public leaderboard.

Terminal-Bench 2.0: Run with the Terminus-2 harness, using adaptive thinking at maximum effort, a total budget of 1 million tokens per task, and 1x guaranteed / 3x ceiling resource allocation. Scores are averaged over 5 attempts per task. With the timeout limit raised to 4 hours and the benchmark updated to Terminal-Bench 2.1, Mythos Preview scores 92.1%.

BrowseComp: Claude Mythos Preview scores higher than Opus 4.6 while using 4.9 times fewer tokens.

Humanity’s Last Exam: Mythos Preview performs well even in low-effort mode; some degree of memorization may be involved.

For more information about the model’s capabilities, safety properties, and fundamental characteristics, see the Claude Mythos Preview system card.

We do not plan to make Claude Mythos Preview generally available, but our ultimate goal is to enable users to safely deploy Mythos-level models at scale, not only for cybersecurity but also for the many other benefits that highly capable models bring. To do this, we need to make progress on cybersecurity (and other) safeguards that can detect and block the model’s most dangerous outputs. We plan to launch new safeguards with an upcoming Claude Opus model, which will let us improve and refine them on a model that does not pose the same level of risk as Claude Mythos Preview.

Next steps for Project Glasswing

Today’s launch is the beginning of a long-term effort. Success will require broad participation across and beyond the technology industry.

Project Glasswing partners will receive access to Claude Mythos Preview to discover and fix vulnerabilities and weaknesses in their underlying systems—systems that account for a large portion of the global shared attack surface. The work is expected to focus on local vulnerability detection, binary black-box testing, endpoint hardening, and system penetration testing.

The $100 million in model usage credits that Anthropic has committed to Project Glasswing and additional participants will cover substantial usage during the research preview period. After that, Claude Mythos Preview will be offered to participants at $25 per million input tokens and $125 per million output tokens (participants can access the model via the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry).
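To make the quoted rates concrete, here is a small sketch of the per-job arithmetic. The two rates come from the text above; the function name `usage_cost_usd` and the token counts in the example are illustrative assumptions, not figures from Anthropic.

```python
INPUT_USD_PER_MILLION = 25.0    # rate quoted above
OUTPUT_USD_PER_MILLION = 125.0  # rate quoted above


def usage_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one job at the quoted post-preview rates."""
    return (input_tokens * INPUT_USD_PER_MILLION
            + output_tokens * OUTPUT_USD_PER_MILLION) / 1_000_000


# Hypothetical scan: 2 million input tokens of code, 200,000 output tokens of findings.
print(usage_cost_usd(2_000_000, 200_000))  # 75.0
```

Output tokens cost five times as much as input tokens at these rates, so for this hypothetical job the 200,000 output tokens account for a third of the total despite being a tenth of the input volume.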

In addition to model usage credits, we have also donated $2.5 million via the Linux Foundation to Alpha-Omega and OpenSSF, and $1.5 million to the Apache Software Foundation. These donations will help open-source software maintainers respond to this changing landscape (maintainers interested in access can apply through the Claude for Open Source program).

We intend for this work to expand in scope over the coming months, and we will share as much of what we learn as possible so other organizations can apply it to their own security. Partners will share information and best practices with each other to the extent they are able. Within 90 days, Anthropic will publish a report on our findings, including the fixed vulnerabilities and improvements that can be disclosed. We will also work with leading security organizations to develop practical recommendations on how security practices should evolve in the AI era. These may cover vulnerability disclosure processes, software update processes, open-source and supply-chain security, secure software development lifecycle and design practices, standards for regulated industries, scaling and automating triage, and patch automation.

Anthropic has also been discussing Claude Mythos Preview’s offensive and defensive cybersecurity capabilities with U.S. government officials. Protecting critical infrastructure is the top national security priority for democratic nations. The emergence of these cybersecurity capabilities once again underscores why the United States and its allies must maintain decisive leadership in AI technology. Governments have an indispensable role in helping maintain this leadership position and in assessing and mitigating national-security risks associated with AI models. We are eager to work with representatives from governments at all levels to support these efforts.

We hope Project Glasswing will catalyze a larger effort spanning both industry and the public sector, with everyone coming together to address the most significant security challenges posed by powerful models. We invite other members of the AI industry to join in helping define industry standards. In the medium term, an independent third-party organization, able to bring together private and public sector organizations, could be an ideal home for continuing the work of these large-scale cybersecurity projects.

Addendum

The project is named after the glasswing butterfly (Greta oto). The metaphor works on two levels: the butterfly’s transparent wings let it hide in plain sight, like the vulnerabilities hidden in code discussed in this article; and the same transparency helps it avoid harm, like the transparent approach we advocate.

The word Mythos comes from Ancient Greek, meaning “narrative” or “story”: the story systems that civilizations use to understand the world.

Security professionals whose legitimate work is affected by these safety mitigations can apply for the upcoming Cyber Verification Program.
