Anthropic and 11 industry partners have launched Project Glasswing to patch the world’s critical software vulnerabilities using Mythos, its most powerful and still unreleased model.

Project Glasswing is an important step toward helping defenders establish a lasting advantage in the coming AI-driven cybersecurity era.

Author: Anthropic

Compiled by: Deep Tide TechFlow

Deep Tide Insight: Anthropic has unveiled Claude Mythos Preview, a frontier model that is not yet publicly available. Its code-auditing capabilities surpass those of the vast majority of human security experts: it can autonomously uncover zero-day vulnerabilities that have existed for decades.

Building on this capability, Anthropic has joined with 11 partners, including AWS, Apple, Google, Microsoft, and NVIDIA, to launch Project Glasswing, committing up to $100 million in usage credits. The goal is to patch vulnerabilities in the world’s critical software before attackers gain equivalent capabilities.

Introduction

Today, we are announcing Project Glasswing, a new initiative that brings together Amazon Web Services (AWS), Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks in an effort to secure the world’s most critical software.

We launched Project Glasswing because a new frontier model trained by Anthropic has demonstrated capabilities we believe could reshape the cybersecurity landscape. Claude Mythos Preview is a general-purpose, unreleased frontier model. It reveals a harsh reality: AI models’ coding ability has reached a level where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities.

Mythos Preview has already found thousands of high-severity vulnerabilities, covering every major operating system and every major browser. At the pace of AI progress, this capability will spread in the not-too-distant future and may fall into the hands of irresponsible users. The impact on the economy, public safety, and national security could be severe. Project Glasswing is an urgent attempt to prioritize using these capabilities for defense.

As part of Project Glasswing, the partners above will use Mythos Preview in their defensive security work; Anthropic will share what it has learned so the entire industry benefits. We are also opening access to more than 40 additional organizations that build or maintain critical software infrastructure, enabling them to scan and harden their own systems and open-source systems. For this, Anthropic has committed to provide up to $100 million in Mythos Preview usage credits, along with a $4 million direct donation to open-source security organizations.

Project Glasswing is just a starting point. No single organization can solve cybersecurity on its own: frontier AI developers, other software companies, security researchers, open-source maintainers, and governments around the world all have roles that are irreplaceable. Securing global cyber infrastructure may take years; meanwhile, frontier AI capabilities could make a major leap within the next few months. Cyber defenders must act now if they want to get ahead.

Cybersecurity in the AI Era

The software we rely on every day—running banking systems, storing medical records, connecting logistics networks, and keeping the power grid operating—has always had bugs. Most are inconsequential, but some are serious security flaws; once discovered, attackers can take over systems, halt operations, or steal data.

The destructive consequences of cyberattacks on enterprise networks, healthcare systems, energy infrastructure, transportation hubs, and government agencies across countries are well known. On a global scale, state-level attacks from China, Iran, North Korea, and Russia have threatened the infrastructure that supports civilian life and military readiness. Even small-scale attacks targeting a single hospital or school can cause massive economic losses, expose sensitive data, and even cost lives. It is difficult to estimate the annual economic loss from global cybercrime precisely, but it could be around $500 billion.

In the past, many software defects went undiscovered for years because finding and exploiting them required specialized knowledge held by only a very small number of security experts. With the arrival of the latest frontier AI models, the cost, effort, and expertise required to find and exploit software vulnerabilities have dropped dramatically. Over the past year, AI models have become increasingly capable at reading and reasoning about code, especially at finding vulnerabilities and crafting exploits. Claude Mythos Preview delivers a step-change improvement in these cybersecurity skills: some of the vulnerabilities it finds have survived decades of human review and millions of automated security tests, and the exploits it develops are increasingly sophisticated.

A decade after the first DARPA Cyber Grand Challenge, frontier AI models are approaching, and in some cases matching, top-tier human capability in vulnerability discovery and exploitation. Without the necessary safeguards, these powerful cyber capabilities could be used to exploit the many existing flaws in the world’s most important software. Cyberattacks would become more frequent and more destructive, and would give more power to adversaries of the United States and its allies. This is a security priority that democratic nations must take seriously.

The good news is that the very capabilities that make AI models dangerous in the wrong hands are also extremely valuable for finding and fixing important software vulnerabilities, and for helping produce new software with fewer security bugs. Project Glasswing is an important step for defenders to establish a lasting advantage in the coming AI-driven cybersecurity era.

Capabilities of Claude Mythos Preview to Find Vulnerabilities and Exploit Them

Over the past few weeks, we used Claude Mythos Preview to discover thousands of zero-day vulnerabilities (i.e., defects that the software developers previously had no knowledge of) across every major operating system, every major browser, and a range of other important software—many of them at high severity.

On the Frontier Red Team blog, we disclosed technical details of some vulnerabilities that have already been fixed, along with the exploit techniques found by Mythos Preview. Nearly all the discovery (and the development of many of the associated exploit techniques) was done entirely autonomously by the model, with no human guidance. Here are three examples:

  • Mythos Preview discovered a vulnerability in OpenBSD that had existed for 27 years. OpenBSD is well known for exceptionally strong security hardening and is widely used in firewalls and other critical infrastructure. The flaw allows an attacker to remotely crash the target machine simply by connecting to it.
  • It also found a vulnerability in FFmpeg that had existed for 16 years. FFmpeg is used by countless software applications for video codecs. The problem was in a single line of code—automated testing tools had hit that line 5 million times, yet they never found the issue.
  • The model autonomously found and chained several vulnerabilities in the Linux kernel (which runs most of the world’s servers), enabling a privilege-escalation attack from ordinary user permissions to full control of the machine.

We have reported all of the vulnerabilities above to the relevant software maintainers, and all have been fixed. For many other vulnerabilities, today we are publishing cryptographic hashes of their details (see the Red Team blog); the specifics will be made public once fixes are in place.
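Publishing a hash before the details works as a commitment: anyone can later check that the disclosed report matches what was committed to on the announcement date. The announcement does not say which hash function or format is used, so the following is only a minimal sketch, assuming SHA-256 and a random nonce; the function names `commit` and `verify` are hypothetical.

```python
import hashlib
import secrets
from typing import Optional, Tuple


def commit(report: bytes, nonce: Optional[bytes] = None) -> Tuple[str, bytes]:
    """Return (hex digest, nonce) that commits to a report without revealing it.

    Assumption: SHA-256 over nonce + report. The random nonce keeps short or
    guessable reports from being recovered by brute force.
    """
    if nonce is None:
        nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + report).hexdigest()
    return digest, nonce


def verify(report: bytes, nonce: bytes, published_digest: str) -> bool:
    """Check that a disclosed report and nonce match the published digest."""
    recomputed = hashlib.sha256(nonce + report).hexdigest()
    return secrets.compare_digest(recomputed, published_digest)


if __name__ == "__main__":
    # Hypothetical report text, for illustration only.
    report = b"Placeholder vulnerability write-up"
    digest, nonce = commit(report)
    print("publish now:", digest)
    # After the fix ships, the report and nonce are disclosed and checked.
    print("matches:", verify(report, nonce, digest))
```

Only the digest is published up front; the report and nonce stay private until the fix has shipped.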

Evaluation benchmarks such as CyberGym also confirm a significant gap between Mythos Preview and our next-strongest model, Claude Opus 4.6:

[Figure: Cybersecurity Vulnerability Reproduction (CyberGym)]

Beyond our own work, many partners have also used Claude Mythos Preview for weeks. Here is their feedback:

“AI capabilities have crossed a threshold—fundamentally changing the urgency required to protect critical infrastructure from cyber threats—and it is irreversible. Our foundational work with these models shows that security vulnerabilities in hardware and software can be identified and fixed at unprecedented speed and scale. This is a profound shift, and a clear signal: old approaches to system hardening are no longer sufficient. Technology providers must adopt new methods immediately and proactively, and customers need to prepare for deployment. This is why Cisco is joining Project Glasswing—this work is too important, too urgent, to be done alone.”

— Anthony Grieco, Senior Vice President and Chief Security and Trust Officer at Cisco

“At AWS, we build defenses before threats emerge—from custom chips to the entire technology stack. Security is not something you do at one stage; it is continuous and embedded in everything we do. Our teams analyze more than 400 trillion network traffic events every day to detect threats, and AI is at the core of our large-scale defensive capabilities. We have been testing Claude Mythos Preview in our own security operations, applying it to critical codebases—it has already been helping us harden code. We are injecting deep security expertise into our collaboration with Anthropic, and helping strengthen Claude Mythos Preview so more organizations can advance their work to the highest security standards.”

— Amy Herzog, Vice President and Chief Information Security Officer at Amazon Web Services

“When cybersecurity is no longer limited by purely human capacity, the opportunity to responsibly use AI to scale up security improvements and reduce risk is unprecedented. Joining Project Glasswing and gaining access to Claude Mythos Preview allows us to identify and mitigate risks early, strengthening our security and development solutions, and better protecting customers and Microsoft. When we tested on our open-source security benchmark CTI-REALM, Claude Mythos Preview showed a substantial improvement over prior models. We look forward to working with Anthropic and the broader industry to improve security outcomes for everyone.”

— Igor Tsyganskiy, Executive Vice President of Microsoft Cybersecurity and Research

“The window between a vulnerability being discovered and being exploited by attackers has collapsed: what used to take months can now be done in minutes with AI. Claude Mythos Preview demonstrates what is possible for defenders acting at scale, while adversaries will inevitably seek to exploit the same capabilities. This is not a reason to slow down; it is a reason to accelerate together. To deploy AI, you must have security assurances. That is why CrowdStrike has been involved from day one.”

— Elia Zaitsev, Chief Technology Officer at CrowdStrike

“Historically, security expertise has been a luxury only organizations with large security teams could afford. Open-source software maintainers—whose software underpins much of the world’s critical infrastructure—have long had to figure out security challenges largely on their own. Open-source software constitutes the vast majority of code in modern systems, including the very systems that AI agents use to write new software. By giving maintainers of these critical open-source codebases access to a new generation of AI models—capable of proactively identifying and fixing vulnerabilities at scale—Project Glasswing provides a practical path to change this situation. This is how AI-enhanced security can move from being an exclusive tool for large teams to a reliable assistant for every maintainer.”

— Jim Zemlin, CEO of the Linux Foundation

“Advancing the cybersecurity and resilience of the financial system is central to JPMorgan Chase’s mission, and we believe the industry is strongest when leading institutions come together to challenge and collaborate. Project Glasswing provides a unique early opportunity to evaluate, against our own standards, the capabilities of next-generation AI tools for defensive cybersecurity of critical infrastructure—while working side by side with respected technical leaders. We will take a rigorous, independent approach to determine how to proceed and how to help. Anthropic’s initiative reflects the forward-looking, collaborative approach this moment calls for.”

— Pat Opet, Chief Information Security Officer at JPMorgan Chase

“Google is excited to see this cross-industry cybersecurity initiative take shape, and to provide Mythos Preview to participants via Vertex AI. Collaboration across the industry on emerging security problems has always been crucial, whether it is post-quantum cryptography, responsible zero-day vulnerability disclosure, open-source software security, or AI-driven attacks that must be defended against. We have long believed that AI brings both new challenges and new opportunities for cyber defense—that is why we built AI-driven tools like Big Sleep and CodeMender to find and fix critical software vulnerabilities. We will continue investing in leading cybersecurity platforms and a culture centered on protecting users, customers, the ecosystem, and national security.”

— Heather Adkins, Vice President of Security Engineering at Google

“In the past few weeks, we have been using Claude Mythos Preview to identify complex vulnerabilities that previous generations of models completely missed. Not only does this change the rules of the game for finding hidden vulnerabilities—it also means attackers will soon be able to discover more zero-day vulnerabilities and develop exploit code faster than ever. Clearly, these models need to be put into the hands of open-source project owners and all defenders before attackers gain access. And perhaps even more importantly: everyone needs to be prepared for attackers assisted by AI. Attacks will be more frequent, faster, and more complex. Now is the time for a comprehensive upgrade to cybersecurity. We appreciate Anthropic’s collaboration with the industry to ensure these powerful capabilities are prioritized for defense.”

— Lee Klarich, Chief Product and Technology Officer at Palo Alto Networks

The powerful cybersecurity capabilities of Claude Mythos Preview come from its outstanding agentic coding and reasoning abilities. The following evaluation results show that, across multiple software coding tasks, this model achieves the highest scores among all known models.

[Figures: Agentic Coding; Reasoning; Agentic Search and Computer Use]

Notes:

  • SWE-bench Verified, Pro, and Multilingual: memorization screening flags a subset of these tasks. After excluding tasks that show signs of memorization, Mythos Preview’s advantage over Opus 4.6 remains.
  • SWE-bench Multimodal: uses an internal implementation, so scores cannot be directly compared with the public leaderboard.
  • Terminal-Bench 2.0: run with the Terminus-2 framework at the maximum-effort setting in adaptive thinking mode, with a total budget of 1 million tokens per task, a 1x guaranteed / 3x upper-bound resource allocation, and scores averaged over 5 attempts per task. After increasing the timeout limit to 4 hours and updating to Terminal-Bench 2.1, Mythos Preview scored 92.1%.
  • BrowseComp: Claude Mythos Preview scores higher than Opus 4.6 while using 4.9 times fewer tokens.
  • Humanity’s Last Exam: Mythos Preview performs well even in low-effort mode; there may be some degree of memorization.

For more information about the model’s capabilities, safety properties, and basic characteristics, please refer to the Claude Mythos Preview system card.

We do not plan to make Claude Mythos Preview generally available, but our end goal is to enable users to safely deploy Mythos-class models at scale, not only for cybersecurity but also for the many other benefits that models of this capability level can bring. To do this, we need to make progress in developing cybersecurity (and other) safeguards that can detect and block the model’s most dangerous outputs. We plan to introduce new safeguards with an upcoming Claude Opus model, so that we can improve and refine them using a model that does not carry the same level of risk as Mythos Preview.

Next Steps for Project Glasswing

Today’s release marks the beginning of a long-term effort. To succeed, it will require broad participation across and beyond the technology industry.

Project Glasswing partners will receive access to Claude Mythos Preview to discover and fix vulnerabilities and weaknesses in their foundational systems, which account for a large share of the world’s shared cyberattack surface. The work is expected to focus on local vulnerability detection, black-box testing of binaries, endpoint hardening, and penetration testing of systems.

The $100 million in model usage credits that Anthropic has committed to Project Glasswing and other participants will cover extensive usage during the research preview period. After that, Claude Mythos Preview will be made available to participants at a price of $25 per million input tokens / $125 per million output tokens (participants can access the model via the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry).
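As a rough illustration of what the quoted post-preview prices mean in practice, the sketch below estimates the cost of a single workload. Only the two per-million-token rates come from the text above; the function name and the example token counts are hypothetical.

```python
# Rates quoted in the announcement (USD per million tokens).
INPUT_USD_PER_MILLION = 25.0
OUTPUT_USD_PER_MILLION = 125.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of a workload at the quoted token prices."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical scan: 2M input tokens of code read, 400k output tokens.
    print(estimate_cost_usd(2_000_000, 400_000))  # 100.0
```

Because output tokens cost five times as much as input tokens, a workload that mostly reads code and writes short findings is much cheaper than one that generates long patches.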

In addition to the model usage credits, we have also donated $2.5 million via the Linux Foundation to Alpha-Omega and OpenSSF, and $1.5 million to the Apache Software Foundation, to help open-source software maintainers respond to this changing landscape (maintainers who are interested can apply for access through the Claude for Open Source program).

We intend to expand the scope of this effort over the coming months and to share as much of what we learn as possible so other organizations can apply it to their own security. Partners will share information and best practices with each other to the extent they are able; within 90 days, Anthropic will publish a report of our findings, along with the vulnerabilities that have been fixed and the mitigations that can be disclosed. We will also work with leading security organizations to develop a set of practical recommendations for how security practices should evolve in the AI era, potentially covering: vulnerability disclosure processes, software update processes, open-source and supply-chain security, software development lifecycle and secure-by-design practices, standards in regulated industries, triage scaling and automation, and patch automation.

Anthropic has also been discussing Claude Mythos Preview’s offensive and defensive cybersecurity capabilities with U.S. government officials. Protecting critical infrastructure is the top national security priority for democratic nations—these cyber capabilities underscore again that the United States and its allies must maintain decisive leadership in AI technology. Governments play an indispensable role in helping maintain this leadership and in assessing and mitigating national security risks associated with AI models. We are willing to work with government representatives at all levels to help with these tasks.

We hope Project Glasswing will catalyze a larger effort spanning both the private and public sectors, with everyone working together to tackle the biggest security problems posed by powerful models. We invite other members of the AI industry to join and help shape industry standards. In the medium term, an independent third-party organization, one that can bring together both private and public-sector organizations, could be an ideal home for the follow-on work of these large-scale cybersecurity initiatives.

Additional Notes

  1. The project is named after the glasswing butterfly, Greta oto. The metaphor has two layers of meaning: the butterfly’s transparent wings help it remain unseen, like the vulnerabilities hidden in code discussed in this article; the transparent wings also help it avoid harm, like the transparency approach we advocate.
  2. The word “Mythos” comes from ancient Greek, meaning “narrative” or “story”: the story systems used by civilizations to understand the world.
  3. Security professionals whose legitimate work is affected by these safety measures may apply for the upcoming Cyber Verification Program.