Anthropic's late-night bloodbath of a $50 billion industry! The end of code auditing is here

Just now, Anthropic made another move!

The creator of Claude Code officially announced: Claude Code has added a new feature: Code Review.

This time, it targets a $50 billion industry—code security auditing.

Anthropic’s newly released feature is a straightforward, bold challenge to the entire code security industry.

Some are exclaiming: The $50 billion industry has been overturned overnight by Anthropic!

Now, we can expect security stocks to plummet.

At Anthropic, nearly every PR has been run through this system.

After months of testing, the results are as follows:

  • The proportion of PRs with substantive review comments increased from 16% to 54%.
  • Engineers believe the review results are incorrect in less than 1% of cases.
  • In large Pull Requests (over 1,000 lines), 84% contained issues flagged by Claude, with an average of 7.5 problems per PR.

Currently, this feature is available as a research preview in beta for Claude Team and Enterprise plans.

A Nightmare for the $50 Billion Market

Anthropic’s product has caused a seismic shift in the global AI and application security (AppSec) communities.

Senior developers are exclaiming that the $50 billion code auditing industry has been toppled!

In the past, large companies paid traditional security vendors (like Snyk, Checkmarx, etc.) $50,000 or more annually for licenses, hiring specialized teams to scan and audit code to prevent bugs or security vulnerabilities from reaching production.

Now, Claude can deploy a team of AI agents to lurk inside your PRs, on standby 24/7.

Moreover, based on token usage, the cost per review is only about $15–25!

$50,000 versus $25—an enormous 2000-fold difference.
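As a quick sanity check on that ratio, here is the arithmetic behind the headline number (using the article's own figures; both prices are the source's estimates, not independently verified):

```python
# Cost comparison using the article's figures.
annual_license = 50_000   # traditional vendor license, USD per year
cost_per_review = 25      # upper end of Claude's per-review token cost, USD

# How many AI reviews one year's license fee would buy:
reviews_per_license = annual_license / cost_per_review
print(reviews_per_license)  # 2000.0
```

That 2000x figure compares one annual license to one single review, so the real-world gap depends on how many reviews a team actually runs per year.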

This isn’t just an update; it’s a clarion call ending traditional code auditing.

Code Review: The Most Painful Part for Developers

Ask any engineering team: what is the biggest bottleneck in software development?

Most will say: Code Review.

In recent years, AI coding capabilities have advanced rapidly—GitHub Copilot, Cursor, Claude Code, ChatGPT—developers using these tools have seen a huge increase in code output.

But here’s the problem—while code is produced at lightning speed, the number of reviewers hasn’t increased.

Anthropic found that over the past year, each engineer’s code output increased by 200%, yet many PRs are just quickly glanced at.

Even developers admit that many code reviews are just going through the motions.

As a result, many bugs, vulnerabilities, and logical issues slip into production.

That’s why many companies are willing to pay hefty sums for security scanning tools.

But here’s the catch—these tools are not very smart.

What are the problems with traditional code scanning tools?

If you’ve used traditional AppSec tools like Snyk, Checkmarx, Veracode, SonarQube, you probably feel the same: too many false positives.

These tools mostly rely on static rules and known vulnerability databases; they can scan code but can’t truly understand it.

A common scenario: the tool flags a “possible SQL injection risk,” but after checking, developers find no issue.

Gradually, warnings are ignored, and real problems often go unnoticed.

Therefore, companies still need extensive manual code reviews. Anthropic’s new approach automates this process.

Anthropic Unleashes an AI Code Review Army

The idea behind Claude Code Review is simple.

In Claude Code, the system can automatically analyze Pull Requests and check from multiple angles, such as:

  • Does the code conform to project standards?
  • Are there potential bugs?
  • Does the change conflict with existing logic?
  • Are issues from previous PRs recurring?

It outputs two results: a high-confidence summary comment and inline comments at specific code locations.

In other words, when you open a PR, you’ll see an AI review report highlighting truly important issues, not dozens of pages of boilerplate.
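The two output types described above can be pictured as a simple data model. This is an illustrative sketch only; the class and field names here are my assumptions, not Anthropic's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class InlineComment:
    """A finding anchored to a specific spot in the diff (hypothetical schema)."""
    path: str        # file the comment applies to
    line: int        # line number within the diff
    severity: str    # e.g. "blocking", "minor", "pre_existing"
    message: str     # what was flagged and why
    reasoning: str   # the collapsible extended reasoning shown on expand

@dataclass
class ReviewReport:
    """A high-confidence summary plus per-line findings (hypothetical schema)."""
    summary: str
    comments: list[InlineComment] = field(default_factory=list)

    def blocking(self) -> list[InlineComment]:
        # Only the issues that should be fixed before merging.
        return [c for c in self.comments if c.severity == "blocking"]
```

A reviewer-facing UI would render `summary` once at the top of the PR and attach each `InlineComment` to its diff line.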

The era of “AI writes code, AI reviews” has finally arrived.

Claude is beginning to close the loop, recursively reviewing its own output.

As AI becomes more powerful, humans may only need to flip the AI switch—press the Claude button on the keyboard.

Multi-Agent System: Claude Code Review Army Deploys

The biggest feature of Claude Code Review is that it’s not just one AI, but a team.

When a PR is created, the system automatically launches a team of AI agents.

According to reports, Claude’s new code review feature dispatches multiple “review agents” to work in parallel, each responsible for different types of checks.

These agents cross-verify each other's findings to filter out false positives and rank errors by severity. The final output is a comprehensive, high-signal summary comment plus inline comments targeting specific issues.

The review scale adjusts based on PR size.

Large or complex changes get more agents and deeper review; small changes get a lighter, faster pass. According to Anthropic’s tests, the average review time is about 20 minutes.

Mutual verification among the agents reduces false positives.

During this process, they focus on logical errors, security vulnerabilities, edge case flaws, and hidden regressions.
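The fan-out-and-verify pattern described above can be sketched in a few lines. This is a toy illustration of the architecture, not Anthropic's implementation; the agent focus areas, the placeholder detection rule, and the verification pass are all assumptions:

```python
import asyncio

# Each "agent" specializes in one class of check (names are illustrative).
AGENT_FOCUS = ["logic_errors", "security_vulns", "edge_cases", "regressions"]

async def run_agent(focus: str, diff: str) -> list[str]:
    # Placeholder for a model call; returns candidate findings for this focus.
    await asyncio.sleep(0)  # stand-in for async I/O to the model
    return [f"{focus}: suspicious change in {line!r}"
            for line in diff.splitlines() if "TODO" in line]

async def verify(finding: str, diff: str) -> bool:
    # A second pass that tries to confirm the finding; trivially true here.
    await asyncio.sleep(0)
    return True

async def review(diff: str) -> list[str]:
    # Fan out: all agents examine the PR in parallel.
    candidate_lists = await asyncio.gather(*(run_agent(f, diff) for f in AGENT_FOCUS))
    candidates = [c for lst in candidate_lists for c in lst]
    # Verification pass: drop findings that don't survive a second look.
    checks = await asyncio.gather(*(verify(c, diff) for c in candidates))
    return [c for c, ok in zip(candidates, checks) if ok]
```

The key design idea is that the agents run concurrently and a separate verification step filters their combined output, which is how a multi-agent pipeline can trade extra compute for fewer false positives.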

All issues are marked by severity level:

  • Red dots indicate blocking issues—bugs that should be fixed before merging;
  • Yellow dots indicate minor issues—recommend fixing but not blocking merge;
  • Purple dots indicate existing issues—bugs not introduced by this PR.
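The three tiers above map naturally onto a merge policy. A minimal sketch (the enum names are mine; the dot colors come from the article):

```python
from enum import Enum

class Severity(Enum):
    RED = "blocking"         # bugs to fix before merging
    YELLOW = "minor"         # recommended fix, doesn't block merge
    PURPLE = "pre_existing"  # bugs not introduced by this PR

def should_block_merge(findings: list[Severity]) -> bool:
    # Only red findings would warrant holding a merge. Note: per the article,
    # Claude's comments never block merges automatically; a team would have
    # to wire a policy like this into their own CI if they wanted it.
    return Severity.RED in findings
```

Usage: a CI step could collect severities from the review and call `should_block_merge` as a gate.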

Each review comment includes a collapsible extended-reasoning section.

When expanded, you can see:

  • Why Claude flagged this issue
  • How it verified the problem’s existence

Note that these comments do not automatically approve or block PR merging, so they won’t disrupt the existing review process.

By default, Claude Code Review mainly focuses on code correctness.

That is, it primarily checks:

  • Bugs that could cause production failures
  • Actual logical issues

It does not heavily focus on code style, formatting preferences, or missing tests.

To expand the scope, users need to configure settings.
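How you would expand the scope depends on your setup. One plausible approach, assuming project-level instructions in a CLAUDE.md file (CLAUDE.md is a real Claude Code convention, but the exact phrasing and its effect on Code Review are illustrative, not documented behavior):

```markdown
<!-- CLAUDE.md (project root): hypothetical review-scope instructions -->
## Code review preferences
- In addition to correctness, flag violations of our style guide.
- Call out public functions that lack test coverage.
- Prefer inline comments over summary-only feedback.
```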

Internal Testing Results Are Terrifying

Anthropic’s internal testing results are frightening—further proof that traditional code review is basically a joke.

The internal data is startling: only 16% of PRs received substantive review comments.

In large PRs over 1000 lines, 84% contained issues detected by Claude, with an average of 7.5 bugs per PR.

Why? Because engineers are too busy.

Over the past year, each engineer’s code output increased by 200%. With more code, who has time to review line by line?

After implementing this feature, the proportion of PRs with substantive fixes shot up from 16% to 54%.

This means nearly 40% of PRs with potentially problematic code were slipping past human reviewers; now Claude catches them.

Even small PRs under 50 lines, previously thought to be safe, had issues detected 31% of the time: nearly one in three small changes contained bugs.

And engineers agreed with over 99% of the issues found! Less than 1% were marked as false positives.

This accuracy surpasses most human reviewers.

Anthropic shared an internal example: a one-line change in a production service, seemingly routine and usually quickly approved, was flagged as a serious issue.

This change would cause authentication failure—a failure mode easy to overlook in diff review but obvious once pointed out.

It was fixed before merging, and engineers later said they might not have noticed it themselves.

Here’s another real case.

iXsystems, the company behind TrueNAS, used Code Review to evaluate a refactor of ZFS encryption-related code.

This was a deep technical change, reviewed by domain experts.

Unexpectedly, Code Review found a potential bug in the “adjacent code.”

The bug wasn’t in the core change itself but in code related to what was being modified: a type mismatch that could cause the encryption key cache to be silently erased during a sync.

A long-standing hidden bug, lurking unnoticed until now.

It’s almost impossible for humans to detect: it’s not in the diff and not the focus of the review, but someday it could bring down your system.

But now, Code Review has caught it.

Industry Shake-up Is Coming

Now, security companies and SaaS vendors are wailing.

How long can security firms charging $50,000 a year survive?

It’s not that their technology is bad; the business model is changing.

If Anthropic’s AI agent team can handle deep business logic security audits for just $20, who would buy traditional scanners costing tens of thousands with high false positive rates?

If you’re still manually reviewing thousands of lines of code or paying high fees for security audits, wake up—times have changed.

Tonight, AppSec stocks might really feel the chill of AI.

Source: Xinzhiyuan

