Anthropic’s latest model Opus 4.7 has 8 Hidden Blades
Written by: Silicon Valley Alan Walker
The eight hidden blades the launch never explained, and the tracks and industries they aim to cut
The spotlight at the launch was on SWE-bench, but the real signals were hidden in footnotes, introductory blocks, and an unassuming auto mode. The old OG breaks it all down for you before this coffee goes cold.
ZOMBIE CAFÉ · APR 16, 2026 · PALO ALTO
On California Ave in Palo Alto, the 9:30 a.m. light slants through the glass window of Coupa Café onto Alan Walker’s half-cold flat white. He has just finished browsing Anthropic’s official website; he leans back in his chair and speaks to Tony, who has just sat down across from him.
“Anthropic released Opus 4.7 this time, and the press conference was quite restrained — the main features are the SWE-bench pillars, customer quote carousel, and a beautiful alignment diagram. Most tech media just copied the press release and left.”
“But the real substance is buried in footnotes, migration guides, and a casual line like ‘auto mode extended to Max users.’ You have to read it like a 10-K — the main text is for retail investors, the footnotes are for institutions.”
“Before I finish this coffee, I’ll show you eight blades. For each one, I’ll tell you who it’s aimed at.”
—— BLADE NO. 01
xhigh is not a tier upgrade — Default has been secretly raised
The press conference briefly mentioned: “In Claude Code, we’ve raised the default effort level to xhigh for all plans.”
Most people see xhigh and think it’s just “another tier,” like an extra color for the iPhone. Wrong. The real signal is in the second half of that sentence: all plans in Claude Code now have their default tier set to xhigh.
This is a very Anthropic move: quietly raising everyone’s baseline while keeping the compute bill unchanged. It’s like getting a smarter colleague without paying anything extra.
TONY: Wait, so that means Pro users who used to pay $20 for medium now directly get xhigh?
ALAN: Exactly. And read that quote from Hex carefully — “low-effort 4.7 ≈ medium-effort 4.6.” With the default tier raised, the effective intelligence normal users get jumps two tiers. The press didn’t highlight this number because they didn’t want the token consumption page to look bad.
Implementation scenario
On Monday morning, you ask Claude Code to modify a 500-line backend module. Previously you had to manually run /effort max before it would do the job properly; now you don’t touch anything, it defaults to xhigh, and by the time you finish a coffee the work is done. The difference isn’t just 10% faster; it’s that you no longer need to care.
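The tier arithmetic behind Alan’s claim can be made concrete. A minimal sketch, assuming the four tier names are an ordered scale and using Hex’s quoted rule (“low-effort 4.7 ≈ medium-effort 4.6”) as a one-tier-up mapping; everything else here is illustrative, not Anthropic’s API:

```python
# Effort tiers as an ordered scale (names from the article; ordering assumed).
TIERS = ["low", "medium", "high", "xhigh"]

def effective_46_tier(tier_47: str) -> str:
    """Map an Opus 4.7 effort tier to its rough 4.6 equivalent, using
    Hex's quoted rule: each 4.7 tier behaves like one 4.6 tier higher
    (capped at the top of the scale)."""
    i = TIERS.index(tier_47)
    return TIERS[min(i + 1, len(TIERS) - 1)]

# A Pro user's old default was medium; the new default is xhigh.
old_default, new_default = "medium", "xhigh"
jump = TIERS.index(new_default) - TIERS.index(old_default)
print(f"default raised by {jump} raw tiers")
print(f"low-effort 4.7 ~ {effective_46_tier('low')}-effort 4.6")
```

The raw default raise (medium → xhigh) plus the one-tier equivalence is what produces the “effective intelligence jumps tiers” effect described above.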
KILL LIST
→ “AI tuning / prompt configuration” SaaS — tools that teach you to tune thinking budgets and effort levels; once the defaults are correct out of the box, the middle-layer business is gone
→ Junior engineer roles — work done at the xhigh default already matches the baseline quality of an engineer with three years of experience
→ Outsourced code review companies — the third blade will finish this off
—— BLADE NO. 02
Auto Mode — The silent revolution in Permission UI
The third footnote at the conference: “Auto mode extended to Max users.” Just one sentence.
Anthropic’s official wording: “auto mode is a new permissions option where Claude makes decisions on your behalf.” Decides. For you.
All agent startups over the past year have been caught between two extremes: either skip-all-permissions and go all-in (Devin, Cognition), or flood with approve/deny pop-ups (early Cursor). Anthropic took the third path: train the model to judge what to ask and what not to ask, internalizing this judgment into auto mode.
KAI: Alan, what’s the essential difference between this and skip permissions? Aren’t they both just letting it run?
ALAN: The difference is huge. Skip is you pulling the safety pin; if something goes wrong, that’s on you. Auto is the model shipping with its own safety system: it actively stops to ask you about dangerous operations and handles low-risk ones on its own. Essentially, it moves the entire “permission UI” layer out of the product shell and into the model weights.
TONY: So YC’s bunch of startups doing “agent governance / guardrails”…
ALAN: The product is now embedded in the model. That’s what Andrej meant last year: “the model is the product.”
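What “the permission UI moves into the weights” means mechanically can be caricatured as risk triage: instead of approving every action or none, one learned policy classifies each action and escalates only the risky ones. The categories, keyword lists, and thresholds below are invented for illustration; the real policy lives inside the model:

```python
from enum import Enum

class Decision(Enum):
    RUN = "run"          # execute without asking
    ASK = "ask"          # pause and ask the human
    REFUSE = "refuse"    # never do this unattended

# Illustrative risk policy -- the point is that the judgment is one
# policy, not a wall of approve/deny pop-ups.
DANGEROUS = ("rm -rf", "drop table", "force push")
SENSITIVE = ("deploy", "send email", "payment")

def auto_mode(action: str) -> Decision:
    """Toy stand-in for the model's internalized permission judgment."""
    a = action.lower()
    if any(p in a for p in DANGEROUS):
        return Decision.REFUSE
    if any(p in a for p in SENSITIVE):
        return Decision.ASK
    return Decision.RUN

print(auto_mode("run unit tests"))        # low risk: just runs
print(auto_mode("deploy to production"))  # sensitive: asks first
print(auto_mode("rm -rf /var/data"))      # dangerous: refuses unattended
```

Skip-permissions collapses all three outcomes into RUN; auto mode keeps the triage but makes it the model’s job.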
KILL LIST
→ Agent guardrails / approval-flow SaaS — those doing “human-AI collaborative approval platforms,” the entire category is being flattened
→ Traditional RPA (UiPath / Automation Anywhere) — their core value is “controllable automation,” and controllability is now internalized
→ Back-office data entry in the BPO industry — teams in the Philippines and India handling data input, ticket routing, and invoice reconciliation; a whole team’s workload done in one day of auto mode
—— BLADE NO. 03
/ultrareview — A kill order for Senior Engineers
Official description: “a dedicated review session that reads through changes and flags bugs and design issues that a careful reviewer would catch.”
Note that phrase: “a careful reviewer.” Not a junior, not a linter, but a “careful reviewer.” In plain language: a senior engineer.
CodeRabbit’s David Loker gives a more direct number: recall improves by more than 10%, catching the hardest bugs in the most complex PRs, with almost no drop in precision. Recall up, precision flat — in code review, this is the holy grail. The last team to achieve it was Google’s internal Tricorder, and that took ten years.
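Why “recall up, precision flat” is the holy grail becomes obvious once you write the two numbers down: a reviewer that flags more real bugs usually does so by flagging more of everything, which drowns reviewers in false positives. A minimal sketch of the metrics, with made-up before/after counts chosen to mirror the quoted improvement:

```python
def precision_recall(true_pos: int, false_pos: int, false_neg: int):
    """Precision: of the flags raised, how many were real bugs.
    Recall: of the real bugs present, how many were flagged."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall

# Hypothetical counts for 100 real bugs across a batch of PRs.
before = precision_recall(true_pos=70, false_pos=30, false_neg=30)
after  = precision_recall(true_pos=80, false_pos=34, false_neg=20)
print(f"before: precision={before[0]:.2f}, recall={before[1]:.2f}")
print(f"after:  precision={after[0]:.2f}, recall={after[1]:.2f}")
```

In this toy example recall jumps from 0.70 to 0.80 while precision stays at roughly 0.70: more real bugs caught with no extra noise per flag, which is exactly the trade-off that is normally impossible.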
MARCUS: Our FAANG staff engineers spend half their time reviewing PRs — $800K a year. If this really works…
ALAN: Pro and Max users get three free ultrareviews to test it out. It’s the Silicon Valley “freemium poisoning” tactic — let you taste it, then make it hard to go back.
MARCUS: So this isn’t just a tool, it’s a substitute.
ALAN: Not entirely. It doesn’t replace the staff engineer; it replaces the two hours every afternoon that the staff engineer spends reviewing ten PRs. With those two hours freed, the senior becomes truly senior instead of a human GitHub bot.
Implementation scenario
A tech lead on a 20-person engineering team used to spend three hours a day reviewing PRs. With /ultrareview, the tech lead only looks at the few “design issues” Claude flags in red: three hours drop to twenty minutes, and the freed time goes to architecture work. This isn’t “AI assistance”; it’s a rewrite of the job description.
KILL LIST
→ All independent AI code review startups — CodeRabbit, Codacy, Qodo, now features of Anthropic
→ Traditional static analysis / dynamic security testing tools (Snyk / Checkmarx) — rule-based static scans overtaken by “reading code like a human”
→ Outsourced code review services in India / Eastern Europe — this market was worth billions, now it’s evaporated
—— BLADE NO. 04
2,576 Pixels Vision — Computer-Use from Demo to Weapon
“Maximum accepted image long edge raised to 2,576 pixels, about 3.75 megapixels — three times the previous limit.”
This is the most underestimated point. Most people see it and think, “oh, higher resolution.” Wrong. This is the watershed that takes the computer-use category from demo to production.
The evidence sits in the quote block at the bottom of the release page, where XBOW CEO Oege de Moor gives one number: 54.5% → 98.5%. This isn’t a gradual improvement; it’s a leap from “cannot use” to “must use.” Opus 4.6 still guessed where buttons were on the screen; 4.7 can read tiny text and nested tables on dense dashboards.
SARAH: Our enterprise clients have been stuck exactly here. 4.6 could process invoice scans automatically, but half the results had errors — the boss just said, “stop playing around.”
ALAN: And now the 98.5% figure means RPA, IT operations, expense auditing, legacy-system migration — every workflow that still relies on human eyes — finally has a reliable baseline model.
KAI: Computer use is no longer just a demo video; it’s productivity.
ALAN: Yes, and note: this is a model-level upgrade, not an API parameter. Existing users change nothing; the upgrade is automatic. Anthropic is quietly pushing every integrator’s product capability upward.
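The 2,576-pixel figure is easy to sanity-check: at a 16:9 aspect ratio, a 2,576-px long edge gives 2576 × 1449 ≈ 3.73 megapixels, matching the quoted “about 3.75 megapixels.” A sketch of the client-side downscaling an integrator might do; only the limit value comes from the quote, the function and names are illustrative:

```python
MAX_LONG_EDGE = 2576  # longest-edge limit quoted for Opus 4.7

def fit_to_limit(width: int, height: int) -> tuple[int, int]:
    """Downscale (preserving aspect ratio) so the longest edge fits
    the limit; images already within the limit are left untouched."""
    long_edge = max(width, height)
    if long_edge <= MAX_LONG_EDGE:
        return width, height
    scale = MAX_LONG_EDGE / long_edge
    return round(width * scale), round(height * scale)

w, h = fit_to_limit(3840, 2160)        # a 4K dashboard screenshot
print(w, h, f"{w * h / 1e6:.2f} MP")   # 2576 1449 ~3.73 MP
```

A 4K screenshot now loses only ~33% of its linear resolution instead of being crushed to thumbnail scale, which is why dense dashboards become readable.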
KILL LIST
→ OCR / document understanding SaaS (Rossum / Hyperscience / Nanonets) — their moat was “vision + structured data,” now matched or surpassed by general models
→ Traditional RPA giants — UiPath’s core screen recognition tech, value halved overnight
→ Enterprise data entry departments — healthcare claims, bank KYC, government forms, entire manual pipelines
→ Autonomous penetration testing / red team industry — companies like XBOW benefit, traditional pentesting consulting is being disrupted
—— BLADE NO. 05
File-System Memory — Anthropic chose the simplest path
A footnote at the launch: “Opus 4.7 is better at using file system-based memory. It remembers important notes across long, multi-session work.”
OpenAI’s approach is “embedded memory”: memory baked inside the model, invisible and uneditable. Google is working on its mysterious infini-attention. Anthropic’s move: the file system is the memory. Claude writes .md notes, reads .md notes, and you can cat them anytime.
This choice looks low-tech but is actually a first-principles win. The core issue with memory isn’t storage; it’s auditability, editability, and transferability. Vector databases and embedded memory violate all three.
ERIC: Enterprise clients are most afraid of “what this AI remembers about me, and I don’t know.”
ALAN: File-system memory solves compliance directly. GDPR right to erasure? rm it. SOC 2 audit? cat it for the auditors. This isn’t a technical advantage; it’s a legal one.
ERIC: So those startups doing “AI memory layer”…
ALAN: Mem0, LangMem, Zep — they raised a lot of money this year. They solve the problem that models don’t manage their own memory; Anthropic just baked that ability into the model, on top of the plainest POSIX file system. The middle layer gets skipped.
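The whole design fits in a few lines, which is the point. A minimal sketch of file-system memory where notes are plain markdown, an audit is a directory listing, and GDPR deletion is an unlink; all class and method names here are illustrative, not Anthropic’s API:

```python
from pathlib import Path
import tempfile

class FileMemory:
    """Memory as plain .md files: auditable (ls/cat), editable, deletable (rm)."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def remember(self, topic: str, note: str) -> None:
        # Append a bullet to the topic's markdown note file.
        with (self.root / f"{topic}.md").open("a") as f:
            f.write(f"- {note}\n")

    def recall(self, topic: str) -> str:
        path = self.root / f"{topic}.md"
        return path.read_text() if path.exists() else ""

    def forget(self, topic: str) -> None:
        # GDPR right to erasure == rm
        (self.root / f"{topic}.md").unlink(missing_ok=True)

mem = FileMemory(Path(tempfile.mkdtemp()) / "notes")
mem.remember("deploy", "staging DB migrated on 2026-04-14")
print(mem.recall("deploy"))
mem.forget("deploy")
print(repr(mem.recall("deploy")))  # ''
```

Everything an auditor or a deletion request needs is an ordinary file operation, which is exactly the compliance argument Alan makes.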
KILL LIST
→ AI Memory infrastructure startups (Mem0 / LangMem / Zep) — their value proposition is internalized into the model
→ Vector-database agentic-memory use cases — Pinecone’s and Weaviate’s main narrative takes a hit
→ AI-enhanced enterprise knowledge management SaaS — no need for third-party middleware, Claude reads and writes project files directly
—— BLADE NO. 06
Task Budgets — Giving agents brakes, then opening the throttle
“Giving developers a way to guide Claude’s token spend so it can prioritize work across longer runs.” (public beta)
This was overlooked by all media, but it’s the most important engineering breakthrough for long-term agents this year.
Over the past year, every agent company has faced the same demon: token runaway on long tasks. Give Devin or Cursor a complex task and it runs for two hours, comes back saying it burned $800, and only half the work is done. Bosses see the bill and turn green.
The task-budget design is clever: it isn’t just a token cap. The model sees the remaining budget, decides which steps to skip, and works out how to maximize completion of the critical tasks.
CLAIRE: Isn’t this just “minimum viable product” thinking in project management?
ALAN: Yes. Anthropic trained the scope-cutting skill into the model itself. Give it a $10 budget to run an agent, and it will decide which features to land at 80% and which must reach 100%.
TONY: So that quote from Notion — “implicit-need tests” — will be the first to pass—
ALAN: Exactly. The model starts to have “resource awareness,” guessing what you didn’t say but expect, prioritizing within the budget. This is training “senior engineer judgment” into the model.
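The scope-cutting behavior Alan describes can be caricatured as a greedy knapsack over the remaining budget: critical steps are funded first and in full, optional steps get whatever is left, and the running balance is visible at every step. The task list, costs, and greedy policy below are all invented for illustration; the real prioritization is learned, not hand-coded:

```python
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    cost: int        # estimated tokens
    critical: bool   # must be done in full, or may be skipped

def run_with_budget(steps: list[Step], budget: int) -> list[str]:
    """Fund critical steps first, then optional ones, skipping whatever
    the remaining balance can't cover -- the agent sees the balance."""
    done, remaining = [], budget
    for step in sorted(steps, key=lambda s: not s.critical):
        if step.cost <= remaining:
            remaining -= step.cost
            done.append(step.name)
    return done

plan = [
    Step("fix failing auth test", 3000, critical=True),
    Step("write migration script", 5000, critical=True),
    Step("refactor helpers", 4000, critical=False),
    Step("polish docstrings", 2000, critical=False),
]
print(run_with_budget(plan, budget=10_000))
```

With a 10k-token budget the two critical steps land first, the 4k refactor no longer fits, and the cheaper polish slips in with the leftover: the agent delivers a trimmed but complete-where-it-matters result instead of running until the bill explodes.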
KILL LIST
→ AI cost-control / LLM observability startups (Helicone / Langfuse cost modules) — core features now native
→ Agent orchestration frameworks (parts of LangGraph / CrewAI usage) — models can plan budgets themselves, no external scheduling needed
→ Traditional consulting project management — “resource allocation + delivery trimming” intelligence now done by the model
—— BLADE NO. 07
Proof before coding — Vercel’s new behavior
Joe Haddad, Distinguished Engineer at Vercel: “It even does proofs on systems code before starting work, which is new behavior we haven’t seen from earlier Claude models.”
This line is buried among twenty quotes, and nobody amplified it. But the old OGs put down their coffee the moment they read it. ☕
“Proofs on systems code”: before writing system-level code, the model runs mathematical or formal proofs. The point isn’t raw smarts; it’s that the model verifies its own code with methods closer to a PhD thesis than to a unit test.
MARCUS: If this behavior shows up consistently, it means Anthropic explicitly rewarded “proof before code” during RL.
ALAN: Correct, it’s deliberate training. Combine Vercel’s “loop resistance” with Genspark’s “correctly reports when data is missing instead of plausible-but-incorrect fallbacks,” and you see a coherent training objective: make the model work like a trustworthy engineer.
MARCUS: Not easily fooled — meaning no self-deception.
ALAN: Exactly. Opus 4.7 no longer fabricates plausible solutions just to complete tasks. This is a real-world alignment implementation.
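A small, concrete example of what “proof before code” looks like: before writing a power-of-two ring buffer, first establish the invariant that bit-masking equals modulo, then let the implementation lean on it. The exhaustive check below is a lightweight stand-in for the formal proofs the quote describes; the buffer itself is illustrative:

```python
# Invariant to establish first: for capacity c = 2**k,
# i % c == i & (c - 1) for all non-negative integers i.
CAP = 8
assert all(i % CAP == i & (CAP - 1) for i in range(1000)), "invariant fails"

class RingBuffer:
    """Fixed-size ring buffer; the index math below is safe only
    because of the invariant checked above."""

    def __init__(self):
        self.data = [None] * CAP
        self.head = 0  # monotonically increasing write counter

    def push(self, x):
        self.data[self.head & (CAP - 1)] = x  # == self.head % CAP
        self.head += 1

buf = RingBuffer()
for i in range(10):
    buf.push(i)
print(buf.data)  # oldest two slots overwritten: [8, 9, 2, 3, 4, 5, 6, 7]
```

The order matters: the property is verified before the code that depends on it is written, which is the engineering habit the Vercel quote says the model now exhibits.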
KILL LIST
→ Parts of the formal-verification niche — the slices of high-threshold tools like Coq / Lean / TLA+ that the model can now handle itself
→ High-frequency trading / blockchain security auditing — the core work (finding invariant violations) becomes model-assisted, cutting audit costs
→ Operating-system kernels / embedded systems — niches that demand proof-based reasoning see their barriers leveled
—— BLADE NO. 08
Cyber Verification — Regulatory arbitrage window opens
“During its training we experimented with efforts to differentially reduce these capabilities.”
This is the boldest move of all. Anthropic admits that during training they actively reduced Opus 4.7’s cyber offense and defense capabilities, while the more powerful Mythos Preview was withheld entirely. Then —
They launched a Cyber Verification Program, allowing certified security researchers, pentesters, red team members to unlock higher permissions.
ERIC: Isn’t this just model version export controls?
ALAN: More precisely, “capability KYC.” The model has three capability gates; you prove your identity to unlock each level. For the first time, AI companies are openly pricing capability access.
ERIC: What does this mean for startups?
ALAN: First, any general “AI + security” startup that wants high-end scenarios must get Anthropic certification — the supply chain is now gated. Second, a new category appears: consulting firms that help you pass Anthropic certification, like today’s SOC 2 compliance shops. Third, this is Anthropic rehearsing how it will release future frontier models; Mythos will only be stricter.
TONY: So Palantir, Booz Allen, those government compliance firms…
ALAN: They get a built-in moat. They already have clearance, now they can unlock top-tier models naturally.
Implementation scenario
From Q2 2026, any YC startup pitching AI pentesting will have to answer, on page one of its business plan: “Have you obtained Anthropic Cyber Verification?” No? No VC funding. Yes? Valuation doubles. One certification becomes a watershed in the capital markets.
KILL LIST & New Tracks
→ General cybersecurity SaaS — without Anthropic certification, top-tier model capabilities are locked
→ “AI model capability compliance consulting” — within 12 months, a wave of intermediaries helping enterprises get frontier model certification
→ Traditional military / government integrators (Palantir / Booz Allen) — benefit naturally, barriers become moats
→ Open source / on-premise — Llama, Qwen, DeepSeek routes benefit, “no certification needed” as a core selling point
Alan Walker pushes the empty cup across the table and closes his MacBook.
Outside the window, the sun on California Ave has climbed over the roof of Palo Alto Creamery, casting slanting light on the glass.
“Eight blades, pointing in eight directions. Some tracks die today; some start today.”
“With each frontier model release, the real stuff isn’t in the headlines,” he says to Tony. “The press release is for analysts. The footnotes and the quote blocks are for us.”
“Don’t just watch the show.”
— Alan
END OF DISPATCH · 10:47 AM PST · CALIFORNIA AVE · © ZOMBIE CAFÉ · 2026