Everyone is token-maxxing, an arms race no one dares to stop.
Writing: Meng Xing
On the morning of March 24, 2026, I was sitting in the audience at YC W26 Demo Day, and when the fifth company took the stage for its presentation, I decided to stop taking notes.
Not because the companies were unimportant, but because I realized anything I wrote down might be outdated by next month.
Among this batch of over a hundred companies, the work they’re doing is highly concentrated: about 80% are vertical agents, such as helping lawyers organize documents, assisting customer service with ticket distribution, or screening resumes for HR.
If I had seen these projects last October, I would probably think “quite innovative.” But the problem is, in these five months, the world has changed.
Claude Code has shifted from a developer-oriented tool to an interface almost anyone can use directly. After Opus 4.6 was released, the barrier to vibe coding dropped to the floor.
Those vertical agents never had time to build business moats; now an ordinary engineer—or even I—could build any of them in a weekend. They've already lost their investment value.
YC’s project cycle is three months. This batch entered in December, plus pre-screening, means they were selected about five months ago. In the current AI iteration speed, that’s enough for several paradigm shifts to occur.
In 2012, when I first started my business and got a YC Fly Out (onsite interview invitation), YC was almost a monopoly in the accelerator track, and the companies they selected often represented “the next direction.” But the competitive landscape has changed. Over the past few years, YC has felt like a lagging indicator.
YC’s batch system—application, screening, entry, polishing, pitching—has been very successful in the mobile internet era. But this rhythm was designed for a slower world.
Reflecting on the past year and a half in venture capital, I’ve been to Silicon Valley roughly once every quarter. The last time was October last year. Previously, I felt change was rapid, but mostly perceptible month by month.
This time, it’s “weekly.”
One evening at dinner, a friend who works in post-training casually said:
“I’ve realized that even Silicon Valley is starting to fall behind itself.”
Everyone is token-maxxing: an arms race no one dares to stop
Half a year ago, if someone told me that Meta’s tens of thousands of engineers were all coding with competitors’ products, I would think they were joking.
But it’s true. The entire Meta is using Claude Code. This isn’t a startup, not an experimental team, but a trillion-dollar company.
Code security is out the window, token budgets are exploding, leaderboards are heating up—Silicon Valley is throwing money into AI at all costs. But after the spending, what then?
First, code security. Half a year ago, this was unimaginable because code is a core asset of the company. How could an outside API access it? Meta initially thought so too. They developed something called myclaw internally to address this. A Meta friend told me they built a coding product, but “it was not user-friendly, no one used it.” After no one used it, the company relaxed the rules: as long as it doesn’t involve customer data, anyone can use Claude Code.
Then various departments began holding internal meetings on "how to become an AI-native organization," running training sessions and assessments. Code security, operational safety—these once sacred red lines—were all pushed aside, with one goal: catch up on efficiency first.
For safety reasons, Google prohibits most employees from using Claude Code or Codex and other competitors’ tools, but DeepMind is an exception. Several teams responsible for the Gemini model and internal applications are using Claude Code.
Google has also made efforts: they launched an internal coding tool called Antigravity, and in February this year, claimed that about 50% of new code was AI-generated.
But even so, DeepMind’s people are still using Claude Code. One key reason is that Anthropic provided them with a private deployment, since their inference and training are primarily on Google Cloud’s TPU, establishing trust. But Meta and other tech giants don’t have this relationship—they’ve really thrown code security out the window. Everyone is betting on one thing: speed.
Code security is just the first flag to fall. The second is token budget.
In several AI-native startups I spoke with in Palo Alto, a top engineer’s annual token budget is around $200,000. This isn’t surprising in itself, but it means that the AI cost for a top engineer is approaching their salary. It looks like companies are using AI to cut costs—laying off people—but the overall costs may not have decreased; they’ve just replaced human costs with token costs.
Meta is the most extreme here. They created an internal token consumption leaderboard: those who use the most tokens top the list, and the lowest might be laid off. Meta employees even have an unofficial title: “Token Legend.”
Meanwhile, Meta has had two rounds of layoffs this year, totaling over ten thousand people. On one hand, everyone is pushing Claude Code to increase token volume; on the other, they’re downsizing massively.
These two aren’t contradictory—they are two sides of the same coin.
I visited a Series C company, and the CTO showed me Slack filled with agents running—dozens of Cursor agents in parallel, plus a Claude Code window scheduling tasks. The most common anxiety among programmers now is: before sleeping, not knowing what those agents are doing, and feeling panicked.
But has productivity really increased that much? Since late last year, CTOs at top inference-engine and database companies have excitedly told me about "hundredfold engineers" and "tenfold efficiency gains": tasks that used to take 60 people a year can now be done by 2 people plus Claude Code in a week.
I was excited with them at first, but then I calmed down and asked: OK, efficiency is up 100 times, but has the company’s revenue increased 100 times? Or has the product line expanded 100 times? It’s unlikely that a “100x” improvement would just lead to layoffs.
I didn’t get a clear answer. The fact is, a 100x efficiency gain, by the time it reaches the revenue line, shows up as only a 50% or 100% increase.
Where’s the gap? No one can clearly say yet.
“Using so many tokens, the company should have undergone a genetic mutation. But I don’t know what it’s turning into.”
A founder with B2B sales experience told me his team of 16, with two salespeople, went from zero to $30 million ARR in 12 months—entirely built with AI coding. Such cases are rare but visible. More often, I see startups building more things, but these don’t yet have product-market fit.
Silicon Valley is now experimenting with 100 approaches using vibe coding to see what works, rather than just 10. But who can grasp the next trend? It’s still very uncertain.
One of the most striking counterexamples comes from inside Anthropic. I asked an Anthropic friend: what’s the most painful scenario for your agents? He said it’s on-call (real-time response).
A typical on-call scenario: if Claude’s API suddenly slows down, a model inference node crashes, or a prompt outputs abnormally, on-call engineers need to quickly locate the problem, determine if it’s a bug, compute resource issue, or model anomaly, and then decide how to fix it.
Anthropic is the strongest company in coding agents, and this scenario is very close to their core capability. Yet, their internal on-call agents are still not very usable.
This is the real state in April 2026: the steam engine has been invented, but sometimes it’s still slower than a horse-drawn carriage. The key is, everyone knows the steam engine will eventually run faster, so they’re frantically pouring money into it: code security is ignored, token budgets explode, leaderboards heat up. But when will the steam engine truly surpass the horse? No one knows, but no one dares to stop and wait for that day.
Because the cost of stopping might be greater than burning the wrong tokens.
And token consumption probably isn’t growing linearly. This reminds me of my experience with autonomous driving: in 2021, in Shanghai, we achieved continuous 5-hour autonomous driving without intervention for the first time. It was a major breakthrough. Before that, the test fleet was slowly increasing from 10, 15, 20 cars; but after that inflection point, it quickly reached 100, then 1,000. Today’s coding agents are at a similar stage.
In Shanghai, in 2021, Didi’s autonomous driving achieved 5 hours of continuous, intervention-free driving—an industry milestone. The photo shows former Didi Autonomous COO Meng Xing, in conversation with Sebastian Thrun, “the father of Google’s self-driving cars,” in 2021.
METR, a California-based research institute evaluating AI coding ability, proposed a metric last year: how long an AI agent can complete a task with 50% success rate (based on human expert completion time). When first released in March 2025, Claude 3.7 Sonnet scored 50 minutes; by the end of 2025, Claude Opus 4.6 achieved 14.5 hours. Over the past two years, the doubling cycle of this metric has shrunk from 7 months to 4 months. Once agent reliability advances another step, token consumption won’t just increase by 50% annually—it could jump by an order of magnitude overnight.
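The compounding in METR's metric can be made concrete with a small sketch. It uses the article's figures (a 14.5-hour horizon, doubling roughly every 4 months); the 168-hour target, a full unattended work week, is my own illustrative choice:

```python
import math

def months_until_horizon(current_hours: float, target_hours: float,
                         doubling_months: float) -> float:
    """Months until the 50%-success task horizon reaches the target,
    assuming the horizon keeps doubling at a fixed cadence."""
    doublings_needed = math.log2(target_hours / current_hours)
    return doublings_needed * doubling_months

# Article's figures: 14.5-hour horizon, ~4-month doubling time.
# 168 hours (one unattended work week) is an illustrative target.
print(round(months_until_horizon(14.5, 168, 4), 1))  # ≈ 14.1 months
```

Under those assumptions, week-long agent tasks are only a year or so out, which is why token demand could jump by an order of magnitude rather than grow smoothly.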
A widely accepted prediction among friends: by the end of this year, many companies (including tech giants) will only need 20% of their staff.
The xAI avalanche: rocket builders are now making models
In a steakhouse in Mountain View, late at night, a friend who worked with Elon Musk for a long time sat across from me. We talked for over three hours. Looking back, he didn’t seem to say a single good word about Musk.
A detail: I asked him, after three years at xAI, what’s your daily rhythm? He said he’s basically been living at the company, so his home is barely decorated, and he hasn’t even bought a bed. He sleeps in a “sleeping pod” at the office, similar to a hostel. I told him, now that you hold huge equity and have left the company, at least buy a bed. He just smiled.
xAI’s work intensity is notorious in Silicon Valley, but now about 90% of the early team has left. They have a departure group, and new people are constantly joining.
The trigger was Tony Wu’s dismissal, which caused a chain reaction. An insider told me, “Other companies might need half a year to see senior management leave, but xAI only needs a month.” Some sensed Musk’s dissatisfaction as early as October last year, but no one expected such a swift purge.
Now Musk is pulling people from SpaceX and Tesla to take over xAI—“rocket builders are now making models.”
Musk’s dissatisfaction stems from pouring countless funds and computing power into it, yet Grok has never entered the front line. Why? That’s a question every xAI person I’ve met asks. The answer is simpler than I thought: the team is very capable and works extremely hard, but the management style of manufacturing may not suit large model companies.
Having worked in autonomous driving for eight years, I have some personal insights. Musk’s past ventures—SpaceX and Tesla—are fundamentally systems engineering: long chains involving software, hardware, supply chains, each with room for innovation, but ultimately an end-to-end engineering problem.
He’s good at identifying key leverage points in these long chains and compresses timelines to solve problems. Cascading rocket engines, reusable landings—these are products of this thinking.
But at xAI, he’s not doing systems engineering. He’s doing three things: pouring money into a massive GPU cluster (now jokingly called more a neo cloud than a neo lab), setting pulse-like deadlines for the team, and personally designing some product features. This is attacking individual points, not making a complete plan.
People in autonomous driving know that in later stages, the core conflict is “who leads whom” among software, infrastructure, and hardware teams. All three need CTO-level decision-makers, but no one understands all three domains simultaneously. The best approach is for founders to balance resources and set phased priorities—software first, then infrastructure. That’s what a global plan looks like.
xAI’s problem is the lack of this global plan—only sprints. If the pressure weren’t so high, smart people could self-correct, given time, and find their own collaboration rhythm. But Musk’s extreme pressure management, combined with the absence of a comprehensive plan, causes disintegration. Each leader defends their own priorities, with no one overseeing the whole.
The reason SpaceX and Tesla succeeded so well is partly because Musk rarely faced competitors of similar scale; he was competing with himself. But AI is different. It’s a fierce competition where even OpenAI could be poached by Anthropic.
One of xAI’s co-founders said last year he didn’t expect two things: first, how fierce the competition is; second, how few application innovations there are in the AI era—most are swallowed by models.
Anthropic’s rise is the most dramatic reversal in the AI industry over the past year. It has also completely shifted the battlefield: a year ago, everyone was competing over C-end user growth and video generation; now, the decisive battleground is B2B and coding.
Of course, xAI’s story is also about “too much money, too fast—what happens then.”
I believe that friends leaving xAI today won’t regret their decision. xAI has become Silicon Valley’s fastest wealth-creation myth. From its first round of hundreds of millions of dollars to today’s merger with SpaceX, forming a $250 billion giant, it took just one year. Nearly all nine co-founders became billionaires, with core engineers earning from tens of millions to over a hundred million dollars. There’s just too much money in Silicon Valley. If they start a new venture now, they have enough confidence to pursue their own interests, not just quick profits.
Anxious engineers, even more anxious researchers
Talking to engineers now, there’s a strange tacit understanding: everyone admits they don’t write much code anymore, but pretends it’s no big deal, because they’ll be the ones armed with AI, eliminating those who aren’t.
Today, 80% of software engineers’ core skills have been replaced by models. The reason they still remain is that models sometimes make mistakes, and humans need to supervise. But even that “supervision” might soon be unnecessary.
A more radical thought: today’s so-called “AI-native organizations” sound sexy—streamlining workflows, digitizing whatever tasks AI can automate. But essentially it’s human distillation: turning your capabilities into machine skills, so the company keeps the skills while AI-ifying you away. Whether this ends in layoffs is a moral question. Meta is doing this now.
Although everyone is now pushing token-maxxing, you can still feel a pervasive underlying anxiety across Silicon Valley.
What I didn’t expect is that this anxiety is spreading among researchers.
Researchers are the top-tier talent—not just “researchers,” but those responsible for training models and innovating algorithms at large model companies (OpenAI, Anthropic, DeepMind, etc.). They differ from engineers: engineers “build things,” write code, deploy, optimize; researchers “think about what to build”: proposing new training methods, designing architectures, running experiments to verify hypotheses.
Now, even researchers’ work is being automated. That’s what DeepMind colleagues are doing—using models to train models, a hot topic in AI self-evolution this year. By year’s end, engineers are being replaced, and researchers will start to be replaced too.
This isn’t a new concept. Andrej Karpathy’s auto-research (automated scientific research) pioneered this, and today, various AI scientist tools and harness frameworks are heading in this direction. But most current closed loops only reach the “publish paper” stage—AI helps run experiments and write papers, but humans still make judgments.
Companies like OpenAI, Anthropic, Google want to go further: they aim for a closed loop that directly upgrades models, not just fine-tuning but enabling AI to find paradigm-breaking breakthroughs itself. If successful, this would truly replace researchers. Over a year ago, Google DeepMind was experimenting internally, letting models decide what experiments to run, evaluate which paths are promising, and follow those—training the next generation of models themselves.
And there’s even more incentive to lay off researchers, because they’re expensive. Top researchers worldwide earn millions, tens of millions, even hundreds of millions of dollars a year.
“The future might be that ten people do the work of a hundred, earning 20% of the pay, while ninety people are unemployed.”
And layoffs are even larger than the surface numbers suggest. Many companies’ first cuts aren’t on their own financial statements but on outsourcing vendors. This means India, the Philippines, and other countries that once provided customer service, data labeling, and back-office finance for Western companies might be the first hit. The “service industry ladder” that many developing countries relied on for economic upgrading could be being pulled out from under them by AI.
Silicon Valley is watching Meta. If their experiment succeeds—no revenue loss, real efficiency gains—other giants will follow quickly, and layoffs will become industry norm. And layoffs have a brutal self-reinforcing cycle: initially, no one dares to cut, fearing morale; but once it becomes normal, cuts accelerate and become less painful.
But as old roles are cut, new ones are emerging.
Many startups are hiring for a new role called “AI builder”—a hybrid of product manager, front-end engineer, and back-end engineer. Others combine data scientists and machine learning engineers into a hybrid role, or content operators who handle writing, distribution, and operations.
Demand for these new roles in Silicon Valley is very high, but the core challenge is: no one knows how to recruit them. You can’t filter with resumes because these roles didn’t exist before; their skills are often hidden in personal projects. You can’t test with live coding because the core skill is “aesthetic + AI usage” combo. Some startups are already doing this: automatically generating simulated environments based on employer needs, then testing candidates on AI tools in real time. It’s like a new kind of coding test, but for a whole new skill set.
When AI can do everything, human value shifts from “what you can do” to “what’s worth doing and what’s not.”
Two valuations in the same funding round: Nvidia’s chips on every “table”
Having discussed so many replaced roles—engineers, researchers, finance professionals—there’s one role that not only remains untouched but is increasingly becoming the behind-the-scenes boss.
This seemingly distributed innovation world is actually highly centralized.
And that center is Nvidia.
I thought the scarcity of chips had eased over the past year. It did for a while. Around mid-2025, some neo cloud startups supported by Nvidia—specialized GPU cloud providers emerging in the AI wave—struggled to raise funds; some grew slowly or even sold. But now I see the scarcity is back, and worse than before.
A specific signal: if you can reliably provide an API service, like Claude’s API, with 99th percentile stability, you can charge two to three times the official API price.
After Anthropic’s demand surged, API outages are increasing, which is problematic for many Agent products built on Claude.
Previously, routing services won traffic by being “cheaper than official.” Now stability itself has become a scarce resource. Several startups are profiting from this, and mini versions of CoreWeave and Nebius are sprouting up across Silicon Valley like mushrooms.
And this time, the bottleneck isn’t just GPU allocation. Elad Gil recently made a point I agree with: upstream memory manufacturers (Hynix, Samsung, Micron) need at least two more years to expand capacity. That means until 2028, no AI company can significantly widen the gap just by stacking more compute. The physical manufacturing cycle is too slow, reinforcing the oligopoly in large models—not because of lack of effort, but because of physical constraints.
The underlying power structure is clear: whoever has chips is powerful; Nvidia decides who gets chips. Today, publicly listed CoreWeave, Lambda, Nebius—all backed by Nvidia.
Nvidia’s strategy is deeper than I previously understood. Reflection’s investor mentioned that when this neo lab first raised funds, it was doing coding. Then the founder met Jensen Huang, who told him: “Stop coding. Come work on ‘America’s DeepSeek,’ open-source models for America. I’ll fund you and give you chips.” Reflection then pivoted 180 degrees.
This has led to some unusual structures in the US capital market: in the same funding round, two valuation tiers are assigned. Well-connected early investors get in at a lower valuation; Nvidia, with abundant cash, and latecomers are pushed into a higher valuation. This pattern has recently appeared domestically too.
But Nvidia can’t control what doesn’t exist.
Across the US, protests against data centers are escalating. About 100 data center projects nationwide face opposition, with 40 directly canceled. Maine just passed a law banning new data centers outright. A town approved a $6 billion data center project, but half the members were recalled overnight, replaced by new officials whose only goal was to revoke the decision.
Insufficient compute isn’t due to poor products or lack of users—it’s because the physical world can’t keep up with the digital world’s appetite.
This is another level of “falling behind.”
Silicon Valley’s valuation system is being rewritten
Let’s start with a number.
The US GDP is about $30 trillion. OpenAI and Anthropic each have an annualized revenue run rate of around $30 billion—that’s about 0.1% of US GDP. If both reach $100 billion by year’s end, plus cloud and other AI revenues, AI could account for roughly 1% of US GDP. From nearly zero to 1% in just a few years.
This speed is unprecedented. But strangely, the faster the growth, the more investors are unsure how to price it—causing Silicon Valley’s valuation framework to collapse.
I’ve had several deep conversations with friends in secondary markets. A recurring term is “re-rationalization” (valuation reversion).
In recent years, AI investments were based on future cash flow: it’s okay to lose money now, because you’re betting on ARR in three or five years. But this framework is breaking down.
The problem lies in the most basic valuation model: DCF (discounted cash flow). Normally, you forecast cash flow for 10 years, then add a terminal value, assuming the company will operate stably afterward, and bundle the remaining value. Usually, terminal value accounts for 70-80% of the total valuation.
But now, two things are changing: first, you might only forecast 3 years instead of 10, because after 3 (or sometimes just 1) years, the industry’s future is unpredictable; second, calculating terminal value is even more impossible. It assumes the company will eventually stabilize, but if AI can overturn everything at any moment, “stability” is no longer a valid assumption.
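To see why a shorter forecast window makes the problem worse rather than better, here is a minimal DCF sketch. All numbers are illustrative assumptions of mine (flat $100M annual cash flow, 10% discount rate, 3% perpetual growth), not the article's; with flat cash flows the terminal share comes out lower than the 70–80% cited for growing companies, but the direction is the point:

```python
def dcf_split(cash_flow: float, horizon: int,
              discount: float, terminal_growth: float):
    """Split a DCF into explicit-forecast PV and Gordon-growth terminal PV,
    assuming a flat annual cash flow over the explicit horizon."""
    explicit_pv = sum(cash_flow / (1 + discount) ** t
                      for t in range(1, horizon + 1))
    terminal_value = (cash_flow * (1 + terminal_growth)
                      / (discount - terminal_growth))
    terminal_pv = terminal_value / (1 + discount) ** horizon
    return explicit_pv, terminal_pv

for horizon in (10, 3):
    e, t = dcf_split(100, horizon, 0.10, 0.03)
    print(f"{horizon}-year forecast: terminal value is {t/(e+t):.0%} of total")
```

Shrinking the explicit forecast from 10 years to 3 pushes the terminal value's share of total value from roughly half to over 80%: the "stability" assumption carries more weight precisely when it is least defensible.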
I told a friend in secondary investing a metaphor: companies not in the AI main track are like waiting for a “nuclear bomb”—you know they will be disrupted, just not when. So, your valuation shouldn’t focus on “what if they’re not disrupted,” but on “how fast can they respond when disrupted.” That’s a completely different valuation logic.
SaaS was the first to be re-priced by Wall Street. At its 2023 valuation, Snowflake’s free cash flow implied nearly 100 years to earn the price back. Now its valuation has halved. ServiceNow, Workday—similar trends. This is just the beginning.
In fact, only leading large model companies might still be suitable for DCF valuation, because their future seems relatively stable and upward—less likely to be “blown up,” more about how wide their boundaries can expand.
In the past, startups justified lower wages with “offering options that could be worth a lot in the future.” But that premise assumes the company will still be valuable in 15-20 years. If that’s no longer true, the most rational employee response might be: “Just give me cash now, no options.”
This, in turn, would change the company’s cost structure and financing logic.
Venture capital is also suffering. Over the past 3-6 months, nearly every fund in Silicon Valley has invested in at least one neo lab—researchers from famous AI labs raising hundreds of millions based on their ideas. But now, everyone feels it was impulsive and expensive. Why did they still invest? Because if that company actually succeeds, its growth will be so fast that the initial valuation seems cheap.
A friend investor said frankly: “It’s either zero to 100, or zero to zero. Instead of earning ‘hard money’ from a costly Series A, better to bet on a neo lab with unlimited potential.”
Previously, everyone thought 1 dollar ARR equals 1 dollar valuation, regardless of whether it’s models, applications, or infrastructure. But now, that equivalence is broken.
Vertical agents have the lowest multiples (around 5x ARR), general agents higher (around 10x), and models the highest (20-30x ARR—for example, Anthropic’s $30B ARR at an $800B valuation, 26.7x). A year ago, I thought applying a uniform multiple to ARR was enough, but today, that’s completely wrong.
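As a sanity check on those multiples, here is a rough lookup rather than a pricing model; the category labels are mine, and the frontier-model multiple is backed out from the article's Anthropic figures:

```python
# Rough ARR multiples from the article; category labels are illustrative.
ARR_MULTIPLE = {
    "vertical_agent": 5.0,    # ~5x ARR
    "general_agent": 10.0,    # ~10x ARR
    "frontier_model": 26.7,   # e.g. $30B ARR at an $800B valuation
}

def implied_valuation_billion(arr_billion: float, category: str) -> float:
    """Implied valuation in $B from ARR and the category's rough multiple."""
    return arr_billion * ARR_MULTIPLE[category]

print(round(implied_valuation_billion(30, "frontier_model")))  # ≈ 801, near the $800B cited
```

The same $1 of ARR is worth five times more inside a frontier lab than inside a vertical agent, which is exactly the sense in which the old "1 dollar ARR equals 1 dollar valuation" equivalence has broken.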
Orange Trees and the AI Assassination List
Silicon Valley is experiencing a deep crisis of security.
On this trip, I kept hearing friends seriously discussing: buying Bitcoin, building bunkers, installing bulletproof glass at home—they weren’t joking.
Recently, it’s become popular to plant orange trees because their branches have 4-inch thorns, making any intruder pay a heavy price.
The Wall Street Journal even reported a $15 million “fortress mansion”: concrete planters with orange trees, behind a moat, with laser intrusion detection, a front door with 3-inch solid steel plates and 13 locks, and a safe room with a 2,000-pound door, all designed for defense.
Companies providing residential security for CEOs are seeing their fastest growth since 2003. Especially after the CEO of UNH was shot dead on a Manhattan street, this trend accelerated sharply.
Then, the gunfire reached the homes of AI giants.
On April 11, at 4 a.m., a 20-year-old in a Champion hoodie flew from Texas to California, carrying a kerosene can, and stood in