The AI Era: The Endgame of the Token Supply-and-Demand War
Editor’s Note: As AI model capabilities continue to leap forward and tools like Claude Code and Cursor see broad enterprise adoption, industry discussion is shifting from “how powerful are the models” to “how do models enter production.” As AI programming, automated analysis, and data modeling become the new consensus, a deeper question surfaces: when the cost of execution drops rapidly, what is truly scarce: labor, capital, or access to cutting-edge models and their tokens?
This article is compiled from a conversation between Patrick O’Shaughnessy and Dylan Patel, founder of SemiAnalysis. Dylan has long focused on AI infrastructure, semiconductor supply chains, and model economics. In this dialogue, he starts from the surge in his company’s Claude Code expenses, discussing how AI is transforming enterprise organization, information services, token demand, compute supply chains, and social sentiment.
What’s most noteworthy in this conversation isn’t another model breaking benchmarks, but rather a way to understand the AI economy—viewing AI as a production system that is redistributing execution power, organizational efficiency, and industry profits, rather than just a software tool upgrade.
This conversation can be roughly understood from five perspectives.
First, the cost of execution has collapsed. Ideas were never the scarce input; the real difficulty was turning them into products, systems, and deliverable services. Now Claude Code lets non-technical staff write code, build applications, and analyze data; tasks that once required long-term maintenance by a team are completed by a few people leveraging models. SemiAnalysis’s annualized spend on Claude Code has reached $7 million, over a quarter of its salary bill, a sign that AI is no longer just a productivity tool but is becoming a new form of enterprise capital.
Second, the information service industry is the first to be rewritten. Dylan’s business essentially sells analysis, consulting, and datasets, which are among the easiest fields to commodify with AI. Reverse engineering chips, modeling energy grids, building macroeconomic indicators—tasks that previously required long-term team effort can now be completed by a few people in weeks. This means AI’s pressure on info service companies isn’t “will it replace humans,” but “who can redo competitors’ products faster.” Companies not adopting AI will be commoditized faster, and those using AI must keep raising standards to avoid being displaced by more efficient competitors.
Deeper still, tokens are becoming new means of production. Previously, companies bought software subscriptions, and the core issue was whether the tools were good enough; now, access rights to cutting-edge models, rate limits, enterprise contracts, and token budgets directly determine production capacity. Stronger models don’t necessarily mean higher costs, because smarter tokens can accomplish higher-value tasks with fewer steps. The real competition is shifting from “who uses AI” to “who can access the strongest models and allocate the most expensive tokens to the highest-value scenarios.”
This demand will continue to propagate along the entire supply chain. The surge in token usage will eventually exert ongoing pressure on GPUs, CPUs, memory, FPGAs, PCBs, copper foil, semiconductor equipment, and wafer fabs. The “bullwhip effect” mentioned in the article exemplifies this: what looks like merely increased model calls downstream can translate into orders, capacity expansion, and price hikes magnified several times upstream. Profits in the AI industry will thus not only be shared among model companies and NVIDIA but will spill over along the semiconductor and data center supply chains.
Finally, AI’s societal backlash may arrive earlier than expected. As AI truly enters workflows, public concerns about job displacement, energy consumption, data center expansion, and power concentration will rise in tandem. Dylan even predicts large-scale protests against AI could erupt within three months. For model companies, continuing to emphasize “AI will change the world” may not ease anxiety but could reinforce the public’s sense of losing control. The AI industry will need to demonstrate not just technical prowess but how it creates tangible, perceptible public value in the present.
Today, the core issue of AI is shifting from “what can models do” to “who can access models, how to use them, and who can capture the value they generate.” In this sense, the discussion isn’t just about Claude Code, Anthropic, or a single AI company, but about a structural reordering around productivity, capital expenditure, organizational efficiency, and social acceptance.
Below is the original content (reorganized for readability):
TL;DR
The core variable of AI is shifting from “can it be done” to “is it worth doing.” After the rapid drop in execution costs, what is truly scarce are high-value ideas that models can amplify.
Claude Code expenses accounting for 25% of salary costs is just the beginning; AI is transforming from a software tool into a new enterprise capital.
Competition among frontier models is no longer just capability-based but access-based; whoever can access the strongest models earliest and most stably may form new business barriers.
The info service industry will be the first to be restructured by AI, as the costs of data, analysis, and research production are rapidly declining, and slower companies will be commoditized faster.
Token demand won’t slow down due to falling prices of older models, because each new model strength releases new high-value use cases, pushing users toward more expensive, cutting-edge models.
The biggest change brought by AI isn’t reducing human work but enabling a few to produce multiple times more in the same time; those who cannot create and capture token value will be locked in a “permanent bottom layer.”
Compute shortages are spreading through the entire semiconductor supply chain, from GPUs, CPUs, and memory to PCB, copper foil, and equipment manufacturers; AI demand is becoming a price driver across the industry.
AI’s economic value is hard to capture in traditional GDP metrics; the real question isn’t just how much model companies earn, but how much “Phantom GDP” has been created: the decision-making, efficiency gains, and chain reactions driven by token generation.
Interview Transcript:
Claude Code Becomes the New Workforce
Patrick O’Shaughnessy (Host):
You told me a fascinating story about the huge change in your team’s token usage this year. Can you tell it again? What does it reveal about what’s happening in the world?
Dylan Patel (Founder of SemiAnalysis):
Last year, we thought we were already heavy users of AI. Everyone was using ChatGPT, everyone was using Claude, and I was providing my team with all kinds of subscriptions they wanted. At that time, our spending was only a few tens of thousands of dollars.
But this year, expenses started skyrocketing. The real starting point was around late December last year, with the arrival of Opus. A big part of that was Doug, our president, Douglas Lawler, who led the push for non-technical staff to write code with AI and gradually brought the whole company along. Engineers were already using it, of course, but from January this year our expenses clearly turned upward and then exploded.
Later, we signed an enterprise contract with Anthropic. Last time I talked to you, our annualized expenditure was about $5 million; now it’s $7 million.
Patrick O’Shaughnessy:
And that’s just last week’s number.
Dylan Patel:
Yes, a large part of it is usage itself. What’s really interesting is that people who never wrote code before are now using Claude Code, and some are spending thousands of dollars a day. But overall, our annual expenditure on Claude Code has reached $7 million, while our total salary expenses are about $25 million. That means Claude Code costs are over 25% of our salary bill.
If this trend continues, by the end of the year, it might even surpass total salaries. That’s a bit scary. Fortunately, I no longer need to choose between “people” and “AI,” because the company is growing fast. It’s more like: I don’t need to hire quickly, but I can spend more on AI, and it really works—our growth accelerates.
But I think other companies will eventually face the same question: if one person using Claude Code can do the work of five, ten, or even fifteen people, what then? The obvious answer is layoffs, but the more interesting point is how broad these use cases are.
For example, we have an R&D lab in Oregon that’s been running for a year and a half. It has high-end equipment like microscopes and SEMs. Its core purpose is reverse engineering chips, extracting architectures, analyzing manufacturing materials—these are also some of the data we sell.
But analyzing such data used to be very slow. Now one person on our team spent only a few thousand dollars in Claude tokens and built an application that does GPU-accelerated processing on servers we host at CoreWeave. We just send it a chip image, and it automatically marks where each material sits: copper here, tantalum there, germanium here, cobalt there. Then you can run finite element analysis on the entire chip stack very quickly, with visualization, a full graphical interface, and dashboards.
This person used to work at Intel. He said that in the past this would have been a full team’s job to build and maintain. Now one person does it, and seeing the same thing happen across the whole company is almost unbelievable.
Another example I find particularly interesting is Malcolm. He used to be an economist at a big bank. That bank’s economics department probably had 100 to 200 people. Now, he’s produced astonishing results.
He integrated various data sources, including FRED, employment reports, and APIs from other providers. We also signed contracts with some data vendors for API access. He pulled all the data in, ran regressions, analyzed how different economic changes impact inflation or deflation.
The U.S. Bureau of Labor Statistics has about 2,000 task categories. Malcolm used AI to evaluate: which tasks can now be done by AI, which cannot, and scored them based on a rubric. The result: about 3% of tasks are now AI-complete.
He created an index to measure which things can be done by AI and the deflationary impact when they are. Output might increase, but because costs drop so much, GDP could actually shrink—he calls it “Phantom GDP.”
He built a whole analytical framework around this concept, including a new benchmark with about 2,000 evaluations.
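A minimal sketch of that kind of scoring framework, with every task name, rubric score, and cost figure invented for illustration (the real analysis scored roughly 2,000 BLS task categories with a model-based rubric):

```python
# Hypothetical sketch of a task-scoring index in the spirit described above.
# All tasks, scores, and dollar amounts are made up for illustration.
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    rubric_score: float   # 0.0 (AI can't do it) .. 1.0 (AI handles it fully)
    annual_spend: float   # current human cost of the task, in dollars
    ai_cost_ratio: float  # AI cost as a fraction of the human cost

AI_COMPLETE_THRESHOLD = 0.9  # score above which a task counts as fully AI-doable

tasks = [
    Task("draft earnings summary",   0.95, 2_000_000, 0.05),
    Task("reconcile invoices",       0.92, 1_500_000, 0.04),
    Task("negotiate contracts",      0.30, 3_000_000, 0.50),
    Task("on-site equipment repair", 0.05, 4_000_000, 0.90),
]

ai_complete = [t for t in tasks if t.rubric_score >= AI_COMPLETE_THRESHOLD]
share = len(ai_complete) / len(tasks)

# "Phantom GDP": the output is still produced, but measured spend shrinks
# because the AI-complete tasks now cost a fraction of their human cost.
phantom_gdp = sum(t.annual_spend * (1 - t.ai_cost_ratio) for t in ai_complete)

print(f"AI-complete share of tasks: {share:.0%}")
print(f"Measured spend that disappears: ${phantom_gdp:,.0f}")
```

The point of the index is visible even in this toy version: output doesn’t fall, but the dollars that used to measure it do.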
Patrick O’Shaughnessy:
All of this was done by him alone?
Dylan Patel:
Yes, all by himself. He told me, “Bro, this would have taken a 200-person economist team a year to do.” Now he’s fully immersed in Claude, saying everything has changed.
Patrick O’Shaughnessy:
As a business operator, how do you interpret this? You went from almost no expenditure to nearly 25% of salaries, and it’s still rising. At what point do you think: wait, should I hit the brakes? Control costs? Maybe we don’t always need the latest frontier models like Opus 4.7, but can switch to cheaper ones?
Dylan Patel:
Ultimately, I do information business. We sell analysis, consulting, and datasets. I see no reason why these can’t be fully commoditized at a pretty fast pace.
If I don’t keep improving, the first data product I sold might now be replicated by others. We can still sell because we keep making it better and more detailed. But the way we did it in 2023 is not that different from how others are doing it now. If I don’t raise standards, I’ll be commoditized. If I don’t move fast enough, I’ll lose my advantage.
So the question is: yes, AI commoditizes many things, just like it commoditizes software. But those who act fast enough, who maintain customer relationships, provide excellent service, and keep improving, will not shrink—they will grow faster. Those who are incompetent or do nothing will lose.
It’s a survival issue: if I don’t adopt AI, others will, and they will beat me.
Another simple example is in energy. Over the past year, we had a few energy analysts trying to build an energy model. It’s very complex, and the energy data market is about $900 million—definitely a huge market I want to enter. But even after a year, we haven’t truly penetrated the energy data business.
Then, “Claude Code madness” hit. Jeremy, who manages data center energy and industrial business, started using Claude Code, and everything changed. In three weeks, he spent a lot—about $6,000 a day, which is crazy—but he mapped every power plant and high-voltage transmission line in the U.S., built a map of the entire U.S. grid from public data, and integrated many demand-side data sources.
We turned it into a dashboard to view and analyze power shortages and surpluses across regions, with many details. It was built in just a few weeks.
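As an illustration of the core computation behind such a dashboard, here is a toy regional capacity-versus-demand balance; all plant capacities and peak-demand figures below are invented, not real grid data:

```python
# Toy sketch of a regional power balance view: aggregate plant capacity
# per region and compare it against peak demand. All numbers are invented.
from collections import defaultdict

plants = [  # (region, capacity in MW)
    ("ERCOT", 60_000),
    ("ERCOT", 25_000),
    ("PJM", 180_000),
    ("CAISO", 48_000),
]
peak_demand = {"ERCOT": 82_000, "PJM": 150_000, "CAISO": 52_000}

capacity = defaultdict(float)
for region, mw in plants:
    capacity[region] += mw

for region in sorted(capacity):
    margin = capacity[region] - peak_demand[region]
    status = "surplus" if margin >= 0 else "SHORTAGE"
    print(f"{region}: {margin:+,.0f} MW ({status})")
```

The real build adds the hard parts: mapping every plant and high-voltage line from public data and joining in many demand-side sources, but the shortage/surplus view it ends with reduces to this kind of aggregation.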
Later, we showed it to some clients who already bought our data center datasets, including energy traders. They said, “Wow, how long did this take? It’s very good—better than some company that’s been doing this for ten years with 100 people.”
Of course, our current product isn’t as complete or robust, but in some aspects, it’s better. So I’m now commercializing these energy data services. But if I don’t move faster, who will commoditize me?
From a business owner’s perspective, it’s not just about “spending a lot.” Yes, I spent a lot. But what did that money bring me? Did it generate more revenue? If yes, then it’s worth it.
Patrick O’Shaughnessy:
Are you worried that, in the end, those controlling capital and responsible for investments—those who hire you—will say, “We have analysts too, and they’re smart. Why not do it ourselves?” If this becomes too easy, won’t it all flow back into investment firms, since they can leverage these insights the most?
Dylan Patel:
First, any info service business is like that: the value I get from a piece of information is obviously less than what the client gains from it.
If I sell you information for $1, you’re willing to pay because it helps you make a decision that earns you more than $1. You’re arbitraging. The money you make from me exceeds what I earn from selling the info.
Fundamentally, investment firms also have their own info capabilities. Especially firms like Jane Street, Citadel—they have very detailed, deep data analysis. But they still buy our data, and continue to do so, and our cooperation is growing.
I think there’s some “it factor” here. We act faster, more flexibly, with a smaller team, focusing on a very specific field: AI infrastructure and the huge transformations it triggers, including AI, token economy, and related areas. We see the direction earlier and build faster.
So, professional investors will try to do some of what we do themselves. But more often, they buy our data and build on it. For them, buying our data and doing further work is usually cheaper than building from scratch. Of course, eventually some will try to do it themselves.
Tokens as New Means of Production
Patrick O’Shaughnessy:
Every time I talk to you, I end up returning to the same question: token supply and demand. That’s what interests me most right now. Based on your experiences, do you have new insights into demand? When you personally feel this acutely, has your judgment on token demand changed?
Dylan Patel:
If we take a step back and look at the macro, Anthropic’s ARR might have grown from $9 billion to around $35-40 billion. By the time this episode airs, it could be $40-45 billion.
But their compute growth hasn’t matched that. Assume they haven’t cut R&D compute (they clearly haven’t, since they keep releasing new models like Metis, Opus 4, and Opus 4.7). Then even if all of their added compute were dedicated to inference, their gross margin would still be at least about 72%.
In reality, some of the new compute likely goes into R&D, so their actual gross margin could be even higher. Remember, earlier this year, some leaked parts of their funding documents showed a gross margin of just over 30%.
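The inference Dylan is sketching can be written as a one-line back-of-envelope check. The revenue and compute figures below are placeholders chosen only to reproduce a ~72% floor, not real Anthropic numbers:

```python
# Back-of-envelope version of the gross-margin floor argument above.
# All inputs are hypothetical placeholders, not actual financials.

def min_gross_margin(arr: float, inference_cost_ceiling: float) -> float:
    """Lower bound on gross margin if at most `inference_cost_ceiling`
    of compute spend serves inference (the rest would go to R&D)."""
    return 1 - inference_cost_ceiling / arr

arr = 40e9                  # assumed annualized revenue
added_compute_cost = 11.2e9 # assumed total new compute spend
# Even if ALL new compute served inference, margin stays above this floor:
print(f"{min_gross_margin(arr, added_compute_cost):.0%}")
```

If some of that compute actually funds R&D rather than inference, the true serving margin is higher than the floor, which is exactly the point made in the next paragraph.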
How can a business raise its gross margin so quickly? The main reason is that demand is extremely high. They can tighten usage quotas, rate limits, and restrictions. The key is having a dedicated account manager, an enterprise contract, and the ability to get rate-limit increases. Without that, tokens become extremely scarce.
Who can pay? They get it. Anthropic faces the same issue—though it’s just the reality of capitalism. Yes, clients might pay $40 billion annually for tokens, but the value created by those tokens far exceeds that.
Different companies derive different value from each token. But as models become smarter, what really matters is: who can access these smartest tokens and use them on the most valuable tasks?
As an individual, you decide how to use these tokens to grow your business and create value. Many want tokens and will consume them. But ordinary SaaS startups in San Francisco generating software with Claude may not create huge value. So eventually, their token prices will push them out.
Patrick O’Shaughnessy:
I encountered this myself today on my flight. When Opus 4.7 was released, I immediately wanted to use it—right away. But I was rate-limited and couldn’t. I can’t even imagine continuing with 4.6 anymore, even though I was very satisfied with it in recent weeks.
Are you surprised that people are so eager to use the most expensive, cutting-edge models?
Dylan Patel:
Not at all. One of the funniest memories in the past month and a half was me and my friend Leopold practically begging the co-founder of Anthropic for access to Metis.
We knew it existed, so we said, “Please, let us try it.” He said, “I don’t know what you’re talking about.”
Patrick O’Shaughnessy:
When the pricing or eval card comes out, what’s your reaction?
Dylan Patel:
Actually, there were rumors in the Bay Area before, and we knew it would be very powerful. Benchmark results are always changing, but Mephisto / Metis probably represents the biggest leap in model capability in the past two years.
I think it’s very important: it’s so strong that Anthropic doesn’t even want to fully release it. Although they’ve announced prices to some clients and done selective releases—targeting cybersecurity scenarios—the token cost might be 5 to 10 times higher, but they still hold back from full release, worried about real-world impacts.
What they give us now is a less capable, weaker version—Opus 4.7. And they explicitly state in the model card that they’ve intentionally made it worse in cybersecurity capabilities. Not sure if you read that part.
So I’d say: whoever you are, as long as you have enough capital, you should buy Anthropic’s enterprise subscription, pay by tokens, not the regular plan. That way, you won’t be as easily rate-limited.
And you must think carefully about how to allocate these tokens to the highest-value tasks and profit from them. Because fundamentally, maybe in a year or two, many businesses will be doing token arbitrage. Tokens are powerful, but the key is knowing where to direct them.
In three or four years, models might even know how to use tokens themselves to maximize value.
Looking back at any benchmark, you’ll see that reaching a certain capability level used to cost X, now it might only be a fraction—one percent or even a thousandth of that. For example, reaching GPT-4 level capability with DeepSeek costs about one-sixth of GPT-4. And costs for GPT-4 level models keep falling.
Of course, nobody really cares about GPT-4 anymore. What people want are frontier models, because only they can create truly economically valuable outputs. Still, GPT-4 models can be used in some smaller scenarios.
The real demand driver isn’t cheaper old capabilities but the emergence of new use cases. Today, you’re using Opus 4.6 or 4.7. A year from now, to get the same quality model, your expenditure might be only $70k—100 times cheaper.
But that’s not the point. Because then, I’ll definitely be using a stronger model to do more valuable things.
Anthropic’s Metis is more expensive as a model but consumes far fewer tokens to accomplish the same tasks. So in most cases, it’s actually cheaper than Opus 4.6.
Because it’s much more efficient. Even though each token is smarter and more expensive, it requires fewer tokens to complete tasks.
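The “smarter but pricier tokens can be cheaper in total” point is just arithmetic; here is a toy comparison, with per-token prices and token counts invented for illustration:

```python
# Toy cost comparison: a frontier model charges more per token but needs
# far fewer tokens to finish the same task. All figures are invented.

def task_cost(price_per_mtok: float, tokens_used: int) -> float:
    """Total cost of a task given price per million tokens."""
    return price_per_mtok * tokens_used / 1_000_000

cheap  = task_cost(price_per_mtok=15.0, tokens_used=2_000_000)  # older model
strong = task_cost(price_per_mtok=75.0, tokens_used=250_000)    # frontier model

print(f"older model:    ${cheap:.2f}")   # $30.00
print(f"frontier model: ${strong:.2f}")  # $18.75
```

With these made-up numbers the frontier model is 5x more expensive per token yet cheaper per task, because it uses 8x fewer tokens.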
Patrick O’Shaughnessy:
When I last saw you, Metis had just been released, or the model card just came out. You said it was so powerful it made you a bit afraid. What did you mean?
Dylan Patel:
Anthropic’s goal for 2025, even starting from 2024, is to have a model with the capabilities of an L4 software engineer by the end of 2025. Overall, they’ve basically achieved that with Opus 4.6.
But what they didn’t say is that, compared to Metis, which is even better, it’s more like an L6 engineer. L4 is a relatively junior software engineer, while L6 is quite experienced.
I remember Anthropic said internally it was available from February. That’s just two months to go from L4 to L6. What happens next?
When you think about model progress, you realize it’s actually accelerating. Anthropic’s release pace is compressing, and so is OpenAI’s. Why? Because to make better models, you need a few things.
First, you need massive compute. It’s very expensive and has its own timeline. We track these, and they are indeed growing, but in the short term, it’s mostly fixed—contracts are signed, and the capacity is set. There might be some delays or adjustments, but overall, it’s stable.
Second, you need top-tier researchers. Companies are paying tens of millions for these.
Third, you need implementation capability. Historically, this has been very hard. If I have an idea, I still need to implement it, which is tough. But now, ideas are everywhere, and implementation is very easy. It’s expensive but very easy.
So the question becomes: how does a person decide which ideas to implement? When implementation becomes so easy, you can realize more ideas and run faster on this treadmill.
This can happen in AI research, shortening the release cycle from six months to two months. It can also happen in other fields. For example, I want to model every power plant and transmission line in the U.S., run regressions, analyze regional supply and demand—now I can do that too.
Ideas are cheap. The key is: which ideas matter? Which are worth investing in, buying tokens for, and realizing? Because the ability to implement is already there. That’s the most critical change.
If the cost of implementation continues to fall—and it’s indeed falling—we haven’t even fully obtained Metis yet. Opus 4.7 was just released a few hours ago, but our team is already very excited.
What will this bring to the world? I think it will reorder how the economy operates.
In the past, execution was very important because it was hard; ideas were cheap. Now, ideas are not only cheap but abundant, and execution has also become very easy. So, only the best ideas—those that can prove, even when implementation is extremely cheap, that they are worth spending on—will survive.
Patrick O’Shaughnessy:
So, are you truly afraid? Or is it just a kind of uncertainty that’s hard to grasp?
Dylan Patel:
Uncertainty definitely exists. But I do feel it will bring some fear. The question is: how will society reorganize itself?
When you live in a world where “ability to realize something” is no longer so important, what becomes important? The ability to choose the right ideas for AI, to sell those ideas, or to sell what AI produces, and to raise capital for those directions. These will become the new priorities.
This also circles back to the earlier question: owning the latest models is always crucial. Who can access the latest models?
Anthropic has a project—I know it’s not called Earwig, but I like to call it Earwig, as a playful nickname to tease Anthropic. They only provide Metis to certain companies for cybersecurity scenarios. I think this kind of thing will continue: model deployment will become more narrow, less for the general public.
I know OpenAI, Anthropic, and others say they want everyone to have powerful AI. But AI is very expensive. Who will pay for the infrastructure worth trillions? Those with money, and who can build useful things with AI.
And you don’t want others to distill your model, so you won’t release it widely. You’ll give it to fewer and fewer clients. And those clients will start competing for tokens.
Unless Anthropic raises prices significantly. They could double Opus’s price, and I’d still pay. I dare say most users would keep paying. But I think that even then, it wouldn’t solve their massive capacity issues.
So the question becomes: where does this cycle end? As token usage and the value they generate become more concentrated in a few companies, what will happen?
I don’t have Metis now. But who does? Top banks do. They might only use it in cybersecurity now, but I can imagine a world: because I have an enterprise contract with Anthropic, and they like me, they might give us earlier access or higher rate limits. I’d love that.
And then my competitors without such access? I could beat them.
Or it could be another scenario. For example, Ken Griffin at Citadel, with strong connections and deep pockets, might negotiate with OpenAI or Anthropic: “I’ll buy $10 billion worth of tokens annually. Every time you release a new model, I’ll buy the first $10 billion, and others can use what’s left.”
What would happen? He could dominate the market.
That’s just one example. It could also happen in cybersecurity—Anthropic might worry models could make hacking easier. Or in info services like me, using it to outcompete others.
I think the impact range is huge. We don’t really know what these models can do. Anthropic doesn’t know, OpenAI doesn’t know—no one knows. Ultimately, it’s up to end users to discover: where can these tokens be used? What can they build? What can they imagine?
This will greatly boost productivity and has positive aspects for humanity. But the concern is: how will resources and usage rights concentrate?
Robots Will Meet the Next Wave of Demand
Patrick O’Shaughnessy:
Right now, the tokens consumed by robots—or robotics—are almost negligible compared to other fields. What’s your view? Could it become a second demand curve? Every day, within a mile radius, new robot startups emerge trying to do interesting things.
Dylan Patel:
There’s a concept called “software-only singularity.” That is, the world might first see an AI singularity that only happens in software. But the problem is, most of the world is still physical. Eventually, the world will organize around hardware, not just software. So I think the “software singularity” will be a short phase, not the end. Because we will still enter the physical world.
Once software becomes very easy, what’s the hardest part of robotics? Programming, microcontrollers, actuators, and controlling all these. These are still very difficult today.
AI models have an interesting trait: their learning efficiency is quite low. They only learn because we give them massive data, and in some areas, they surpass humans.
But current robot models, like VLA (Vision-Language-Action), are very popular but might not be the ultimate solution. They’re data-inefficient, and we can’t scale robot data fast enough.
In the future, there will be ways to pre-train robot models at scale—just like humans see all kinds of data throughout life. Humans are very sample-efficient: one or two examples, and we learn.
If this ability applies to robots, the situation changes entirely. Once a software singularity occurs, implementation becomes very cheap, anyone can start building these models, and real useful robots can be created.
So I believe in the next 6 to 18 months, we’ll see real breakthroughs in robotics. The key is few-shot learning. There will be a pre-trained robot model, and you just show it a few examples, and it can do the task.
Tell it to stack these two objects, and it will. Tell it “this can stay balanced,” and it will try and succeed. Believe me, I’ve knocked things over many times myself.
So I think robots will develop few-shot learning capabilities.
Many companies are already working on robots—some for advertising, some for simple tasks. But the future will be very segmented: robots just for folding clothes, or more specialized ones for cleaning blackboards. They might be rental services or model packages you download onto standard robots, and they perform tasks for pay-per-use.
In any case, the physical goods industry will accelerate greatly, creating deflationary effects. And this will continue to drive token demand to grow wildly. I personally don’t think token demand will slow down.
Patrick O’Shaughnessy:
From Metis’s results and how it’s built, have you learned anything new about the world? For example, if we break down the components of scaling laws, like pretraining…
Dylan Patel:
It’s a much larger model than before, trained on roughly 100k Blackwell GPUs, equivalent to tens of thousands of chips in the previous generation. Of course, TPU and Triton have different release rhythms, so it’s not a perfect comparison. But overall, yes, Metis is a significantly bigger model. It proves that scaling laws still hold: invest more compute into models, and they get better.
And throughout this process, we’re also continuously improving computational efficiency. All R&D compute in labs eventually translates into one thing: to reach a certain capability level, the cost six months ago was X, now it’s a fraction—one percent or even a thousandth. For example, reaching GPT-4 level with DeepSeek costs about one-sixth of GPT-4. And costs keep falling.
Of course, nobody really cares about GPT-4 anymore. What they want are frontier models, because only they can generate truly valuable economic outputs. But GPT-4 models are still useful for smaller scenarios.
So the real demand driver isn’t old capabilities getting cheaper, but new use cases constantly emerging. Today, you’re using Opus 4.6 or 4.7. A year from now, to get the same quality, your cost might be only $70,000—100 times cheaper.
But that’s not the point. Because then, I’ll be using a stronger model to do more valuable things.
Anthropic’s Metis is more expensive as a model but consumes far fewer tokens to do the same. So in most tasks, it’s actually cheaper than Opus 4.6.
Dylan Patel:
Because it’s much more efficient. Even though each token is smarter and more expensive, it takes fewer tokens to complete tasks.
Patrick O’Shaughnessy:
When I last saw you, Metis had just been released, or the model card just came out. You said it was so powerful it made you a bit afraid. What did you mean?
Dylan Patel:
Anthropic’s goal for 2025, set as early as 2024, was to have a model with the capabilities of an L4 software engineer by the end of 2025. They’ve basically achieved that with Opus 4.6.
But what they didn’t say is that Metis goes a step further: it’s more like an L6 engineer. L4 is a relatively junior engineer, while L6 is quite experienced.
I remember Anthropic saying it had been available internally since February. That’s just two months to go from L4 to L6. What happens next?
When you think about model progress, you realize it’s actually accelerating. Anthropic’s release pace is compressing, and so is OpenAI’s. Why? Because to make better models, you need a few things.
First, you need massive compute. It’s very expensive and has its own timeline. We track these, and they are indeed growing, but in the short term, it’s mostly fixed—contracts are signed, and capacity is set. There might be some delays or adjustments, but overall, it’s stable.
Second, you need top-tier researchers. Companies are paying tens of millions for these.
Third, you need implementation capability. Historically, this has been very hard. If I have an idea, I still need to implement it, which is tough. But now, ideas are everywhere, and implementation is very easy. It’s expensive but very easy.
So the question becomes: how does a person decide which ideas to implement? When implementation becomes so easy, you can realize more ideas and run faster on this treadmill.
This can happen in AI research, shortening the release cycle from six months to two months. It can also happen in other fields. For example, I want to model every power plant and transmission line in the U.S., run regressions, analyze regional supply and demand—now I can do that too.
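The grid analysis he describes can be sketched in a few lines. Everything below is a hypothetical stand-in (synthetic data, an assumed 80% usable-capacity factor, a plain least-squares model), not a real dataset or his actual workflow:

```python
# Hypothetical sketch of the kind of analysis described: regress regional
# electricity demand on plant capacity and temperature. The data is synthetic;
# a real version would ingest actual plant and transmission records.
import numpy as np

rng = np.random.default_rng(0)
n = 200
capacity_gw = rng.uniform(5, 50, n)    # installed capacity per region (GW)
avg_temp_c = rng.uniform(-5, 35, n)    # average temperature per region (C)
# Synthetic demand: scales with capacity, rises with heat, plus noise.
demand_gw = 0.6 * capacity_gw + 0.2 * avg_temp_c + rng.normal(0, 1.0, n)

# Ordinary least squares: demand ~ capacity + temperature + intercept.
X = np.column_stack([capacity_gw, avg_temp_c, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, demand_gw, rcond=None)
print(f"capacity coef ~ {coef[0]:.2f}, temperature coef ~ {coef[1]:.2f}")

# Regions whose demand outruns usable local capacity are net importers
# (80% usable capacity is an assumed factor for the sketch).
shortfall = demand_gw - 0.8 * capacity_gw
print(f"{(shortfall > 0).sum()} of {n} regions are supply-constrained")
```

The point is not the model itself but that the regression is now the cheap part; choosing which question to pour tokens into is what remains scarce.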
Ideas are cheap. The key is: which ideas matter? Which are worth investing in, buying tokens for, and realizing? Because the ability to implement is already there. That’s the most critical change.
And the cost of implementation continues to fall. We haven’t even fully digested Metis yet: Opus 4.7 was just released a few hours ago, and our team is already very excited.
What will this bring to the world? I think it will reorder how the economy operates.
In the past, execution was very important because it was hard; ideas were cheap. Now, ideas are not only cheap but abundant, and execution has also become very easy. So, only the best ideas—those that can prove, even when implementation is extremely cheap, that they are worth spending on—will survive.
Patrick O’Shaughnessy:
So, are you truly afraid? Or is it just a kind of uncertainty that’s hard to grasp?
Dylan Patel:
Uncertainty definitely exists. But I do feel it will bring some fear. The question is: how will society reorganize itself?
When you live in a world where “ability to realize something” is no longer so important, what becomes important? The ability to choose the right ideas for AI, to sell those ideas, or to sell what AI produces, and to raise capital for those directions. These will become the new priorities.
This also circles back to the earlier question: owning the latest models is always crucial. Who can access the latest models?
Anthropic has a project—I know it’s not called Earwig, but I like to call it Earwig, as a playful nickname to tease Anthropic. They only provide Metis to certain companies for cybersecurity scenarios. I think this kind of thing will continue: model deployment will become more narrow, less for the general public.
I know OpenAI, Anthropic, and others say they want everyone to have powerful AI. But AI is very expensive. Who will pay for the infrastructure worth trillions? Those with money, and who can build useful things with AI.
And you don’t want others to distill your model, so you won’t release it widely. You’ll give it to fewer and fewer clients. And those clients will start competing for tokens.
Unless Anthropic raises prices significantly. They could double Opus’s price, and I’d still pay. I dare say most users would keep paying. But I think that even then, it wouldn’t solve their massive capacity issues.
So the cycle becomes: where does it end? As token usage and the value they generate become more concentrated in a few companies, what will happen?
I don’t have Metis now. But who does? Top banks do. They might only use it in cybersecurity now, but I can imagine a world: because I have an enterprise contract with Anthropic, and they like me, they might give us earlier access or higher rate limits. I’d love that.
And then my competitors without such access? I could beat them.
Or it could be another scenario. For example, Ken Griffin at Citadel, with strong connections and deep pockets, might negotiate with OpenAI or Anthropic: “I’ll buy $10 billion worth of tokens annually. Every time you release a new model, I’ll buy the first $10 billion, and others can use what’s left.”
What would happen? He could dominate the market.
That’s just one example. It could also happen in cybersecurity, where Anthropic might worry that models make hacking easier. Or in information services like mine, where I’d use it to outcompete others.
I think the impact range is huge. We don’t really know what these models can do. Anthropic doesn’t know, OpenAI doesn’t know—no one knows. Ultimately, it’s up to end users to discover: where can these tokens be used? What