How can the AI application layer find sustainable monetization space?

Question

Consumer-grade AI subscription models are caught in a dilemma: token costs keep rising, but users’ willingness to pay is hard to grow at the same pace. This structural tension makes the business model extremely fragile. More sustainable AI commercialization is likely to happen in scenarios involving high-value users, deep workflow lock-in, and linkage with real business outcomes. This is the economic premise for the emergence of the vertical AI track. In issue 9 of “Agentic Economy,” we break down three cases—Harvey, Farther, and Adyen—to see how they build competitive advantages in a context where foundation models are being commoditized. We also directly tackle two harder questions: when token subsidies fade, do these advantages still hold? And what does it mean when OpenAI and Anthropic start assigning engineering teams to enterprise clients?

Once any technology becomes sufficiently widespread, it stops generating premiums. General-purpose large models are no exception. As the marginal effects of parameter competition diminish, the once-scarce intelligence is rapidly evolving into commoditized public infrastructure. This is causing the business models of lightweight applications—ones that rely solely on calling third-party APIs and lack the ability to embed into specific scenarios—to collapse at an accelerating pace.

But commoditization is never the end. Each time an emerging technology becomes mainstream, the value opportunity shifts from those who possess the technology to those who can truly deploy it in real-world applications.

This pattern is now driving the rapid rise of vertical AI platforms.

Vertical AI platforms are AI applications that go deep into a specific industry, wrap the capabilities of general-purpose foundation models, and reorganize business processes around particular workflows. By building proprietary evaluation systems (evals) and multi-agent architectures, these platforms demote the underlying foundation model to a compute component that can be swapped out at any time, while the industry’s most valuable workflow assets stay locked inside the platform. Their core idea is to remove friction from business processes and turn complex professional work into system assets that accumulate over time.

To understand how this new track comes into being, we must first clarify a basic fact: what enterprises and professionals pay for is never the parameter scale of the underlying model, but its ability to deeply embed into internal workflows, form a data closed loop, and drive actual revenue.

For this reason, highly paid legal experts, wealth advisors serving high-net-worth clients, and top merchants with enormous transaction throughput are becoming the focus of competition among next-generation vertical AI platforms. These users hold budget authority, bear compliance responsibility, and are measured on clear business outcomes. Whether the platform helps a lawyer who bills $1,000 per hour save ten hours (about $10,000 of billable time) or helps a wealth advisor grow assets under management and improve after-tax returns, the business value it creates can be quantified directly. This tight coupling to high-value producers is the economic foundation that makes vertical AI viable.

Currently, this exploration is mainly unfolding along two directions.

First, AI is being used to restructure professional workflows, sharply reducing operating costs that previously only large institutions could shoulder. In legal and wealth management, high-barrier work such as compliance, risk control, and professional delivery is being handled systematically by technology platforms, allowing professionals to complete more work with fewer resources.

Second, transaction infrastructure is being rebuilt, reshaping the connection between merchants and agents. In agentic commerce, although AI labs control intent capture and front-end interaction, the final transaction conversion still happens in merchant-side infrastructure. Adyen Agentic acts as a universal translator: merchants integrate once and can participate across various AI shopping platforms and protocols, without rebuilding their systems for each new protocol.

The three cases enter from different scenarios, but each takes core industry capabilities that were previously hard to standardize and, through systematic accumulation, turns them into reusable assets. Harvey accumulates legal judgment and industry knowledge; Farther accumulates advisors’ client relationships and tax optimization capabilities; Adyen accumulates merchants’ product data, protocol adaptation, and settlement capabilities.

This is what Microsoft CEO Satya Nadella calls “Token Capital”: the long-term value of AI does not come only from executing single tasks, but from structurally keeping human judgment, knowledge, and workflow structures inside the system—forming assets that can self-iterate through continuous interaction.

$190M ARR and $460M in compute fees: the scale game Harvey can’t sustain

Harvey is one of the most highly valued and fastest-growing companies in the current vertical AI wave, and it illustrates both the potential and the dilemmas of this logic more clearly than any other case.

This legal platform does not own a general-purpose model; it relies instead on deep customization of law firms’ core workflows. In just five months (from August 2025 to January 2026), it grew ARR from $100 million to $190 million and reached a valuation of $11 billion. This shows that vertical platforms do not need to win the foundation-model race: as long as they truly understand industry tasks and rebuild the daily work of high-value users, they can establish strong commercialization capabilities.

But behind the impressive financial figures is an ever-expanding compute bill.

Public data shows that Harvey’s monthly token usage has grown from roughly 1 trillion initially to 12–13 trillion. At an assumed $3 per million tokens, that volume implies an annualized theoretical inference cost of roughly $432 million to $468 million. Even if the actual bill is currently held down by discounts from the large model providers and by techniques such as prompt caching, the cost structure is controlled externally, so once subsidies tighten the bill will rebound immediately. Under this pressure, ARR growth is extremely hard to convert into real cash flow, and scale itself becomes a constant risk rather than an advantage.
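The estimate above is simple arithmetic. The sketch below reproduces it under the text’s own assumption of a flat $3 per million tokens; the function name and the single blended rate are illustrative only, and a real bill would depend on the input/output mix, caching, and negotiated discounts.

```python
def annualized_inference_cost(tokens_per_month: float, usd_per_million_tokens: float) -> float:
    """Annualized cost in USD for a monthly token volume at a flat per-token rate."""
    monthly_cost = tokens_per_month / 1_000_000 * usd_per_million_tokens
    return monthly_cost * 12


TRILLION = 1_000_000_000_000
PRICE = 3.0  # USD per million tokens: the flat rate assumed in the text

low = annualized_inference_cost(12 * TRILLION, PRICE)   # 12 trillion tokens/month
high = annualized_inference_cost(13 * TRILLION, PRICE)  # 13 trillion tokens/month

print(f"${low / 1e6:.0f}M to ${high / 1e6:.0f}M per year")
```

The upper end of the range corresponds to 13 trillion tokens per month, which is where the $468 million figure comes from.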

Behind this is an unavoidable cost paradox for AI-native applications: the more popular the product becomes, the higher the inference costs. Traditional SaaS has marginal costs that are nearly zero, but in legal scenarios with long context and high inference density, every complex task consumes real compute. Therefore, building models in-house is no longer just a technical option—it becomes an inevitable choice driven by cost constraints.

At present, Harvey is pushing forward a post-training strategy for proprietary models, working deeply with Applied Compute to conduct specialized fine-tuning of open-source foundation models (such as GLM-5.1) for the legal industry. According to the latest technical disclosures from both sides, in Harvey’s self-built Legal Agent Benchmark (LAB), the rubric pass rate for the post-trained proprietary model improved from 0.853 to 0.913, surpassing GPT-5.5 xhigh and approaching Opus 4.8 Max.

Cost compression has also been significant. By switching the grader model used for evaluation from frontier models to GPT-5 Mini, and by batching multiple evaluation criteria together, Harvey reduced evaluation costs by a factor of 40 to 100. This means it can keep iterating on the evaluation cycle at much lower cost. The private evaluation system has itself become an accumulating competitive asset.
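One way to see why the two levers described above (a cheaper grader model and batched criteria) can compound into a large factor is a rough cost model. Everything in this sketch is an assumption made for illustration: the token counts, the two prices, and the simplification that only input tokens are billed. None of it is Harvey’s actual data or any provider’s price list.

```python
def grading_cost(response_tokens: int, criteria: int, tokens_per_criterion: int,
                 usd_per_million: float, batched: bool) -> float:
    """Input-token cost in USD of grading one response against a rubric."""
    if batched:
        # One call: the response is sent once with every criterion appended.
        total_tokens = response_tokens + criteria * tokens_per_criterion
    else:
        # One call per criterion: the full response is re-sent each time.
        total_tokens = criteria * (response_tokens + tokens_per_criterion)
    return total_tokens / 1_000_000 * usd_per_million


# Hypothetical inputs, chosen only to illustrate the mechanism.
RESPONSE_TOKENS = 4_000
CRITERIA = 10
TOKENS_PER_CRITERION = 50
FRONTIER_PRICE = 2.00  # USD per million tokens (placeholder)
SMALL_PRICE = 0.25     # USD per million tokens (placeholder)

before = grading_cost(RESPONSE_TOKENS, CRITERIA, TOKENS_PER_CRITERION, FRONTIER_PRICE, batched=False)
after = grading_cost(RESPONSE_TOKENS, CRITERIA, TOKENS_PER_CRITERION, SMALL_PRICE, batched=True)

print(f"per-criterion, frontier grader: ${before:.6f}")
print(f"batched, small grader:          ${after:.6f}")
print(f"reduction factor:               {before / after:.1f}x")
```

In this toy setup, batching alone cuts the tokens sent by about 9x and the cheaper grader cuts the price per token by 8x. The savings multiply rather than add, which is how a reduction of the order the article reports becomes plausible.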

Even more noteworthy is the change happening behind the performance improvements. Output completeness, numeric accuracy, document provenance, and hallucination suppression—all of these key behaviors show measurable gains. During training, the number of times the model calls tools has been decreasing, but each call is more precise, and total token consumption has fallen accordingly. In other words, what the model has learned is how to work effectively in a specific tool environment; and this behavior pattern accumulated through a large number of legal tasks is harder to replicate externally than the model’s parameters themselves.

Harvey’s case shows that the competitive foundation of vertical AI platforms is extending deeper. Workflow design and customer relationships are certainly important, but the post-training capabilities and control over open-source models, proprietary evaluation and data-generation capabilities, multi-agent architectures, and inference-cost optimization are becoming new sources of differentiation.

Farther’s “de-organization”: breaking the bond between advisors and traditional large brokerages

If Harvey compresses delivery costs within large professional service institutions, the wealth management platform Farther demonstrates how to help core talent escape the gravitational pull of traditional giants.

Farther is a technology platform for independent RIAs that specifically recruits wealth advisors who have left giants such as Morgan Stanley, Merrill Lynch, UBS, and Goldman Sachs. Under the traditional full-service brokerage system, advisors often accept low commission splits and carry heavy back-office administrative burdens. Farther recruits advisors directly onto its platform and consolidates the back-office capabilities that large institutions previously monopolized. In addition to higher commission splits, it embeds tax-loss harvesting, direct indexing, access to private markets, compliance review, and document management. According to the company’s own data, its tax-intelligence algorithms alone can improve clients’ after-tax investment returns by 1% to 3%.

This model has already received strong validation from the capital markets. In May 2026, Farther completed a $150 million Series D round led by General Atlantic, officially joining the unicorn club. Currently, its assets under management have exceeded $23 billion, including a star private banking team recently poached from Goldman Sachs’ private wealth division, managing $1.5 billion in assets. The steady influx of independent wealth advisors indicates that the system lock-in that traditional large brokerages depended on is failing; choosing to practice independently without institutional backing is no longer a fringe option for just a few people.

Harvey focuses on improving professional delivery efficiency within law firms; Farther, by contrast, builds an independent platform from scratch, allowing advisors to obtain back-office capabilities that are equal to or even stronger than those offered by traditional large brokerages without having to rely on them. Their entry points differ, but both are redefining how professional services are produced. Supported by this platform, complex investment tools—such as direct indexing and access to private markets—that were previously limited to the ultra-high-net-worth (UHNW) departments of large institutions can now be deployed easily by independent advisors, greatly expanding the business scope of individual professionals.

Traditional SaaS can only handle shallow process automation such as record-keeping and storage; it cannot share the burden of complex execution such as decision-making and coordination. An AI-native system based on a multi-agent architecture naturally fits the ambiguous gray zone between administrative execution and non-standard logic judgment—such as compliance review, personalized document drafting, and asset allocation recommendations. These business tasks that previously required an entire back-office team to collaborate on are now being absorbed quickly by the system.

The underestimated merchant side: the transaction closed loop of Agentic Commerce

Discussions about agentic commerce are still running hot, but public attention is currently almost monopolized by the consumer side—how AI assistants replace users in searching for products, comparing prices, and automatically placing orders. By contrast, the actual feedback from the merchant side is far more sober.

Walmart’s conversion rate on its AI-native Instant Checkout is currently only one-third that of the traditional click-through flow, and the share of merchants that have fully integrated Shopify’s AI checkout system in 2026 is still limited. There is a clear gap between the demand AI activates and the transactions actually completed.

The gap exists because agent transactions are a system engineering challenge. Understanding user intent is only the first step; to convert demand into revenue, you also need full-chain support including inventory verification, tax calculation, anti-fraud risk control, fulfillment, and funds settlement—and these capabilities are still tightly locked in merchants’ local systems. At the same time, multiple agent payment protocols such as UCP, ACP, AP2, Agent Pay, and Visa Tokenization coexist but are not compatible with each other. Merchants have neither the motivation to adapt to each individually nor the ability to bear the cost of technical fragmentation.

Adyen therefore launched Adyen Agentic. Through three layers of modular APIs, it covers different parts of the transaction chain:

Agentic Feed: standardizes and distributes merchants’ product catalogs, pricing, and real-time inventory data to mainstream AI platforms;
Agentic Cart: connects merchants’ existing checkout, tax, fulfillment, and order management systems to the conversational commerce infrastructure;
Agentic Payments: handles identity verification, network risk control, and multi-channel funds settlement in agent-led transactions.

With a single integration, merchants can have Adyen translate across different AI agent platforms and protocols—without needing to rebuild their underlying systems whenever the market landscape changes.
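The "integrate once, translate to many protocols" idea is essentially an adapter pattern. The sketch below is a generic illustration, not Adyen’s API: the `Product` record, the two protocol names, and both payload shapes are hypothetical and do not correspond to UCP, ACP, AP2, or any real schema.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Product:
    """The merchant's single canonical catalog record (hypothetical shape)."""
    sku: str
    title: str
    price_cents: int
    currency: str
    in_stock: int


def to_protocol_a(p: Product) -> dict:
    # Hypothetical protocol A: decimal price string and a boolean availability flag.
    return {"id": p.sku, "name": p.title,
            "price": f"{p.price_cents / 100:.2f} {p.currency}",
            "available": p.in_stock > 0}


def to_protocol_b(p: Product) -> dict:
    # Hypothetical protocol B: integer minor units and an explicit stock count.
    return {"sku": p.sku, "title": p.title,
            "amount": {"value": p.price_cents, "currency": p.currency},
            "stock": p.in_stock}


# Supporting a new protocol means registering one more adapter here;
# the merchant's own catalog and systems stay untouched.
ADAPTERS: Dict[str, Callable[[Product], dict]] = {
    "protocol_a": to_protocol_a,
    "protocol_b": to_protocol_b,
}


def publish(catalog: List[Product], protocols: List[str]) -> Dict[str, List[dict]]:
    """Translate one canonical catalog into a feed per requested protocol."""
    return {name: [ADAPTERS[name](p) for p in catalog] for name in protocols}


catalog = [Product("SKU-1", "Trail shoes", 12999, "EUR", 4)]
feeds = publish(catalog, ["protocol_a", "protocol_b"])
print(feeds)
```

The design point is that protocol churn is absorbed in the adapter table rather than in each merchant’s checkout, tax, and fulfillment systems.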

In the agentic commerce ecosystem, the front-end AI labs and conversational interfaces may intercept user intent and traffic, but substantive value conversion, transaction finalization, and the capital closed loop still rely heavily on the merchant-side infrastructure. Compared with the fiercely competitive front-end entry points, the merchant-side systematized integration service is more likely to solidify into stable, chargeable underlying infrastructure.

Hidden risks for vertical platforms: model-makers’ penetration and token-cost restructuring

As the market becomes saturated with low-priced general tools, the business logic of large-model platforms maintained through low-unit-price subscriptions is gradually showing fragility. When generalized functions—such as summarizing web pages and drafting emails—can be easily replaced, vertical platforms must consolidate around high-value customers who care more about business outcomes. But the further you go into high-value industries, the more complex the competitive environment facing the application layer becomes.

One source of pressure is model makers actively expanding their business boundaries. OpenAI and Anthropic are no longer content to act as API wholesalers; they now enter clients’ core operations through forward-deployed engineering (FDE) teams. In April 2026, OpenAI partnered with Customers Bank, which has $26 billion in assets, embedding engineers in the bank to build agents for loan approval and account opening on the bank’s own data. Anthropic partnered with the financial IT giant FIS, embedding its FDE team in FIS’s internal systems to develop anti-money-laundering tools; by reaching many banks through FIS’s service channels, it touches the deepest layers of banking operations.

This on-site collaboration model shows that large model makers are using infrastructure channels to learn and replicate internal business processes in high-barrier industries directly.

Another source of pressure is unsustainable token pricing. Most frontier foundation models are currently sold at subsidized prices that are essentially loss-making. Enterprise multi-agent architectures call models frequently, so once the large providers’ subsidies recede, vertical platforms that rely entirely on external frontier APIs will no longer be able to cover their compute bills.

As inference demand rises, this pressure will be amplified further. When hundreds of agents run around the clock in the background and interact at high frequency, compute demand climbs steeply. Yet the underlying hardware supply chain cannot keep up quickly, constrained by physical limits such as the very long manufacturing cycles of equipment like ASML lithography machines. For the vast majority of daily business operations, routing every task to a frontier model is a severe misallocation of resources.

This is precisely why Harvey must cooperate with Applied Compute: to build specialized test sets, proprietary evaluation systems, and human annotation pipelines. Vertical platforms are not just building products; they are conducting high-difficulty cost engineering. This means precisely calculating token consumption for each task, clarifying which intermediate steps can be offloaded to low-cost open-source small models, which critical decisions must call expensive flagship models, and where human review should be inserted.
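The cost engineering described above can be sketched as a per-step routing decision. This is a minimal illustration under stated assumptions: the tier names, the prices, the `critical` flag, and the example task are all hypothetical, and the cost of the human reviewer’s time is deliberately left out of the model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Placeholder prices in USD per million tokens; not any provider's real pricing.
USD_PER_MILLION = {"small_open_model": 0.2, "flagship_model": 10.0}


@dataclass(frozen=True)
class Step:
    name: str
    tokens: int     # estimated tokens the step consumes
    critical: bool  # True if an error here carries legal or financial risk


def route(step: Step) -> Tuple[str, bool]:
    """Return (model tier, needs human review) for one step."""
    if step.critical:
        return "flagship_model", True
    return "small_open_model", False


def task_cost(steps: List[Step], force_tier: Optional[str] = None) -> float:
    """Model cost of a task; force_tier sends every step to one tier for comparison."""
    total = 0.0
    for step in steps:
        tier = force_tier or route(step)[0]
        total += step.tokens / 1_000_000 * USD_PER_MILLION[tier]
    return total


# Hypothetical contract-review task broken into intermediate steps.
task = [
    Step("extract_clauses", 200_000, critical=False),
    Step("summarize_background", 50_000, critical=False),
    Step("draft_risk_opinion", 30_000, critical=True),
]

routed = task_cost(task)
all_flagship = task_cost(task, force_tier="flagship_model")
reviews = [s.name for s in task if route(s)[1]]

print(f"routed: ${routed:.2f}, all-flagship: ${all_flagship:.2f}, human review on: {reviews}")
```

Even in this toy example most tokens sit in the low-stakes steps, so sending only the critical step to the expensive tier removes most of the bill while keeping human review where it matters.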

Against this backdrop, it is already hard for a single layer of polished workflow interfaces to provide enduring competitive advantages. Pushing backend cost engineering to the extreme is the key proposition for vertical AI platforms to maintain long-term survival.

Conclusion: market scarcity returns to both ends of the industrial chain

When general-purpose large models become as readily available as water, electricity, and gas, the value of the AI application layer begins to concentrate at the very top and very bottom of the industrial chain.

At this stage, the industry’s scarcity has not disappeared. The very top remains the core that algorithms cannot standardize: customer trust, complex non-standard judgment, and the unstructured knowledge embedded in practitioners’ experience. The very bottom is the merchant network, which carries product data, compliance pathways, and payment channels. The real significance of vertical platforms is to convert the professional experience of these high-value entities into sustainable Token Capital.

This also means that competition in the application layer is returning to pragmatism. The “scale narrative” that powered the software industry’s rapid growth over the past decade is starting to fail under the hard constraints of compute costs and physical supply-chain limits.

In the new cycle, the survival of application-layer companies depends on refined cost and compute arbitrage. As the model price war fades and computing resources become constrained, application platforms must find the optimal solution between cost and performance, rather than continuing to rely on capital injections.

Although large model makers have more compute resources and front-line engineering teams, for agile vertical platforms and independent professionals, the most unique competitive advantage still lies in converting accumulated professional tacit experience into system assets that foundation model manufacturers cannot replicate. Avoiding broad traffic competition and prioritizing the business closed loop of high-value production entities—that is the logic by which vertical AI can survive long-term in the era of commoditized large models.
