Goldman Sachs In-Depth Report: The Coming Turning Point - Decoding the AI Agent Economy


Agentic AI is shifting the artificial intelligence industry from a cost narrative to a profit narrative. Goldman Sachs believes that token consumption is about to leap, and that because underlying compute costs are declining faster than token prices, the profit-margin inflection point for hyperscale cloud providers and large-model providers may arrive within the next 3 to 12 months.

According to Chasing Wind Trading Platform, Goldman Sachs released a report on May 5 stating that it expects consumer and enterprise AI agents combined to drive global token consumption to roughly 24 times 2026 levels by 2030, reaching approximately 120 quintillion tokens per month; at peak enterprise-agent adoption in 2040, the multiple expands further to 55 times.

Meanwhile, Goldman Sachs’s inferred price and cost curves show that mainstream large model token prices have stabilized or even slightly rebounded after a previous annual decline of about 40%, while chip-driven per-token compute costs from Nvidia, AMD, Google TPU, and Trainium continue to decline at a rate of 60% to 70% annually, creating a widening gap that opens profit margins for the industry. Large-scale capital expenditure on AI infrastructure may become more sustainable due to improved profit margins.

Token Economics Inflection Point: Costs Decline Faster Than Prices, Profit Margins Are Opening

The core argument of Goldman Sachs’s report is that the AI industry is transitioning from a phase where “uncertain inference economics may dilute profits” to a new phase where “incremental tokens generate attractive marginal profits.”

In the first phase of the AI cycle, investors generally viewed compute and tokens as cost drivers—more usage meant more inference load, more accelerators, more electricity, and higher capital expenditure. But Goldman Sachs’s inferred price and cost curves indicate this logic is changing.

Although mainstream large model token prices have fallen significantly, they have now stabilized or even rebounded in some cases; meanwhile, the full cost per token for Nvidia, Google TPU (Broadcom), AMD, and Trainium (Marvell) continues to decline rapidly and persistently. If token prices remain stable above token costs, then increasing adoption of agentic AI will lead to positive profit expansion rather than just revenue growth.
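The widening gap can be sketched numerically. In the toy projection below, the decline rates (price previously falling ~40% per year then flattening, cost falling 60-70% per year) come from the report; the starting anchors of $2 and $1 per million tokens are illustrative assumptions, not report figures:

```python
# Toy projection of per-token margin under the report's decline rates.
# The $2/$1 per-million-token starting anchors are assumptions; only the
# decline rates (price flat, cost -65%/yr midpoint) come from the report.
price_per_m = 2.00   # $ per million tokens, assumed flat going forward
cost_per_m = 1.00    # $ per million tokens, assumed starting point
cost_decline = 0.65  # midpoint of the report's 60-70% annual decline

for year in range(5):
    margin = price_per_m - cost_per_m
    print(f"year {year}: cost ${cost_per_m:.3f}/M tokens, margin ${margin:.3f}/M")
    cost_per_m *= (1 - cost_decline)
```

Under these assumptions the per-token margin approaches the full token price within a few years, which is the mechanical core of the "costs fall faster than prices" argument.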

Goldman Sachs further points out that agentic AI could form a self-reinforcing economic flywheel: lower compute costs per token lead to richer, more complex agents; these agents, with longer context windows, more loops, more validation, and continuous monitoring, consume more tokens; higher utilization improves the economics of AI infrastructure, supporting providers to continuously invest in model quality and distribution capabilities. Goldman Sachs believes this flywheel is fundamentally different from the mainstream narrative that “AI usage will incur unsustainable cost burdens.”

However, Goldman Sachs also warns of risks: not all AI workloads can guarantee a positive profit inflection point. For commoditized pure text chatbots, competition may still force token prices to decline faster than compute costs.

Consumer-Side Agents: From Fragmented Conversations to “Resident” Assistants, Token Consumption to Increase 12x

Goldman Sachs estimates that by 2030, consumer AI agents will increase global token consumption 12 times, adding about 60 quintillion tokens per month.

The report divides consumer agents into two categories: one is “on-demand” agents, such as OpenAI Operator, Claude Code, and browser-based agents, where users initiate tasks that the agent autonomously plans, executes, and returns results; the other is “resident” agents, such as continuous background email monitoring, scheduling, or digital life assistants. Goldman Sachs believes the largest surge in token consumption will occur when agents shift from user-initiated tasks to continuous background operation—monitoring context and acting proactively when needed.

Based on simulated data, a typical LLM chatbot consumes about 1,000 tokens per session, embedded Copilot consumes over 5,000 tokens daily, while resident agents can consume over 100,000 tokens per day.
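The intensity gap between the three tiers can be made concrete. The token figures below are the report's; the one-chatbot-session-per-day baseline is our assumption for comparison:

```python
# Daily token consumption per user across the report's three usage tiers.
# Token figures are from the report; assuming one chatbot session per day
# (our assumption) to put all three on a per-day basis.
tiers = {
    "chatbot (1 session/day)": 1_000,
    "embedded Copilot": 5_000,
    "resident agent": 100_000,
}
baseline = tiers["chatbot (1 session/day)"]
for name, tokens in tiers.items():
    print(f"{name}: {tokens:,} tokens/day ({tokens / baseline:.0f}x chatbot)")
```

On this basis a resident agent is roughly 100x as token-intensive as a daily chatbot user, which is why the shift to background operation dominates the consumer forecast.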

Goldman Sachs projects that by 2030, daily AI query volume will increase from about 5 billion in 2025 to approximately 23 billion, with up to 30% flowing into agents in search, shopping, travel, email, and personal productivity domains. Meanwhile, the share of traditional search engines in query volume is expected to decline from 68% in 2025 to 36% in 2030, while native LLM applications will grow from 12% to 31%.
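Worked through, these query figures contain a wrinkle worth noting: search's share nearly halves, yet absolute search volume still grows, because the overall pie more than quadruples (a back-of-envelope calculation using the report's numbers; the 30% agent share is the report's stated upper bound):

```python
# Back-of-envelope on the report's query-volume projections.
daily_queries_2025 = 5e9    # report figure
daily_queries_2030 = 23e9   # report figure
agent_share_2030 = 0.30     # report's upper bound for agent-bound queries

agent_queries_2030 = daily_queries_2030 * agent_share_2030
print(f"agent-bound queries/day in 2030: up to {agent_queries_2030 / 1e9:.1f}B")

# Search share falls 68% -> 36%, but absolute search volume still grows:
search_2025 = daily_queries_2025 * 0.68
search_2030 = daily_queries_2030 * 0.36
print(f"search queries/day: {search_2025 / 1e9:.1f}B (2025) -> "
      f"{search_2030 / 1e9:.2f}B (2030)")
```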

Enterprise-Side Agents: Workflow Complexity Drives Token Intensity, Consumption May Reach 55x by 2040

Goldman Sachs expects enterprise AI agents to become the largest token multipliers, driving global token consumption to increase 24 times by 2030, and further up to 55 times at peak adoption in 2040, with enterprise workloads accounting for over 70% of total global token usage by then.

Enterprise agents are more token-intensive than consumer agents because their workflows require more complex and precise operations—monitoring tasks, retrieving context, reasoning about anomalies, validating outputs, updating systems, and reporting issues continuously throughout the workday. Additionally, enterprise agents often involve heavier multimodal inputs (voice, images, documents, screen activity, application data, logs, and structured system records), significantly increasing token intensity.

Goldman Sachs modeled token consumption for different professions using simulated agents.

Results show that programming agents consume about 7 million tokens daily, with API costs around $13 per day, far below human costs, explaining why software development has the fastest adoption rate; call center agents consume about 2 million tokens daily, but real-time speech processing could raise costs to $92 per day, making full voice automation economically uncompetitive; data entry agents consume about 25 million tokens daily, costing around $60 per day, still below human costs.
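Dividing the report's daily dollar figures by daily token volumes gives the implied blended rate per million tokens; this division is our inference, not a rate the report states directly, and it makes the speech penalty visible:

```python
# Implied blended cost per million tokens from the report's daily figures.
# (tokens/day, $/day) pairs are report figures; the per-million rate is
# our back-of-envelope inference from them.
agents = {
    "programming": (7_000_000, 13),
    "call center (with speech)": (2_000_000, 92),
    "data entry": (25_000_000, 60),
}
for name, (tokens, dollars) in agents.items():
    rate = dollars / (tokens / 1_000_000)
    print(f"{name}: ${rate:.2f} per million tokens")
```

The implied rates cluster near $2 per million tokens for text-heavy work but jump above $40 when real-time speech enters the pipeline, which is why voice automation lags in the report's economics.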

Goldman Sachs notes that the adoption speed of enterprise agents will depend on four variables: token volume, API costs, modality mix, and implementation complexity. Primarily text-based workflows with mature tool ecosystems will scale first; workflows centered on voice or deeply integrated with backend systems may progress more slowly.

From the adoption curve perspective, Goldman Sachs believes enterprise agentic AI will most likely follow an S-curve, with a peak adoption rate of about 35% to 40% among knowledge workers, reaching peak in approximately 15 years—faster than the median of 29 years for historical technology diffusion.
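The S-curve claim can be sketched with a standard logistic model. Only the 35-40% ceiling (midpoint used) and the roughly 15-year horizon come from the report; the midpoint year and steepness below are illustrative parameter choices:

```python
import math

# Illustrative logistic adoption curve. The 0.375 ceiling is the midpoint of
# the report's 35-40% range; midpoint year and steepness are our assumptions,
# chosen so adoption approaches the ceiling around year 15.
ceiling = 0.375   # peak adoption among knowledge workers (report range midpoint)
midpoint = 7.5    # year of fastest adoption (assumed)
steepness = 0.8   # logistic growth rate (assumed)

def adoption(year: float) -> float:
    """Share of knowledge workers using enterprise agents at a given year."""
    return ceiling / (1 + math.exp(-steepness * (year - midpoint)))

for year in (0, 5, 10, 15):
    print(f"year {year}: {adoption(year):.1%} of knowledge workers")
```

With these parameters adoption sits under 1% at year 0 and is within a fraction of a point of the ceiling by year 15, matching the shape (though not any specific parameters) the report describes.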

Sustainable Capital Expenditure: Profit Improvements Offer Greater Room for Large Cloud Providers

A key investment conclusion from Goldman Sachs’s report is that improved profit margins for hyperscale cloud providers will make current high infrastructure investments more sustainable, alleviating core concerns about AI capital expenditure returns.

The report notes that operators are still supply-constrained in meeting current and future compute demands; Google and Meta have already raised their 2026 capex forecasts, and Amazon’s management reiterated a strategy of maintaining high capital spending after Q1 earnings. Goldman Sachs expects that as profit inflection points approach, investors will increasingly seek evidence of visible returns.

Regarding specific names, Goldman Sachs's core logic for Amazon rests on AWS revenue growth reaccelerating (Q1 YoY growth of 28%) and a backlog of $364 billion; for Google, on its cloud business's 63% YoY growth in Q1 and a backlog that nearly doubled to about $460 billion; for Meta, on its ad business significantly outpacing the digital ad industry's overall growth and the continued contribution of AI compute to user engagement and ad monetization.

In software, Goldman Sachs believes lower token costs make it easier for software vendors to embed agents into existing products without significantly impacting gross margins, while supporting pricing models based on outcomes, productivity, or work units rather than seat counts, expanding the addressable software market. For IT services firms, as agents shift AI consumption from standalone tools to enterprise-level, highly integrated workflows, demand for integration, governance, and orchestration will rise sharply, with Accenture viewed as a major beneficiary of this trend.
