Goldman Sachs In-Depth Report: Decoding the Coming Turning Point in the AI Agent Economy
Agentic AI is shifting the artificial intelligence industry from a cost narrative to a profit narrative. Goldman Sachs believes that with token consumption poised for a leap in growth, and with underlying compute costs falling faster than token prices, the profit-margin inflection point for hyperscale cloud providers and large-model providers may arrive within the next 3 to 12 months.
According to Chasing Wind Trading Platform, Goldman Sachs released a report on May 5 projecting that by 2030, consumer and enterprise AI agents combined will drive global token consumption to roughly 24 times 2026 levels, or about 120 quintillion tokens per month; under peak enterprise-agent adoption by 2040, the figure expands further to 55 times.
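As a quick sanity check on the headline multiples (simple arithmetic on the report's own figures, not an independent estimate): a 24x rise to roughly 120 quintillion tokens per month implies a 2026 baseline of about 5 quintillion, and applying the 55x peak-adoption multiple to that same baseline puts 2040 consumption near 275 quintillion per month.

```python
# Back-of-envelope arithmetic on the report's headline figures.
# Units: quintillion tokens per month.
target_2030 = 120     # cited 2030 consumption
multiple_2030 = 24    # cited multiple vs. 2026
multiple_2040 = 55    # cited peak-adoption multiple vs. 2026

baseline_2026 = target_2030 / multiple_2030   # implied 2026 baseline
peak_2040 = baseline_2026 * multiple_2040     # implied 2040 peak

print(baseline_2026)  # → 5.0
print(peak_2040)      # → 275.0
```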
Meanwhile, Goldman Sachs’s inferred price and cost curves show that mainstream large-model token prices have stabilized, or even rebounded slightly, after a prior annual decline of about 40%, while chip-driven per-token compute costs from Nvidia, AMD, Google TPU, and Trainium continue to fall at 60% to 70% annually. This widening gap opens up profit margins across the industry, and large-scale capital expenditure on AI infrastructure may become more sustainable as margins improve.
Token Economics Inflection Point: Costs Decline Faster Than Prices, Profit Margins Are Opening
The core argument of Goldman Sachs’s report is that the AI industry is transitioning from a phase where “uncertain inference economics may dilute profits” to a new phase where “incremental tokens generate attractive marginal profits.”
In the first phase of the AI cycle, investors generally viewed compute and tokens as cost drivers—more usage meant more inference load, more accelerators, more electricity, and higher capital expenditure. But Goldman Sachs’s inferred price and cost curves indicate this logic is changing.
Although mainstream large model token prices have fallen significantly, they have now stabilized or even rebounded in some cases; meanwhile, the full cost per token for Nvidia, Google TPU (Broadcom), AMD, and Trainium (Marvell) continues to decline rapidly and persistently. If token prices remain stable above token costs, then increasing adoption of agentic AI will lead to positive profit expansion rather than just revenue growth.
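To make the "stable price, falling cost" dynamic concrete, here is a minimal sketch assuming a flat token price and a 65% annual cost decline (the midpoint of the 60-70% range cited); the starting price and cost values are illustrative, not report figures.

```python
# Illustrative only: a flat token price over a ~65%/yr per-token cost
# decline makes the gross margin on each incremental token widen rapidly.
price = 1.00          # hypothetical stabilized price per unit of tokens
cost = 0.80           # hypothetical starting cost per unit of tokens
annual_cost_decline = 0.65

for year in range(1, 4):
    cost *= 1 - annual_cost_decline
    margin = (price - cost) / price
    print(f"year {year}: cost {cost:.4f}, gross margin {margin:.1%}")
```

Under these assumptions the margin climbs from 20% at the start to above 95% within three years, which is the mechanism behind the report's claim that incremental tokens become profit-accretive rather than profit-dilutive.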
Goldman Sachs further points out that agentic AI could form a self-reinforcing economic flywheel: lower compute costs per token lead to richer, more complex agents; these agents, with longer context windows, more loops, more validation, and continuous monitoring, consume more tokens; higher utilization improves the economics of AI infrastructure, supporting providers to continuously invest in model quality and distribution capabilities. Goldman Sachs believes this flywheel is fundamentally different from the mainstream narrative that “AI usage will incur unsustainable cost burdens.”
However, Goldman Sachs also warns of risks: not all AI workloads can guarantee a positive profit inflection point. For commoditized pure text chatbots, competition may still force token prices to decline faster than compute costs.
Consumer-Side Agents: From Fragmented Conversations to “Resident” Assistants, Token Consumption to Increase 12x
Goldman Sachs estimates that by 2030, consumer AI agents will increase global token consumption 12 times, adding about 60 quintillion tokens per month.
The report divides consumer agents into two categories: one is “on-demand” agents, such as OpenAI Operator, Claude Code, and browser-based agents, where users initiate tasks that the agent autonomously plans, executes, and returns results; the other is “resident” agents, such as continuous background email monitoring, scheduling, or digital life assistants. Goldman Sachs believes the largest surge in token consumption will occur when agents shift from user-initiated tasks to continuous background operation—monitoring context and acting proactively when needed.
Based on simulated data, a typical LLM chatbot consumes about 1,000 tokens per session, embedded Copilot consumes over 5,000 tokens daily, while resident agents can consume over 100,000 tokens per day.
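The gap between these tiers is what drives the projected multipliers. A trivial comparison (the five-sessions-per-day chatbot usage is an assumed figure, not from the report):

```python
# Daily token footprints cited in the report, plus one assumed figure.
chatbot_tokens_per_session = 1_000    # typical LLM chatbot session
copilot_tokens_per_day = 5_000        # embedded Copilot, per day
resident_tokens_per_day = 100_000     # resident background agent, per day

sessions_per_day = 5                  # assumption for a heavy chatbot user
heavy_chatbot_daily = sessions_per_day * chatbot_tokens_per_session

print(resident_tokens_per_day / heavy_chatbot_daily)     # → 20.0
print(resident_tokens_per_day / copilot_tokens_per_day)  # → 20.0
```

Even a heavy chatbot user trails a single resident agent by a factor of twenty, before counting any multimodal inputs.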
Goldman Sachs projects that by 2030, daily AI query volume will increase from about 5 billion in 2025 to approximately 23 billion, with up to 30% flowing into agents in search, shopping, travel, email, and personal productivity domains. Meanwhile, the share of traditional search engines in query volume is expected to decline from 68% in 2025 to 36% in 2030, while native LLM applications will grow from 12% to 31%.
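Multiplying the projected 2030 query volume by the cited shares gives the implied daily loads; the shares and the 23 billion total are the report's, the products are simple arithmetic on them.

```python
# Implied 2030 daily query volumes from the report's projections.
daily_queries_2030 = 23e9   # ~23 billion daily AI queries by 2030
agent_share = 0.30          # up to 30% flowing into agents
search_share = 0.36         # traditional search share in 2030
llm_share = 0.31            # native LLM application share in 2030

agent_queries = daily_queries_2030 * agent_share
search_queries = daily_queries_2030 * search_share
llm_queries = daily_queries_2030 * llm_share

print(f"agents:   {agent_queries / 1e9:.1f}B/day")
print(f"search:   {search_queries / 1e9:.1f}B/day")
print(f"LLM apps: {llm_queries / 1e9:.1f}B/day")
```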
Enterprise-Side Agents: Workflow Complexity Drives Token Intensity, Consumption May Reach 55x by 2040
Goldman Sachs expects enterprise AI agents to become the largest token multipliers, driving global token consumption to increase 24 times by 2030, and further up to 55 times at peak adoption in 2040, with enterprise workloads accounting for over 70% of total global token usage by then.
Enterprise agents are more token-intensive than consumer agents because their workflows demand more complex and precise operations: monitoring tasks, retrieving context, reasoning about anomalies, validating outputs, updating systems, and reporting issues continuously throughout the workday. Enterprise agents also tend to involve heavier multimodal inputs (voice, images, documents, screen activity, application data, logs, and structured system records), which significantly increases token intensity.
Goldman Sachs modeled token consumption for different professions using simulated agents.
Results show that programming agents consume about 7 million tokens daily at an API cost of roughly $13 per day, far below the cost of human labor, which explains why software development has the fastest adoption rate. Call-center agents consume about 2 million tokens daily, but real-time speech processing could push costs to $92 per day, making full voice automation economically uncompetitive. Data-entry agents consume about 25 million tokens daily at roughly $60 per day, still below human costs.
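Dividing each cited daily cost by the cited daily token volume recovers the implied blended price per million tokens, which makes the voice penalty visible; the figures are the report's, the division is ours.

```python
# Implied blended API price per million tokens for each simulated agent,
# derived from the daily cost and token figures cited in the report.
workloads = {
    "coding":      (7_000_000, 13.0),    # tokens/day, USD/day
    "call_center": (2_000_000, 92.0),
    "data_entry":  (25_000_000, 60.0),
}

implied = {}
for name, (tokens, cost) in workloads.items():
    implied[name] = cost / (tokens / 1_000_000)
    print(f"{name}: ${implied[name]:.2f} per million tokens")
```

Real-time speech works out to roughly 25 times the per-token price of text-heavy coding workloads, which is why the report flags full voice automation as economically uncompetitive for now.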
Goldman Sachs notes that the adoption speed of enterprise agents will depend on four variables: token volume, API costs, modality mix, and implementation complexity. Primarily text-based workflows with mature tool ecosystems will scale first; workflows centered on voice, or deeply integrated with backend systems, may progress more slowly.
From the adoption curve perspective, Goldman Sachs believes enterprise agentic AI will most likely follow an S-curve, with a peak adoption rate of about 35% to 40% among knowledge workers, reaching peak in approximately 15 years—faster than the median of 29 years for historical technology diffusion.
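A logistic curve sketches what such a path could look like; the 38% ceiling below is the midpoint of the cited 35-40% range, while the inflection year and steepness are purely illustrative assumptions, not report parameters.

```python
import math

CEILING = 0.38    # midpoint of the cited 35-40% peak adoption range
MIDPOINT = 7.5    # assumed inflection year (half the ~15-year ramp)
STEEPNESS = 0.6   # assumed growth rate; illustrative only

def adoption(year: float) -> float:
    """Logistic S-curve: share of knowledge workers using agents."""
    return CEILING / (1 + math.exp(-STEEPNESS * (year - MIDPOINT)))

for y in (1, 5, 7.5, 10, 15):
    print(f"year {y}: {adoption(y):.1%}")
```

With these parameters adoption is negligible early on, passes half the ceiling at the assumed inflection year, and approaches the 38% cap by year 15, consistent with the shape (though not the exact parameters) Goldman Sachs describes.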
Sustainable Capital Expenditure: Profit Improvements Offer Greater Room for Large Cloud Providers
A key investment conclusion from Goldman Sachs’s report is that improved profit margins for hyperscale cloud providers will make current high infrastructure investments more sustainable, alleviating core concerns about AI capital expenditure returns.
The report notes that operators are still supply-constrained in meeting current and future compute demands; Google and Meta have already raised their 2026 capex forecasts, and Amazon’s management reiterated a strategy of maintaining high capital spending after Q1 earnings. Goldman Sachs expects that as profit inflection points approach, investors will increasingly seek evidence of visible returns.
Regarding specific names, Goldman Sachs’s core thesis for Amazon rests on AWS revenue growth reaccelerating (28% YoY in Q1) and a backlog of $364 billion; for Google, on its cloud business’s 63% YoY growth in Q1 and a backlog nearly doubling to about $460 billion; and for Meta, on its ad business significantly outpacing the digital ad industry’s overall growth and the continued contribution of AI compute to user engagement and ad monetization.
In software, Goldman Sachs believes lower token costs make it easier for vendors to embed agents into existing products without significantly impacting gross margins, while also supporting pricing models based on outcomes, productivity, or units of work rather than seat counts, thereby expanding the addressable software market. For IT services firms, as agents shift AI consumption from standalone tools to enterprise-level, deeply integrated workflows, demand for integration, governance, and orchestration will rise sharply; Goldman Sachs views Accenture as a major beneficiary of this trend.