Meta to cap employee AI usage: monitoring through an in-house gateway, internal costs may reach billions of dollars in 2026

According to Beating Monitoring, Meta Platforms plans to rein in its escalating AI costs by capping employees’ token usage. According to a leaked internal memo, Meta is building a centralized gateway, called AI Gateway, to monitor employees’ AI usage and spending in real time, set budgets, and impose caps on token consumption. Meta expects internal AI usage alone to cost billions of dollars in 2026.

The caps contrast sharply with Meta’s earlier push to promote AI. In November 2025, Meta told employees that demonstrating “AI-driven influence” would be a core evaluation criterion for 2026 and that performance bonuses would be tied to AI usage rates. The aggressive promotion set off a “tokenmaxxing” frenzy, with employees racing to rack up usage, and at one point an internal leaderboard named “Claudeonomics” was created to display usage rankings publicly. Before the leaderboard was shut down, employees’ total token consumption over a 30-day period had surged to 73.7 trillion. Meta Chief Technology Officer Andrew Bosworth later issued a warning, stressing that consuming more tokens does not by itself mean producing more output, and that employees should use AI tools where they genuinely improve efficiency.

To reduce spending further, Meta has begun shifting its internal AI development toward self-developed tools. The leaked memo shows that Meta is pushing employees to phase out third-party programming tools such as Anthropic’s Claude in favor of its own coding assistant, MetaCode (formerly Devmate). The newly established AI application engineering department has been instructed to improve MetaCode across the board by generating programming challenges that yield high-intensity reinforcement learning training data. Although Meta still allows employees to access external models, it will enforce stricter budget and quota-approval mechanisms through its self-built gateway in the future.

Meta is not the only company under financial pressure from heavy large-model usage. In early 2026, companies such as Uber and ServiceNow exhausted their annual Anthropic quotas within just a few months. ServiceNow has implemented daily usage monitoring for employees, and some venture capital firms have also started setting average daily spending caps on internal AI token usage to keep compute costs from growing unchecked.
