Something quietly inverted in AI compute this year, and it changes what the buildout is actually for.


In 2023, 2/3 of AI compute went to training, the work of building a model. The smaller remaining slice went to inference, the work of running the model once it's built. Since then, that ratio has started to flip.
Inference is now 2/3 of the total and still climbing, per Deloitte, and the chips built to run it crossed $50B this year.
The flip matters for more than the percentages: training and inference are different animals. Training happens in bursts, on one giant cluster, then it's done. Inference never stops. It runs every time someone sends a prompt or an agent takes a step, and it scales with every user you add. One is a construction project. The other is a utility bill that grows forever.
Every assumption about AI infrastructure was built around training, because that's where the money went. The money just moved to the workload that doesn't need to sit in a single cluster to run.
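To make the construction-project versus utility-bill contrast concrete, here is a minimal sketch in Python. Every number and name in it is a made-up assumption for illustration, not a figure from Deloitte or from this post: training is modeled as a single up-front cost, and inference as a monthly bill proportional to a user base that grows at a fixed rate.

```python
def cumulative_training_cost(months, one_time_cost):
    """Training as a one-off build: the cumulative cost is flat after it is paid."""
    return [one_time_cost for _ in range(months)]


def cumulative_inference_cost(months, users_start, user_growth, cost_per_user_month):
    """Inference as a recurring bill that scales with the number of users."""
    total = 0.0
    users = users_start
    history = []
    for _ in range(months):
        total += users * cost_per_user_month  # this month's bill
        history.append(total)
        users *= 1 + user_growth  # hypothetical steady user growth
    return history


def first_crossover(training, inference):
    """Index of the first month where cumulative inference spend exceeds training."""
    for month, (t, i) in enumerate(zip(training, inference)):
        if i > t:
            return month
    return None


# Hypothetical inputs: arbitrary cost units, 10% monthly user growth.
training = cumulative_training_cost(24, one_time_cost=100.0)
inference = cumulative_inference_cost(
    24, users_start=10, user_growth=0.10, cost_per_user_month=1.0
)
print("inference overtakes training at month index:", first_crossover(training, inference))
```

The exact crossover month depends entirely on the assumed inputs. The point is the shape: one curve stays flat, the other keeps rising for as long as users keep sending prompts.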