Training is a fixed cost: you train a model once (or periodically retrain) using massive GPU clusters. The economics favor raw compute power — whoever has the most FLOPs wins. This is NVIDIA's domain: A100, H100, B200 are all optimized for training throughput.
Inference is a variable cost: every user query, every API call, every agent action requires inference compute. As AI adoption scales, inference volume grows with usage while training spend stays relatively flat, so inference's share of total compute cost keeps rising. The economics shift from "maximum compute" to "minimum cost per query" and "maximum throughput per watt."
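The fixed-versus-variable distinction can be made concrete with a back-of-envelope model. All figures below are illustrative assumptions, not Closelook estimates:

```python
# Illustrative compute-cost model: training is a one-time fixed cost,
# inference cost scales linearly with query volume.
# Both dollar figures are hypothetical placeholders.

TRAINING_COST = 100_000_000        # one-time cost to train the model, USD
INFERENCE_COST_PER_QUERY = 0.002   # marginal compute cost per query, USD

def total_cost(queries: int) -> float:
    """Total compute cost after serving `queries` requests."""
    return TRAINING_COST + INFERENCE_COST_PER_QUERY * queries

def inference_share(queries: int) -> float:
    """Fraction of total compute spend attributable to inference."""
    inference = INFERENCE_COST_PER_QUERY * queries
    return inference / (TRAINING_COST + inference)

# As volume scales, inference comes to dominate total spend.
for q in (1_000_000, 1_000_000_000, 100_000_000_000):
    print(f"{q:>15,} queries -> inference share {inference_share(q):.1%}")
```

At low volume, training dominates; at scale, the fixed training cost becomes a rounding error and nearly all spend is inference, which is the structural shift the section describes.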
This distinction matters enormously for investors. In a training-dominated world, you buy NVIDIA and NVIDIA's supply chain. In an inference-dominated world, the competitive landscape fragments and the value chain shifts.
Custom ASICs gain share because hyperscalers (Google, Amazon, Meta) can design chips optimized for their specific inference workloads at lower cost per query than general-purpose GPUs. Google's TPU v5e, AWS Inferentia2, and Meta's MTIA are all inference-focused designs.
NVIDIA adapts by releasing inference-optimized configurations (L40S, H100 NVL) and pushing software moats (TensorRT-LLM, Triton). NVIDIA won't lose inference entirely — but their market share will be lower than in training.
Edge inference becomes relevant as models shrink enough to run on-device. Qualcomm, MediaTek, and Apple's custom silicon benefit from running AI locally rather than in the cloud.
Closelook tracks the training-to-inference ratio through the Compute layer of the Functional Index. As inference dominates, the weight of custom ASIC and inference-focused companies increases relative to pure GPU plays. The index adapts to reflect this structural shift.
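One simple way such a reweighting could work, shown as a hypothetical sketch rather than Closelook's actual methodology, is to give each Compute-layer segment a training-era and an inference-era weight, then blend by the current inference share of workloads:

```python
# Hypothetical index reweighting: tilt the Compute layer toward
# inference-focused segments as inference's share of AI compute grows.
# Segment names and weights are illustrative, not Closelook's data.

def reweight(base_weights: dict, inference_share: float) -> dict:
    """Blend each segment's training-era and inference-era weights
    by the current inference share, then renormalize to sum to 1.0."""
    blended = {
        name: (1 - inference_share) * w_train + inference_share * w_infer
        for name, (w_train, w_infer) in base_weights.items()
    }
    total = sum(blended.values())
    return {name: w / total for name, w in blended.items()}

# (training-era weight, inference-era weight) per segment -- illustrative
segments = {
    "general_purpose_gpu": (0.70, 0.40),
    "custom_asic":         (0.15, 0.35),
    "edge_silicon":        (0.05, 0.15),
    "networking":          (0.10, 0.10),
}

# At a 60% inference share, GPU weight falls and ASIC/edge weight rises.
print(reweight(segments, inference_share=0.6))
```

The design choice here is a linear blend between two endpoint allocations: it keeps the index fully invested at every stage of the shift while moving weight smoothly from pure GPU plays to ASIC and edge names.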
The inference shift is one of the most important structural themes in AI investing. It doesn't mean NVIDIA loses — it means NVIDIA's dominance becomes less absolute. Portfolio implications: diversify compute exposure beyond pure NVIDIA, monitor custom ASIC adoption rates, and watch inference cost curves.