Accelerators — GPUs & ASICs
The chips that actually do the math. The most concentrated profit pool in the stack.
What this layer does
An AI accelerator is a chip optimized for the matrix multiplications at the core of modern neural networks. Three architectural camps compete: merchant GPUs (Nvidia and AMD, sold to anyone), hyperscaler custom ASICs (Google TPU, AWS Trainium, Meta MTIA, Microsoft Maia — designed in partnership with Broadcom or Marvell, manufactured at TSMC), and startup accelerators (Groq, Cerebras, SambaNova, Tenstorrent, Etched — usually inference-optimized).
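To ground "optimized for matrix multiplications": a dense matmul of an (M×K) matrix by a (K×N) matrix costs about 2·M·N·K floating-point operations against a much smaller volume of memory traffic, and that ratio (arithmetic intensity) is what separates compute-bound training chips from bandwidth-bound inference chips. A minimal back-of-envelope sketch; every shape and hardware figure in it is an illustrative assumption, not a vendor spec:

```python
# Back-of-envelope: why matmul throughput defines an AI accelerator.
# Every figure below is an illustrative assumption, not a vendor spec.

def matmul_flops(m: int, n: int, k: int) -> float:
    """An (m x k) @ (k x n) matmul needs ~2*m*n*k floating-point ops:
    one multiply and one add per output element per inner-dimension step."""
    return 2.0 * m * n * k

def matmul_bytes(m: int, n: int, k: int, bytes_per_elem: int = 2) -> float:
    """Minimum memory traffic in fp16: read A and B once, write C once."""
    return float(bytes_per_elem) * (m * k + k * n + m * n)

# Hypothetical transformer-style projection: 8192 tokens, 8192 -> 8192 dims.
m, n, k = 8192, 8192, 8192
flops = matmul_flops(m, n, k)
bytes_moved = matmul_bytes(m, n, k)

# Arithmetic intensity (FLOPs per byte moved) decides whether the chip's
# compute units or its memory system is the bottleneck.
intensity = flops / bytes_moved
print(f"{flops:.2e} FLOPs, {bytes_moved:.2e} bytes, ~{intensity:.0f} FLOPs/byte")

# Assumed accelerator: 1e15 FLOP/s peak, 3e12 B/s memory bandwidth.
# Roofline crossover = 1e15 / 3e12 ~ 333 FLOPs/byte. Big training matmuls
# sit far above it (compute-bound, so matrix engines pay off); small-batch
# inference sits below it, which is why inference silicon chases bandwidth.
```

The same arithmetic is why every camp converges on wide matrix engines fed by ever-faster memory, whatever the branding.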
This is the layer that captured roughly half of all AI infrastructure spending in 2024-2025. Nvidia’s data center business alone runs at >$100B annualized with ~75% gross margins. The interesting long-cycle question is whether custom ASICs and inference-specialized silicon erode that share — and how much.
Sub-categories
Merchant GPUs. General-purpose AI accelerators sold to everyone: the Nvidia layer, plus AMD challenging at the margin.
Hyperscaler custom ASICs. In-house silicon designed to fit a single hyperscaler's workload, software stack, and economics, manufactured with merchant partners.
Custom-ASIC design partners. The fabless design houses (Broadcom, Marvell) that co-design ASICs with hyperscalers. The fastest-growing slice of merchant semis.
Inference-specialized silicon. Accelerators optimized for fast, cheap inference, sold either as cloud services (Groq, Cerebras) or as chips and systems; a rough cost sketch follows this list.
Training-scale challengers. Whole-system alternatives to Nvidia at training scale, mostly wafer-scale or multi-chip designs.
Edge AI accelerators. Chips that run inside phones, laptops, cars, and cameras. This segment doesn't consume data-center capex, but it is a parallel TAM.
Photonic and analog computing. Long-shot architectures that use light or analog electronics for matrix multiplication. Mostly research-stage, with the first products only now shipping.
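To make "fast, cheap inference" concrete, here is a rough cost-per-token sketch. Every number in it is a hypothetical assumption chosen for round arithmetic; none comes from a vendor:

```python
# Rough inference economics: amortized cost per million generated tokens.
# Every number below is a hypothetical assumption, chosen for illustration.

def cost_per_million_tokens(system_cost_usd: float,
                            lifetime_years: float,
                            power_kw: float,
                            usd_per_kwh: float,
                            tokens_per_second: float,
                            utilization: float) -> float:
    """Amortized hardware plus electricity cost per 1M generated tokens."""
    seconds = lifetime_years * 365 * 24 * 3600
    total_tokens = tokens_per_second * utilization * seconds
    energy_cost = power_kw * usd_per_kwh * (seconds / 3600)
    return (system_cost_usd + energy_cost) / total_tokens * 1e6

# Hypothetical general-purpose GPU server vs. inference-specialized system:
# same capex, same power draw, 4x the sustained token throughput.
gpu = cost_per_million_tokens(
    system_cost_usd=250_000, lifetime_years=4, power_kw=10,
    usd_per_kwh=0.08, tokens_per_second=5_000, utilization=0.5)
asic = cost_per_million_tokens(
    system_cost_usd=250_000, lifetime_years=4, power_kw=10,
    usd_per_kwh=0.08, tokens_per_second=20_000, utilization=0.5)
print(f"GPU: ${gpu:.2f}/M tokens, inference ASIC: ${asic:.2f}/M tokens")

# 4x the tokens per second at equal cost -> ~4x lower cost per token.
# That throughput-per-dollar gap is the whole pitch of inference silicon.
```

The specific numbers don't matter; the structure does. Amortized capex dominates the denominator, so tokens per second per dollar is the metric every inference-silicon pitch ultimately reduces to.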