The AI Infrastructure Stack
Overview  /  Tier II Compute as a Service
Layer 04

AI Cloud & Inference

Where GPUs are rented by the hour and tokens are sold by the million.

What this layer does

This is the financial pivot point of the AI economy. Application revenue and model-API revenue land here as compute spend, and from here the money fans out into hardware, real estate, and power. More than $300 billion of annual hyperscaler capex flows through this layer.

The layer splits into training compute (large reserved clusters sold to a handful of labs) and inference compute (token-priced APIs and dedicated endpoints). Training is concentrated in a few buyers and a few clouds. Inference is fragmenting fast — specialists like Groq and Cerebras compete on tokens-per-second, while neoclouds undercut hyperscalers on price for less differentiated workloads.
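The tension between the two pricing models above can be made concrete with a back-of-envelope conversion from an hourly GPU rental rate to an effective cost per million tokens. This is a minimal sketch; the rate, throughput, and function name are illustrative assumptions, not vendor quotes.

```python
# Back-of-envelope: convert an hourly GPU rental rate into an effective
# cost per million output tokens. All figures are illustrative assumptions.

def cost_per_million_tokens(gpu_hour_rate: float, tokens_per_second: float) -> float:
    """Effective $/1M tokens for a GPU rented at gpu_hour_rate ($/hr)
    sustaining tokens_per_second of throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hour_rate / tokens_per_hour * 1_000_000

# Assumed numbers: a $2.50/hr GPU sustaining 500 tokens/sec.
print(round(cost_per_million_tokens(2.50, 500), 2))  # → 1.39
```

The sensitivity to throughput is why tokens-per-second specialists can compete: doubling sustained throughput halves the effective per-token cost at the same hourly rate.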

Sub-categories

Analysis coming soon — will cover: hyperscaler capex unit economics (rev/$capex by cohort), neocloud bear case (depreciation vs. GPU resale value, customer concentration on a single anchor), inference vs. training margin gap, and why some Bitcoin miners successfully pivoted while others didn't.
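The depreciation question at the core of the neocloud bear case reduces to simple arithmetic: does rental revenue over a GPU's rentable life cover the capital lost between purchase price and resale value? A minimal sketch, with every input an assumed placeholder rather than observed market data:

```python
# Sketch of the neocloud depreciation question: does rental revenue over the
# GPU's rentable life exceed the capital lost to depreciation?
# All inputs are assumptions for illustration, not market data.

def net_margin_per_gpu(purchase_price: float, resale_value: float,
                       life_years: float, hourly_rate: float,
                       utilization: float) -> float:
    """Rental revenue minus capital depreciation over the GPU's life
    (ignores power, space, financing, and operating costs)."""
    hours = life_years * 365 * 24
    revenue = hours * utilization * hourly_rate
    depreciation = purchase_price - resale_value
    return revenue - depreciation

# Assumed: $30k GPU, $6k resale after 4 years, $2/hr at 60% utilization.
print(net_margin_per_gpu(30_000, 6_000, 4, 2.0, 0.60))  # → 18048.0
```

The sketch shows why the bear case hinges on utilization and resale value: lower either one and the margin compresses quickly, especially for a neocloud whose revenue depends on a single anchor customer renewing.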