Hyperscaler AI Services
The big three plus Oracle. Where most enterprise AI deployments end up because the data is already there.
The four U.S. hyperscalers — Microsoft, Amazon, Google, and Oracle — collect roughly two-thirds of all enterprise AI workloads because they own the data those workloads run on. AI services revenue is growing 30–50% YoY but capex is growing faster, compressing free cash flow through 2026. MSFT (OpenAI partnership + M365 Copilot distribution) and AMZN (Anthropic anchor + Trainium) are the cleanest direct exposures. GOOG offers the best vertical integration if its TPU roadmap delivers. ORCL is the high-beta wild card with $450B+ in contracted RPO but heavy concentration risk.
What this category is
Managed AI services sold by the major public clouds — model APIs (Azure OpenAI Service, AWS Bedrock, Google Vertex AI, OCI Generative AI), the integrated tooling around them (SageMaker, Vertex Pipelines, Azure ML), and the underlying GPU/ASIC capacity packaged as a service.
This is a different layer from the raw GPU rental market (Layer 4.2 — neoclouds), the inference specialists (Layer 4.3), or the underlying silicon (Layer 8). The hyperscaler offering is the integrated stack: identity, data, security, compliance, model catalog, and bill-by-token, all under one MSA.
The structural thesis
Enterprise AI runs where enterprise data lives. The cost (and security exposure) of moving petabytes of customer data, transaction logs, support tickets, or contracts out of an existing AWS/Azure/GCP tenant to a third-party AI cloud is high enough that, in practice, almost no large company does it. Inference attaches to the data warehouse.
This is the structural moat. Even if a startup has the better model or the cheaper tokens, the hyperscaler wins the deployed workload because the data has gravity and the procurement contract is already signed.
The implication for investment: the hyperscalers are not selling a model — they're selling the bundle (compute + data + identity + governance + billing). The model is one component of the bundle, and as long as multiple competitive models are available through it (Bedrock has 6+ providers; Azure has Claude + OpenAI + Llama + Mistral; Vertex has Gemini + Claude + Llama), the hyperscaler captures the spend regardless of which model wins.
The four players
Microsoft Azure / Azure OpenAI Service (MSFT)
The category leader by AI revenue. Three distinct streams stack on top of each other:
- Azure OpenAI Service — pay-per-token API access to GPT-class models, primarily routed through Azure
- M365 Copilot — $30/user/month embedded assistant across Office, distributed into ~400M paid seats
- Underlying Azure compute — third-party model labs and enterprises running their own AI on raw Azure capacity
Azure overall growth has been ~30–33% YoY, with AI contributing ~14–16 points of that growth in recent quarters — meaning AI-attached revenue is growing 100%+. FY26 capex is guided to ~$80B+, the highest in Microsoft's history.
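The "100%+" claim falls out of contribution-points arithmetic. A minimal sketch, assuming a base-period AI share of ~12% of Azure revenue (that share is an assumption for illustration, not a Microsoft disclosure):

```python
# Contribution-points arithmetic behind "AI-attached revenue is growing
# 100%+". The base-period AI share is an assumed input, not a disclosure.

AZURE_GROWTH = 0.31      # total Azure YoY growth (midpoint of ~30-33%)
AI_POINTS = 0.15         # AI's contribution to that growth (~14-16 pts)
AI_BASE_SHARE = 0.12     # assumed AI share of base-period Azure revenue

# If AI was 12% of the base period and contributed 15 points of growth,
# AI revenue itself grew 0.15 / 0.12 = 125% YoY.
ai_growth = AI_POINTS / AI_BASE_SHARE
non_ai_growth = (AZURE_GROWTH - AI_POINTS) / (1 - AI_BASE_SHARE)

print(f"implied AI-attached YoY growth: {ai_growth:.0%}")
print(f"implied non-AI Azure YoY growth: {non_ai_growth:.0%}")
```

The same arithmetic also shows the flip side: strip out the AI contribution and the rest of Azure is growing in the high teens.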
The OpenAI partnership is the load-bearing pillar. The relationship is being renegotiated as OpenAI restructures its for-profit conversion and the Stargate JV (with Oracle, SoftBank, MGX) shifts some workloads to OCI. Even with carve-outs, MSFT remains OpenAI's primary cloud and largest equity holder.
Bull: M365 Copilot attach + Azure data gravity + ongoing OpenAI relationship + Maia ASIC roadmap defending margins.
Bear: OpenAI relationship erodes; M365 Copilot adoption disappoints (early data is mixed); $80B+ capex compresses FCF through FY27.
AWS Bedrock + SageMaker (AMZN)
The agnostic model platform. AWS hasn't bet on a single lab the way MSFT bet on OpenAI. Bedrock hosts 6+ model providers (Anthropic primary, Meta Llama, Mistral, Cohere, AI21, Amazon Nova, plus an OpenAI gpt-oss endpoint).
Anthropic is the de facto anchor: AWS has invested $8B+ tied to a cloud-spend commitment, Anthropic runs primarily on Trainium, and "Project Rainier" is building a 400K+-chip Trainium cluster for Anthropic in 2025–2026.
AWS overall growth has re-accelerated to ~17–20% YoY (from low teens) as AI workloads ramp. Capex was $105B+ in 2025 and projected higher in 2026.
The Trainium 2/3 strategy is the most interesting silicon play among the hyperscalers — if AWS lands Trainium training workloads at scale (Anthropic + others), it captures both the cloud margin AND the silicon margin currently going to Nvidia.
Bull: Largest cloud + Anthropic anchor + Trainium silicon flywheel + agnostic model platform appeals to enterprises wary of single-vendor lock-in.
Bear: Reactive vs MSFT/GOOG on frontier model partnerships; Anthropic could pull workloads to GCP/TPU; Trainium adoption stalls outside of Anthropic.
Google Cloud Vertex AI (GOOG)
The vertically integrated player. Google owns the model (Gemini), the silicon (TPU), the platform (Vertex), and the consumer distribution (Search, Workspace, YouTube). No other competitor has this stack.
Cloud growth: ~30–35% YoY. FY26 capex: $75–85B. The TPU is the differentiator — TPU v6e Trillium is shipping at scale; v7 Ironwood is announced. If TPUs are 2–3× more cost-efficient than Nvidia GPUs for Google's own and partners' workloads (which Google claims), the gross margin gap to MSFT/AMZN should widen as Vertex scales.
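The margin implication of that cost-efficiency claim is simple arithmetic: if the market price per token is fixed, a 2–3× lower serving cost falls straight through to gross margin. A sketch with assumed round numbers (the $1.00/1M-token price and $0.60 GPU cost are illustrative, not Google figures):

```python
# Back-of-envelope: how a 2-3x cost-efficiency edge translates into
# gross margin, holding the market price per token fixed.
# All numbers are illustrative assumptions, not Google disclosures.

PRICE_PER_M_TOKENS = 1.00      # assumed market price, $/1M tokens
GPU_COST_PER_M_TOKENS = 0.60   # assumed serving cost on Nvidia GPUs

def gross_margin(price: float, cost: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (price - cost) / price

for efficiency in (1.0, 2.0, 3.0):  # 1x = GPU baseline, 2-3x = claimed TPU edge
    cost = GPU_COST_PER_M_TOKENS / efficiency
    print(f"{efficiency:.0f}x efficiency -> cost ${cost:.2f}/1M tokens, "
          f"gross margin {gross_margin(PRICE_PER_M_TOKENS, cost):.0%}")
```

Under these assumptions a 2–3× edge takes per-token gross margin from ~40% to ~70–80%, which is the mechanism behind "the gross margin gap should widen."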
The catch: Google Cloud has the smallest enterprise sales motion of the three. Vertex AI is best-in-class for engineering teams that want it, but loses deals to MSFT and AWS on procurement and incumbency.
Bull: Best per-token economics if TPU works at scale; vertical integration (Gemini + TPU + Vertex + Workspace + Search) is unmatched; Anthropic on Vertex/TPU is a strategic counterweight to MSFT/OpenAI.
Bear: Sales execution gap; Gemini perceived as behind GPT/Claude on frontier reasoning (rightly or not); smaller enterprise data footprint than AWS/Azure.
Oracle Cloud Infrastructure (OCI) (ORCL)
The wild card. OCI was a sub-scale also-ran cloud through 2023. In 2024–2025 it became the AI capex story of the year as RPO ballooned from ~$90B to over $455B, driven by signed contracts with OpenAI (Stargate JV), xAI, Meta, and others.
The differentiation is technical: OCI's networking architecture (RDMA-over-Converged-Ethernet, large reserved blocks, lower price-per-GPU-hour at scale) is genuinely competitive for very large training clusters. The Stargate JV with OpenAI/SoftBank/MGX is a $500B+ multi-year buildout for OpenAI's compute, with OCI as the operator.
Capex is exploding correspondingly: from ~$7B to a $25B+ run-rate. The bear case writes itself: massive customer concentration (OpenAI is the bulk of Stargate-related RPO), execution risk on the buildout, and uncertain take-rate economics on what is essentially a real-estate-and-power play.
Bull: Multi-year contracted backlog visibility (RPO grew 400%+); cheapest-at-scale GPU networking; Stargate is real and operational.
Bear: Customer concentration on OpenAI; capex burn rate ahead of recognized revenue; OCI doesn't have the bundle the others have (no M365, no native enterprise sales motion); RPO ≠ recognized revenue.
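The RPO-versus-revenue gap is easy to make concrete with a toy recognition schedule. The five-year term and capacity-driven ramp below are assumptions for illustration, not Oracle disclosures:

```python
# Toy revenue-recognition schedule: why a ~$455B RPO headline does not
# mean near-term recognized revenue. Term and ramp shape are assumptions.

def recognition_schedule(rpo: float, ramp: list[float]) -> list[float]:
    """Split a contracted backlog across years by an assumed ramp."""
    assert abs(sum(ramp) - 1.0) < 1e-9, "ramp must sum to 100% of backlog"
    return [rpo * share for share in ramp]

RPO = 455e9                            # headline contracted backlog, $
RAMP = [0.05, 0.15, 0.20, 0.30, 0.30]  # assumed capacity-driven ramp

for year, revenue in enumerate(recognition_schedule(RPO, RAMP), start=1):
    print(f"Year {year}: ${revenue / 1e9:.0f}B recognized")
```

Under these assumptions, year-one recognized revenue is roughly $23B against a $455B headline — the backlog is real, but the cash shows up only as datacenters come online.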
Side-by-side
| | MSFT | AMZN | GOOG | ORCL |
|---|---|---|---|---|
| Cloud growth (recent) | ~30–33% | ~17–20% | ~30–35% | ~50%+ (small base) |
| FY26 capex | ~$80B+ | ~$110B+ | ~$75–85B | ~$25B+ |
| Anchor model | OpenAI | Anthropic | Gemini (own) | OpenAI (Stargate) |
| Custom silicon | Maia | Trainium 2/3 | TPU v6/v7 | None |
| Enterprise muscle | Strongest (M365) | Strong | Weakest of three | Database installed base |
| Biggest risk | OpenAI relationship | Reactive vs frontier | Sales execution | Customer concentration |
Numbers are approximate / directional; pull the latest 10-Qs for exact figures.
The lower tier (China + IBM)
- IBM Cloud / watsonx (IBM) — Enterprise-focused, pitched on governance and compliance. Real revenue, but unclear whether watsonx is winning new AI workloads vs. retaining legacy IBM accounts.
- Alibaba Cloud / Qwen (BABA) — China's leading AI cloud, with the Qwen open-weight family as a competitive draw. Geopolitically isolated from western enterprises.
- Tencent Cloud (700.HK) and Baidu AI Cloud (BIDU) — China-focused; same isolation dynamic.
These are not direct substitutes for AWS/Azure/GCP for western enterprise workloads. They show up in portfolios as China exposure, not as part of the hyperscaler comp set.
Unit economics
Hyperscaler AI services have three margin layers stacked on top:
- Compute margin — selling GPU/TPU time. Mid-50s gross margin pre-AI; closer to 30–40% on AI workloads currently due to expensive Nvidia silicon. Improves with custom ASIC mix.
- API service margin — token-priced model APIs. Higher than raw compute (the lab gets a cut, the cloud gets a cut). Likely 50–70% gross margin currently.
- Bundle margin — M365 Copilot / Workspace AI / Office 365 attach. ~80%+ gross margin (software-like) but ramping slowly.
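A revenue-weighted blend shows how these layers combine. The mix weights and per-layer margins below are assumptions consistent with the ranges above, not reported figures:

```python
# Illustrative blended gross margin across the three layers above.
# Mix weights and per-layer margins are assumed, not reported.

def blended_margin(mix: dict[str, tuple[float, float]]) -> float:
    """Revenue-weighted gross margin; mix maps layer -> (share, margin)."""
    shares = [share for share, _ in mix.values()]
    assert abs(sum(shares) - 1.0) < 1e-9, "revenue shares must sum to 1"
    return sum(share * margin for share, margin in mix.values())

LAYERS = {
    "compute": (0.60, 0.35),  # GPU/TPU time, Nvidia-heavy cost base
    "api":     (0.30, 0.60),  # token-priced model APIs
    "bundle":  (0.10, 0.80),  # Copilot/Workspace-style software attach
}

print(f"blended gross margin: {blended_margin(LAYERS):.0%}")

# Shifting 20 points of compute revenue onto custom silicon at an
# assumed 55% margin shows why the ASIC mix matters:
ASIC_SHIFT = dict(LAYERS, compute=(0.40, 0.35), asic=(0.20, 0.55))
print(f"with custom-silicon mix: {blended_margin(ASIC_SHIFT):.0%}")
```

With these assumed weights the blend sits in the high 40s, and a modest custom-silicon shift adds a few points — the mechanism behind "improves with custom ASIC mix."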
The big unknown is operating margin on AI as a whole. AWS operating margin has held in the high 30s; Azure is similar. Both are facing capex intensity outpacing revenue growth in 2025–2026. Free cash flow at MSFT/GOOG/AMZN is under structural pressure for the next 18–24 months while the buildout continues.
Bull view: gross margins recover by 2027 as utilization improves and custom silicon takes share. Bear view: ROIC stays depressed for a multi-year window because the unit economics of inference are worse than the cloud business it's displacing.
Bull / Bear
Bull:
- Inference grows 10× as agents go to production. Agentic workflows consume 100–1000× more tokens per task than chat. Even 20% adoption across knowledge work expands inference TAM faster than capex.
- Custom silicon protects margins. Maia, Trainium, TPU all reduce Nvidia-margin pass-through. By 2027, hyperscaler-designed silicon could be 40–60% of training and inference within their own clouds.
- Data gravity is a permanent moat. Migrating petabytes of customer data away from AWS/Azure/GCP is prohibitive. Whoever has the data wins the inference workload.
Bear:
- Open weights commoditize the API. If DeepSeek-class open models can do 90% of what GPT-5/Claude can at near-zero inference cost, hyperscaler API margins collapse to compute pass-through.
- Capex outpaces ROI for years. $300B+ aggregate capex/year against single-digit billions of incremental AI revenue. FCF drag persists through 2027+, weighing on multiples.
- The Nvidia tax stays high. If custom silicon disappoints on yields, software, or model architectures favoring CUDA, hyperscalers keep paying 75% gross margins to NVDA and never recapture them.
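The agent-token argument reduces to multiplication. Every input below is an assumed round number (task counts, token counts, and the $1/1M-token price are all illustrative), so treat the output as an order-of-magnitude sketch, not a forecast:

```python
# Back-of-envelope on agentic token demand. All inputs are assumptions
# chosen as round numbers, not measured figures.

CHAT_TOKENS_PER_TASK = 2_000   # assumed tokens in a single chat exchange
AGENT_MULTIPLIER = 300         # assumed midpoint of the 100-1000x claim
TASKS_PER_WORKER_PER_DAY = 10  # assumed agent tasks per adopted worker
WORKERS = 1_000_000_000        # rough global knowledge-worker pool
ADOPTION = 0.20                # the 20% adoption figure from the text
PRICE_PER_M_TOKENS = 1.00      # assumed blended inference price, $/1M tokens

daily_tokens = (WORKERS * ADOPTION * TASKS_PER_WORKER_PER_DAY
                * CHAT_TOKENS_PER_TASK * AGENT_MULTIPLIER)
annual_revenue = daily_tokens / 1e6 * PRICE_PER_M_TOKENS * 365

print(f"daily inference tokens: {daily_tokens:.1e}")
print(f"implied annual inference revenue: ${annual_revenue / 1e9:,.0f}B")
```

Even with these deliberately conservative per-worker numbers, the implied annual inference spend lands in the hundreds of billions — the scale that makes the bull case (and the capex) legible.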
What to watch (KPIs)
Quarterly:
- Azure / AWS / Google Cloud growth rate — and the AI-attached portion when disclosed
- Capex guidance — magnitude and forward 12-month trajectory
- RPO growth (especially Oracle)
- Custom silicon mix disclosures — rare today, but watch for them. Trainium adoption is the cleanest one to track.
- M365 Copilot seat penetration — when MSFT discloses (most quarters they don't)
- AWS operating margin trajectory — leading indicator for AI cost absorption
Annual:
- AI-specific revenue disclosures (currently buried; pressure to break out is rising)
- Capex-to-revenue ratio
- Free cash flow yield (trending down across all three; bottoms out when?)
Players
Microsoft Azure / Azure OpenAI (MSFT), AWS Bedrock + SageMaker (AMZN), Google Cloud Vertex AI (GOOG), Oracle Cloud Infrastructure (OCI) (ORCL), IBM Cloud / watsonx (IBM), Alibaba Cloud (BABA), Tencent Cloud (700.HK), Baidu AI Cloud (BIDU)