Beta
Compute · Systems · Rack-scale AI infrastructure

Every System · Tracked

Every rack-scale AI system from every manufacturer. GPU counts, FP8 PFLOPS, power draw, cooling type, TCO per PFLOPS, and known datacenter deployments · sourced from spec sheets, earnings calls, and disclosed infrastructure builds.

TCO tracked · Cooling profiled · Deployments mapped · PFLOPS per kW
TCO Index
Best TCO per PFLOPS per year
Lower is better. Hardware amortized over 3 years + power at $0.05/kWh. Average across all systems: $9,046/PFLOPS/yr.
TCO ranking · $/PFLOPS/year
Key insights
  • DGX GB300 NVL72 offers the lowest TCO at $973/PFLOPS/year
  • Liquid-cooled systems average 6% lower TCO than air-cooled
  • 4 of 8 ranked systems are NVIDIA-based
Full Compute Hub
Systems: 10
Total GPUs: 9,433
Manufacturers: 6
Total HBM: 914 TB
Max PFLOPS: 8,100
Liquid cooled: 6
Shipping: 8
Avg PFLOPS/kW: 4.81
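
These fleet averages can be reproduced from the per-system spec figures on the cards further down this page. A minimal Python sketch, assuming those card values; CS-3 (liquid cooled, FP8 rating TBD) is excluded from the PFLOPS/kW math:

```python
# Fleet averages derived from the per-system cards on this page.
# CS-3 (liquid, FP8 TBD) is excluded from the PFLOPS/kW averages.
SYSTEMS = {
    # name: (fp8_pflops, power_kw, liquid_cooled)
    "TPU v5p Pod":      (8100, 2500.0, True),
    "DGX GB300 NVL72":  (1440,  140.0, True),
    "DGX GB200 NVL72":  ( 720,  120.0, True),
    "TPU v6e Pod":      ( 230,   60.0, True),
    "Maia 100 Rack":    (  96,   40.0, True),
    "DGX B200":         (  80,   14.3, False),
    "MI325X Platform":  (  48,   10.0, False),
    "Trn2 UltraServer": (  48,   12.0, False),
    "HGX H100":         (  32,   10.2, False),
}

ratios = [pf / kw for pf, kw, _ in SYSTEMS.values()]
liquid = [pf / kw for pf, kw, wet in SYSTEMS.values() if wet]
air    = [pf / kw for pf, kw, wet in SYSTEMS.values() if not wet]

print(f"Avg PFLOPS/kW: {sum(ratios) / len(ratios):.2f}")   # 4.81
print(f"Liquid: {sum(liquid) / len(liquid):.2f}")          # 5.15
print(f"Air:    {sum(air) / len(air):.2f}")                # 4.38
```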

10 systems · sorted by FP8 PFLOPS · all manufacturers

System · Manufacturer · GPUs · FP8 PFLOPS
TPU v5p Pod · Google · 8,960 · 8,100
DGX GB300 NVL72 · NVIDIA · 72 · 1,440
DGX GB200 NVL72 · NVIDIA · 72 · 720
TPU v6e Pod · Google · 256 · 230
Maia 100 Rack · Microsoft · 32 · 96
DGX B200 · NVIDIA · 8 · 80
MI325X Platform · AMD · 8 · 48
Trn2 UltraServer · AWS · 16 · 48
HGX H100 · NVIDIA · 8 · 32
CS-3 · Cerebras · 1 · TBD

$/PFLOPS/year · hardware amortized 3yr + power at $0.05/kWh · lower is better

1. DGX GB300 NVL72 · NVIDIA · Liquid · 72 GPUs · $973
2. DGX GB200 NVL72 · NVIDIA · Liquid · 72 GPUs · $1,469
3. DGX B200 · NVIDIA · Air · 8 GPUs · $1,768
4. MI325X Platform · AMD · Air · 8 GPUs · $2,063
5. HGX H100 · NVIDIA · Air · 8 GPUs · $3,306
6. TPU v6e Pod · Google · Liquid · 256 GPUs · $14,734
7. TPU v5p Pod · Google · Liquid · 8,960 GPUs · $17,926
8. Trn2 UltraServer · AWS · Air · 16 GPUs · $30,131
Average TCO across all ranked systems
$9,046/PFLOPS/yr

Liquid vs air · power stats · efficiency comparison

Liquid cooled · 6 systems
Avg power: 481 kW
Avg PFLOPS/kW: 5.15

Air cooled · 4 systems
Avg power: 12 kW
Avg PFLOPS/kW: 4.38
Why cooling matters for TCO

Liquid cooling enables PUE of 1.05 to 1.15 vs 1.3 to 1.5 for air cooling. At datacenter scale, this translates to 15 to 30% lower power costs. Liquid-cooled systems also allow higher GPU density per rack, reducing interconnect latency and physical footprint. Every major new AI rack (DGX GB200, MI350X cluster) ships liquid-cooled by default.
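
A rough sketch of that power-cost arithmetic, assuming this page's $0.05/kWh rate and representative PUE values of 1.10 (liquid) and 1.40 (air); the rack load is illustrative:

```python
HOURS_PER_YEAR = 8760
RATE = 0.05  # $/kWh, matching the TCO assumption used on this page

def annual_power_cost(it_load_kw: float, pue: float) -> float:
    """Facility draw = IT load x PUE; cost = kWh x rate."""
    return it_load_kw * pue * HOURS_PER_YEAR * RATE

it_load = 120  # kW, e.g. the GB200 NVL72 rack above (illustrative)
liquid = annual_power_cost(it_load, 1.10)  # ~$57.8k/yr
air    = annual_power_cost(it_load, 1.40)  # ~$73.6k/yr
print(f"Liquid saves {1 - liquid / air:.0%} on power")  # ~21%, inside 15-30%
```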

19 disclosed deployments · who is building what

Operator · System
Google DeepMind · TPU v5p Pod
Anthropic · TPU v5p Pod
Microsoft Azure · DGX GB200 NVL72
Oracle Cloud · DGX GB200 NVL72
CoreWeave · DGX GB200 NVL72
xAI · DGX GB200 NVL72
Google DeepMind · TPU v6e Pod
Google Cloud · TPU v6e Pod
Microsoft Azure · Maia 100 Rack
Enterprise customers · DGX B200
Microsoft Azure · MI325X Platform
Meta · MI325X Platform
Amazon AGI · Trn2 UltraServer
Anthropic · Trn2 UltraServer
Meta · HGX H100
Microsoft Azure · HGX H100
Google Cloud · HGX H100
Cerebras Inference · CS-3
Mayo Clinic · CS-3

10 systems · 6 manufacturers

TPU v5p Pod · Google · Shipping

Google's high-performance TPU pod for large-scale training. 8,960 TPU v5p chips in a single pod connected via ICI 3.0 fabric. Powers Gemini model training.

GPUs: 8,960 · FP8 PFLOPS: 8,100 · Power: 2,500 kW
Pod / cluster · Liquid · Google TPU v5p
HBM: 860 TB · ICI 3.0
Deployed: Google DeepMind, Anthropic
DGX GB300 NVL72 · NVIDIA · Announced

Next-gen liquid-cooled rack with Blackwell Ultra GPUs. 72x B300 GPUs with 288 GB HBM3e per GPU (vs 192 GB on B200). Designed for reasoning-heavy inference workloads.

GPUs: 72 · FP8 PFLOPS: 1,440 · Power: 140 kW
Full rack · Liquid · NVIDIA B300
HBM: 20.7 TB · NVLink 5.0+
$4.0M list · $973/PFLOPS/yr
DGX GB200 NVL72 · NVIDIA · Shipping

NVIDIA's flagship liquid-cooled rack. 72 Blackwell GPUs + 36 Grace CPUs connected via NVLink 5.0 in a single 72-GPU domain. Designed for trillion-parameter model training and inference.

GPUs: 72 · FP8 PFLOPS: 720 · Power: 120 kW
Full rack · Liquid · NVIDIA B200
HBM: 13.8 TB · NVLink 5.0
$3.0M list · $1,469/PFLOPS/yr
Deployed: Microsoft Azure, Oracle Cloud, CoreWeave, xAI
TPU v6e Pod · Google · Shipping

Google's latest custom AI accelerator in pod configuration. 256 TPU v6e chips connected via custom ICI (Inter-Chip Interconnect). Optimized for price-performance across training and inference.

GPUs: 256 · FP8 PFLOPS: 230 · Power: 60 kW
Pod / cluster · Liquid · Google TPU v6e
HBM: 8 TB · ICI 4.0
Deployed: Google DeepMind, Google Cloud
Maia 100 Rack · Microsoft · Ramping

Microsoft's first custom AI silicon at rack scale. Maia 100 chips fabricated at TSMC on N5 with HBM3e. Designed for Azure AI inference and fine-tuning.

GPUs: 32 · FP8 PFLOPS: 96 · Power: 40 kW
Full rack · Liquid · Microsoft Maia 100
HBM: 6.1 TB · Custom Ethernet fabric
Deployed: Microsoft Azure
DGX B200 · NVIDIA · Shipping

8-GPU Blackwell node for enterprises that don't need the full NVL72 rack. Air-cooled with NVLink 4.0 interconnect. The workhorse for inference deployments.

GPUs: 8 · FP8 PFLOPS: 80 · Power: 14.3 kW
Server node · Air · NVIDIA B200
HBM: 1.5 TB · NVLink 4.0
$0.4M list · $1,768/PFLOPS/yr
Deployed: Enterprise customers
MI325X Platform · AMD · Shipping

AMD's latest 8-GPU OAM platform with MI325X accelerators. 256 GB HBM3e per GPU for the largest memory footprint in its class. Infinity Fabric interconnect between GPUs.

GPUs: 8 · FP8 PFLOPS: 48 · Power: 10 kW
Server node · Air · AMD MI325X
HBM: 2 TB · Infinity Fabric
$0.3M list · $2,063/PFLOPS/yr
Deployed: Microsoft Azure, Meta
Trn2 UltraServer · AWS · Shipping

AWS custom silicon training server. 16x Trainium2 chips in a single UltraServer node connected via NeuronLink. Designed to compete with NVIDIA on price-performance.

GPUs: 16 · FP8 PFLOPS: 48 · Power: 12 kW
Server node · Air · AWS Trainium2
HBM: 1.5 TB · NeuronLink
Deployed: Amazon AGI, Anthropic
HGX H100 · NVIDIA · Shipping

The system that launched the AI infrastructure boom. 8x H100 SXM GPUs connected via NVLink 4.0. Still the most widely deployed AI training system.

GPUs: 8 · FP8 PFLOPS: 32 · Power: 10.2 kW
Server node · Air · NVIDIA H100 SXM
HBM: 0.64 TB · NVLink 4.0
$0.3M list · $3,306/PFLOPS/yr
Deployed: Meta, Microsoft Azure, Google Cloud
CS-3 · Cerebras · Shipping

Wafer-scale AI compute appliance. A single CS-3 contains one WSE-3 chip (the largest chip ever made, using an entire 300mm wafer). 44 GB of on-chip SRAM in place of HBM.

GPUs: 1 · FP8 PFLOPS: TBD · Power: 23 kW
Appliance · Liquid · Cerebras WSE-3
HBM: 0 TB (on-chip SRAM) · SwarmX
$3.5M list
Deployed: Cerebras Inference, Mayo Clinic

Pulled from the live dataset · schema-ready for AEO

What is a DGX GB200 NVL72?

The DGX GB200 NVL72 is NVIDIA's flagship AI compute rack. It contains 72 Blackwell GPUs and 36 Grace CPUs in a single liquid-cooled rack, connected via NVLink 5.0. It delivers up to 720 PFLOPS of FP8 compute and requires 120+ kW of power. It's the system training the largest AI models at Microsoft, Oracle, CoreWeave, and xAI.

What does TCO per PFLOPS mean?

Total Cost of Ownership per PFLOPS per year normalizes the cost of different AI systems to a comparable metric. It includes hardware cost (amortized over 3 years) plus power costs (at $0.05/kWh average datacenter rate) divided by the system's FP8 compute output in PFLOPS. Lower is better. This is the metric datacenter operators optimize when choosing between NVIDIA, AMD, Google TPU, or custom silicon.
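
A minimal Python sketch of that formula, applied to the DGX GB200 NVL72 figures quoted on this page; the ranked value ($1,469) differs slightly, most likely because the $3.0M list price is rounded:

```python
HOURS_PER_YEAR = 8760

def tco_per_pflops(list_price_usd: float, power_kw: float, fp8_pflops: float,
                   years: int = 3, rate_per_kwh: float = 0.05) -> float:
    """Annual hardware amortization plus annual power, per FP8 PFLOPS."""
    hardware_per_year = list_price_usd / years
    power_per_year = power_kw * HOURS_PER_YEAR * rate_per_kwh
    return (hardware_per_year + power_per_year) / fp8_pflops

# DGX GB200 NVL72: $3.0M list, 120 kW, 720 FP8 PFLOPS
print(f"${tco_per_pflops(3_000_000, 120, 720):,.0f}/PFLOPS/yr")  # ~$1,462
```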

Why do some systems require liquid cooling?

Modern AI chips consume 700-1200W each. An NVL72 rack with 72 GPUs at 1000W each generates 72 kW of heat from GPUs alone. Air cooling cannot efficiently remove this much heat. Liquid cooling (direct-to-chip or immersion) is 10-100x more effective at heat transfer, enabling higher GPU density per rack, lower PUE (1.1 vs 1.3-1.5 for air), and higher chip performance due to better thermal management.
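
A back-of-envelope version of that heat math; the air-cooling ceiling used here is an assumed industry rule of thumb, not a figure from this page:

```python
GPU_WATTS = 1000       # per the 700-1200 W range above
GPUS_PER_RACK = 72     # NVL72-class rack

gpu_heat_kw = GPU_WATTS * GPUS_PER_RACK / 1000   # 72 kW from GPUs alone

# Assumed practical ceiling for air-cooled racks (rule of thumb, not from
# this page); dense AI racks blow well past it.
AIR_COOLED_CEILING_KW = 40
print(gpu_heat_kw, gpu_heat_kw > AIR_COOLED_CEILING_KW)  # 72.0 True
```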

How does a TPU pod compare to an NVIDIA DGX rack?

Google TPU pods and NVIDIA DGX racks are fundamentally different architectures. A TPU v5p pod can contain 8,960 chips in a single interconnected domain via ICI fabric. NVIDIA's NVL72 is a 72-GPU rack-scale system. TPU pods offer massive parallelism for Google's own model architectures but are only available on Google Cloud. DGX systems are available from multiple OEMs for on-premise deployment.

What is the Cerebras CS-3 and why is it different?

The Cerebras CS-3 uses a single WSE-3 chip, the largest chip ever made, occupying an entire 300mm silicon wafer. Instead of using HBM stacks, it has 44 GB of on-chip SRAM for far lower memory latency than external HBM. This eliminates the memory bandwidth bottleneck that limits conventional GPU systems. The tradeoff is higher per-system cost and a different programming model.

Why are hyperscalers building custom silicon?

Google (TPU), AWS (Trainium), Microsoft (Maia), and Meta (MTIA) are all building custom AI chips to reduce dependency on NVIDIA, optimize for their specific workloads, and lower costs at scale. At hyperscaler volume (hundreds of thousands of chips), even a 10-20% efficiency gain over NVIDIA translates to billions in savings. Custom silicon also provides supply chain diversification.
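
An illustrative calculation of that scale effect; both the fleet size and the per-chip cost below are assumed for illustration and do not come from this page:

```python
# Hypothetical inputs: 500k chips at $30k all-in per chip.
chips = 500_000
cost_per_chip = 30_000
fleet_cost = chips * cost_per_chip  # $15B

for gain in (0.10, 0.20):
    print(f"{gain:.0%} efficiency gain ≈ ${fleet_cost * gain / 1e9:.1f}B")
# 10% ≈ $1.5B, 20% ≈ $3.0B
```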

Keep exploring the compute graph