Cerebras CS-3
Appliance · Shipping · WSE-3 · 2024
Wafer-scale AI compute appliance. A single CS-3 contains one WSE-3, the largest chip ever made, built from an entire 300 mm wafer. Its 44 GB of on-chip SRAM keeps weights and activations on the wafer, avoiding the HBM bandwidth bottleneck that gates GPU inference. The appliance pairs with MemoryX for disaggregated parameter memory and SwarmX for scale-out interconnect. Cerebras bills it as the fastest per-chip inference system available.
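A back-of-envelope way to see why on-wafer SRAM matters: autoregressive decoding streams every weight once per generated token, so single-stream throughput is capped at memory bandwidth divided by model size. The sketch below is illustrative only; the 21 PB/s SRAM figure comes from Cerebras's WSE-3 materials and the 8 TB/s figure is a rough HBM3e-class assumption, neither is from this page.

```python
# Rough memory-bandwidth bound on autoregressive decode throughput.
# Assumption: each generated token streams all weights once.
# Bandwidth values are assumptions, not figures from this page:
#   21 PB/s on-wafer SRAM (Cerebras WSE-3 materials)
#    8 TB/s for an HBM3e-class GPU (rough estimate)

def max_tokens_per_sec(mem_bw_bytes_per_s: float,
                       params: float,
                       bytes_per_param: float = 2.0) -> float:
    """Upper bound: tokens/s <= bandwidth / (params * bytes/param)."""
    return mem_bw_bytes_per_s / (params * bytes_per_param)

SRAM_BW = 21e15  # assumed on-wafer SRAM bandwidth, bytes/s
HBM_BW = 8e12    # assumed GPU HBM bandwidth, bytes/s

for name, params in [("8B model", 8e9), ("70B model", 70e9)]:
    print(f"{name}: SRAM bound ~{max_tokens_per_sec(SRAM_BW, params):,.0f} tok/s, "
          f"HBM bound ~{max_tokens_per_sec(HBM_BW, params):,.0f} tok/s")
```

These are loose single-stream upper bounds; real throughput is also shaped by KV-cache traffic, compute limits, and batching.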
| Spec | Value |
| --- | --- |
| Accelerators per system | 1 (one WSE-3; no GPUs) |
| Total HBM | 0 TB (44 GB on-wafer SRAM instead) |
| Host memory | 1.5 TB |
| Interconnect | SwarmX · 12 TB/s |
| Networking | 1200 Gbps |
| Storage | MemoryX (external) |
| Form factor | 15U appliance |
| Weight | 250 kg |
| Rack units | 15U |
Performance
Manufacturer datasheet values · aggregate system compute
| Metric | Value |
| --- | --- |
| FP4 PFLOPS | TBD |
| FP8 PFLOPS | TBD |
| FP16 PFLOPS | 125 |
| BF16 PFLOPS | 125 |
| Training effective PFLOPS | 90 |
| Inference tokens/sec | 1,800 |
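The "training effective" row is the peak figure scaled by an implied utilization; a one-line check against the table's values:

```python
peak_pflops = 125       # FP16/BF16 peak from the table above
effective_pflops = 90   # training effective from the table above
print(f"Implied utilization: {effective_pflops / peak_pflops:.0%}")  # -> 72%
```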
Power and cooling
Thermal envelope · cooling requirements · efficiency
| Metric | Value |
| --- | --- |
| Rack power | 23 kW |
| Per accelerator | 23,000 W |
| Cooling | Liquid |
| PUE estimate | 1.1 |

Power draw relative to tracked systems: 23 kW of a 2,500 kW maximum.
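PUE converts IT load into at-the-wall draw; a quick check using the numbers above (the 0.9% share is against the 2,500 kW ceiling of the systems tracked on this site):

```python
it_load_kw = 23
pue = 1.1
tracked_max_kw = 2500

facility_kw = it_load_kw * pue        # 25.3 kW at the wall
share = it_load_kw / tracked_max_kw   # ~0.9% of the largest tracked draw
print(f"{facility_kw:.1f} kW facility draw, {share:.1%} of tracked max")
```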
TCO analysis
Hardware amortized over 3 years · power at $0.05/kWh
| Metric | Value |
| --- | --- |
| List price | $3,500,000 |
| Per accelerator effective | $3,500,000 |
| Cost per accelerator per month | $97,222 |
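The monthly figure is plain 36-month straight-line amortization of the list price; energy at the stated $0.05/kWh is comparatively negligible. A sketch of the arithmetic (the 730 h/month convention and the inclusion of PUE in the energy term are assumptions, not from this page):

```python
list_price = 3_500_000
months = 36               # 3-year amortization per the note above
power_kw = 23
pue = 1.1
kwh_price = 0.05
hours_per_month = 730     # assumption: average month

amortized = list_price / months                        # ~$97,222/mo
energy = power_kw * pue * hours_per_month * kwh_price  # ~$923/mo
print(f"Hardware: ${amortized:,.0f}/mo, energy: ${energy:,.0f}/mo")
```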
Available from: 2024
Known deployments
Disclosed in press releases, SEC filings, and conference talks
| Deployment | Quantity | Source |
| --- | --- | --- |
| Inference cloud | | Cerebras blog |
| Medical AI research | | Cerebras press release |

Sources
Every data point on this page is reproducible
Other AI systems
Compare across the system landscape
| System | Aggregate compute | Form factor | Status |
| --- | --- | --- | --- |
| 8960x Google TPU v5p | 8,100 PFLOPS | Pod / cluster | Shipping |
| 72x NVIDIA B300 | 1,440 PFLOPS | Full rack | Announced |
| 72x NVIDIA B200 | 720 PFLOPS | Full rack | Shipping |
| 256x Google TPU v6e | 230 PFLOPS | Pod / cluster | Shipping |
| 32x Microsoft Maia 100 | 96 PFLOPS | Full rack | Ramping |
| 8x NVIDIA B200 | 80 PFLOPS | Server node | Shipping |