Arbitrage · gpt-oss-20b · 5 providers · $0.040 → $0.120 · 67% spread
Cheapest Provider for gpt-oss-20b
OpenAI's open-weight release · Apache 2.0 · self-hostable on consumer GPUs.
Cheapest input
Cerebras
$0.040/M
Wafer-scale chip · extreme speed
Fastest
Cerebras
1200 tok/s
Savings calculator
Save 67%
vs DeepInfra at $0.120/M input. At 100M input tokens/mo, routing to Cerebras costs $4 instead of $12, saving $8/mo.
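The arithmetic behind the savings figure can be sketched as follows. This is a minimal illustration, not the page's actual calculator: the function name `monthly_savings` and its parameters are hypothetical, and it considers input tokens only.

```python
def monthly_savings(cheap_per_m: float, expensive_per_m: float, tokens_m: float) -> tuple[float, float]:
    """Return (dollars saved per month, fractional spread) for input tokens.

    Prices are in $ per million tokens; tokens_m is the monthly volume
    in millions of tokens.
    """
    saved = (expensive_per_m - cheap_per_m) * tokens_m
    spread = (expensive_per_m - cheap_per_m) / expensive_per_m
    return saved, spread


# Cerebras ($0.040/M) vs DeepInfra ($0.120/M) at 100M input tokens/mo.
saved, spread = monthly_savings(0.040, 0.120, 100)
print(f"${saved:.2f}/mo saved, {spread:.0%} spread")  # prints "$8.00/mo saved, 67% spread"
```

The spread is measured against the more expensive price, which is why a $0.080 gap on a $0.120 base comes out to 67%.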
Sorted by input price
All 5 providers
| Provider | In $/M | Out $/M | Notes |
|---|---|---|---|
| Cerebras | $0.040 | $0.160 | Wafer-scale chip · extreme speed |
| Groq | $0.050 | $0.100 | Fastest · LPU tokens/s leader |
| Fireworks AI | $0.070 | $0.280 | Standard serverless |
| Together AI | $0.100 | $0.100 | Flat output price |
| DeepInfra | $0.120 | $0.240 | No free tier · fair price |
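Because the list is sorted by input price only, the cheapest provider for a real workload also depends on how many output tokens it generates. The sketch below compares blended monthly cost. It assumes the providers in the notes appear in the same order as the table rows (only Cerebras and DeepInfra are confirmed elsewhere on the page), and the names `PRICES`, `monthly_cost`, and `cheapest` are hypothetical.

```python
# (input $/M, output $/M) per provider. The mapping of the middle three rows
# to Groq, Fireworks AI, and Together AI is an assumption based on notes order.
PRICES = {
    "Cerebras": (0.040, 0.160),
    "Groq": (0.050, 0.100),
    "Fireworks AI": (0.070, 0.280),
    "Together AI": (0.100, 0.100),
    "DeepInfra": (0.120, 0.240),
}


def monthly_cost(provider: str, input_m: float, output_m: float) -> float:
    """Blended monthly cost in dollars; volumes are in millions of tokens."""
    in_price, out_price = PRICES[provider]
    return in_price * input_m + out_price * output_m


def cheapest(input_m: float, output_m: float) -> str:
    """Name of the provider with the lowest blended cost for this mix."""
    return min(PRICES, key=lambda p: monthly_cost(p, input_m, output_m))


print(cheapest(100, 0))    # input-only workload
print(cheapest(100, 100))  # equal input and output volume
```

Under these prices the input-only workload comes out cheapest on Cerebras, while the output-heavy one shifts to a provider with a lower output price, so the 67% figure is best read as an input-price comparison.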
Frequently Asked Questions
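Cerebras is the cheapest provider for gpt-oss-20b by input price, at $0.040/M input and $0.160/M output. That is 67% cheaper on input than DeepInfra ($0.120/M). Cerebras runs on a wafer-scale chip and is also listed as the fastest provider at 1200 tok/s.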