Why is the same model cheaper on some providers?

Hardware · Groq runs on its own LPUs; Cerebras runs on wafer-scale chips.
Batch efficiency · Fireworks and Together optimize their serverless stacks to batch requests efficiently.
Subsidies · some providers price below cost to gain distribution.
Region · US-only providers often undercut providers that serve multiple regions.

Is the output quality the same across providers?

For open-weight models · yes, the weights are identical. Where outputs do differ, the cause is usually quantization (some providers serve int8 or fp8), a lower context cap (some hosts serve a shorter window than the model supports), or safety filtering layers added by the host.
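The quantization and context-cap caveats suggest filtering providers before comparing prices. Below is a minimal sketch of that idea; the `Listing` fields, the provider names, and all numbers are made up for illustration and do not come from any real provider's catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    provider: str        # hypothetical provider name
    price_per_m: float   # USD per million tokens
    precision: str       # serving precision, e.g. "bf16", "fp8", "int8"
    max_context: int     # context window the host actually serves, in tokens


def cheapest_acceptable(listings, allowed_precisions, min_context):
    """Return the cheapest listing that meets both requirements, or None."""
    candidates = [
        item for item in listings
        if item.precision in allowed_precisions and item.max_context >= min_context
    ]
    return min(candidates, key=lambda item: item.price_per_m, default=None)


# Illustrative, made-up listings for a single model.
listings = [
    Listing("provider-a", 0.10, "int8", 32_000),
    Listing("provider-b", 0.18, "fp8", 128_000),
    Listing("provider-c", 0.25, "bf16", 128_000),
]

best = cheapest_acceptable(listings, {"bf16", "fp8"}, min_context=100_000)
print(best.provider if best else "no match")
```

With these made-up values the lowest headline price is excluded because it is served in int8 with a short context window, so the pick moves to the next cheapest listing that meets both requirements.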

7 models · 34 provider listings · Up to 10x price spread

AI Arbitrage · Same Model, 10x Price Spread

Every open-weight model hosted on multiple providers. Same weights, different prices. Pick the cheapest with zero quality trade-off.

Llama 4

405B · Llama Community License · 6 providers

Meta's flagship open-weight model · runs on six major inference providers with a price spread of roughly 7x.

Cheapest · Groq · $0.030/M tokens

Save vs most expensive · 86% ($0.22 → $0.030)

Qwen3.5 397B-A17B

397B (17B active) · Apache 2.0 · 5 providers

Alibaba's flagship MoE model · 397B total / 17B active · multilingual leader.

Cheapest · Alibaba Cloud · $0.280/M tokens

Save vs most expensive · 53% ($0.60 → $0.280)

DeepSeek V3.2

671B (37B active) · MIT · 5 providers

MoE reasoning model · rewrites the cost structure of frontier-tier inference.

Cheapest · DeepSeek (direct) · $0.140/M tokens

Save vs most expensive · 81% ($0.75 → $0.140)

gpt-oss-20b

20B · Apache 2.0 · 5 providers

OpenAI's open-weight release · Apache 2.0 · self-hostable on consumer GPUs.

Cheapest · Cerebras · $0.040/M tokens

Save vs most expensive · 67% ($0.12 → $0.040)

gpt-oss-120b

120B · Apache 2.0 · 5 providers

OpenAI's larger open release · reasoning-capable · still cheap on optimized stacks.

Cheapest · Groq · $0.150/M tokens

Save vs most expensive · 63% ($0.40 → $0.150)

Mistral Large 3

123B · Mistral Research License · 4 providers

Mistral's flagship · strong in code + multilingual · EU-hosted primary.

Cheapest · Groq · $1.80/M tokens

Save vs most expensive · 28% ($2.50 → $1.80)

Llama 4 Maverick

170B (17B active) · Llama Community License · 4 providers

Meta's multimodal Llama 4 variant · vision + text · long context.

Cheapest · Fireworks AI · $0.180/M tokens

Save vs most expensive · 55% ($0.40 → $0.180)
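The "Save vs most expensive" figures in the listings above follow from the two prices shown on each card. A short sketch of that arithmetic, using only the cheapest and most expensive prices listed on this page (the function names are illustrative):

```python
# (cheapest, most expensive) in USD per million tokens, copied from the cards above.
PRICES = {
    "Llama 4": (0.030, 0.22),
    "Qwen3.5 397B-A17B": (0.280, 0.60),
    "DeepSeek V3.2": (0.140, 0.75),
    "gpt-oss-20b": (0.040, 0.12),
    "gpt-oss-120b": (0.150, 0.40),
    "Mistral Large 3": (1.80, 2.50),
    "Llama 4 Maverick": (0.180, 0.40),
}


def savings_pct(cheapest, most_expensive):
    """Percentage saved by choosing the cheapest listing over the most expensive."""
    return (1 - cheapest / most_expensive) * 100


def spread_multiple(cheapest, most_expensive):
    """How many times more the most expensive listing costs than the cheapest."""
    return most_expensive / cheapest


for model, (low, high) in PRICES.items():
    print(
        f"{model}: save {savings_pct(low, high):.1f}%, "
        f"spread {spread_multiple(low, high):.1f}x"
    )
```

For Llama 4, for example, this works out to a saving of about 86% and a spread of about 7.3x between $0.22 and $0.030.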

Frequently Asked Questions

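AI arbitrage is the price gap that appears when the same open-weight model (Llama 4, DeepSeek V3.2, Qwen3.5, etc.) is hosted on multiple providers at different prices. Because the weights are identical, you can pick the cheapest provider without a quality trade-off, provided it serves the model at the same precision and context length. Spreads vary widely: in the listings above they range from about 1.4x (Mistral Large 3) to about 7x (Llama 4).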
