How cheap can reasoning get without killing quality?

Gemma 3 27B (free) at $0.00/M is the cheapest in our ranking. DeepSeek V3.2, Qwen3.5, and GLM-4.6 all deliver frontier-adjacent reasoning for a fraction of the price of Claude Opus or o3.

Why do reasoning tokens cost more than regular output?

Many reasoning models bill visible output tokens at one rate and "reasoning" (thinking) tokens at a premium. A GPQA problem may burn 20K to 100K thinking tokens per solve. Always check the reasoning token price line when you budget.
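
As a rough illustration of that budgeting step, here is a minimal Python sketch of per-solve cost. The function name, its parameters, and the token counts are hypothetical, and it assumes thinking tokens are billed at the output rate unless a separate reasoning price is given; check your provider's price sheet for the real rule.

```python
def cost_per_solve(input_tokens, output_tokens, reasoning_tokens,
                   in_price, out_price, reasoning_price=None):
    """Return the dollar cost of one solve. Prices are dollars per 1M tokens."""
    # Assumption: with no separate reasoning line, thinking tokens bill as output.
    if reasoning_price is None:
        reasoning_price = out_price
    total = (input_tokens * in_price
             + output_tokens * out_price
             + reasoning_tokens * reasoning_price)
    return total / 1_000_000


# Hypothetical workload: 2K prompt, 500 visible answer tokens, 50K thinking tokens,
# priced at $0.10 in / $0.40 out per 1M tokens.
cost = cost_per_solve(2_000, 500, 50_000, in_price=0.10, out_price=0.40)
print(f"${cost:.4f} per solve")
```

In this made-up case the thinking tokens account for about 98% of the bill, which is why the input price alone says little about what a reasoning workload costs.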

Should I use a reasoning model for every task?

No. Chain-of-thought blows up latency and cost. Use a regular chat model for extraction, formatting, and summarization. Reserve reasoning models for multi-step math, science benchmarks, research, planning, and code that requires proofs.

How do I compare reasoning costs fairly?

Hold prompt length constant and measure tokens per solve on a fixed GPQA subset. Cheap models often need 2x to 3x the thinking tokens, which can wipe out the per-token savings.
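
A small Python sketch of that comparison, under stated assumptions: the two models, their prices, token counts, and accuracies are invented for illustration, and input cost is left out because prompt length is held constant. Dividing by accuracy to get cost per correct solve is our addition, not something the ranking itself computes.

```python
from dataclasses import dataclass


@dataclass
class Run:
    name: str
    out_price: float        # dollars per 1M output/thinking tokens
    thinking_tokens: float  # mean thinking tokens per solve on the fixed subset
    accuracy: float         # fraction of the subset solved correctly


def cost_per_correct(run: Run) -> float:
    """Dollars of thinking tokens spent per correctly solved problem."""
    return run.thinking_tokens * run.out_price / 1_000_000 / run.accuracy


# Hypothetical numbers: the cheap model costs half as much per token
# but needs 3x the thinking tokens and solves fewer problems.
cheap = Run("cheap-model", out_price=0.40, thinking_tokens=90_000, accuracy=0.50)
pricey = Run("pricey-model", out_price=0.80, thinking_tokens=30_000, accuracy=0.75)

for run in (cheap, pricey):
    print(f"{run.name}: ${cost_per_correct(run):.3f} per correct solve")
```

With these invented figures the model with the higher per-token price comes out cheaper per correct solve, which is the effect to watch for when reading the table.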


Cheapest reasoning LLMs

The cheapest models that hold up on GPQA, AIME, MATH, MMLU, HLE. Ranked by price per 1M input tokens.

Models: 30

Cheapest: $0.00

Scope: GPQA · AIME · MATH


What this page is

This page ranks every model with credible reasoning scores (GPQA, AIME, MATH, MMLU, HLE, DROP, BBH) by input price. Reasoning models burn a lot of thinking tokens, so the headline input price is only part of the bill. The cheap end is dominated by open-source reasoners like DeepSeek, Qwen3, and GLM. Premium o-series and Claude Opus sit at the top of the price scale. Pair with our cost calculator to model real workloads.

Ranked by input price

Models with credible reasoning scores, cheapest first.

#	Model	Provider	In $/1M	Out $/1M	Context	Avg score	Type
1	Gemma 3 27B (free)	Google DeepMind	$0.00	$0.00	131K	42.2	OSS
2	gpt-oss-120b (free)	OpenAI	$0.00	$0.00	131K	68.7	OSS
3	gpt-oss-20b (free)	OpenAI	$0.00	$0.00	131K	66.4	OSS
4	Llama 3.2 3B Instruct (free)	Meta	$0.00	$0.00	131K	8.7	OSS
5	Llama 3.3 70B Instruct (free)	Meta	$0.00	$0.00	66K	29.1	OSS
6	Llama 3.1 8B Instruct	Meta	$0.02	$0.05	16K	27.4	OSS
7	Mistral Nemo	Mistral AI	$0.02	$0.04	131K	37.2	OSS
8	Llama 3.2 1B Instruct	Meta	$0.03	$0.20	60K	14.5	OSS
9	Gemma 2 9B	Google DeepMind	$0.03	$0.09	8K	36.0	OSS
10	gpt-oss-20b	OpenAI	$0.03	$0.14	131K	67.4	OSS
11	Llama 3 8B Instruct	Meta	$0.03	$0.04	8K	30.8	OSS
12	Qwen2.5 Coder 7B Instruct	Alibaba Qwen	$0.03	$0.09	33K	44.4	OSS
13	gpt-oss-120b	OpenAI	$0.04	$0.19	131K	46.9	OSS
14	Qwen2.5 7B Instruct	Alibaba Qwen	$0.04	$0.10	33K	35.2	OSS
15	GPT-5 Nano	OpenAI	$0.05	$0.40	400K	45.3	Closed
16	Qwen3 8B	Alibaba Qwen	$0.05	$0.40	41K	56.5	OSS
17	Llama 3.2 3B Instruct	Meta	$0.05	$0.34	80K	24.2	OSS
18	Phi 4	Microsoft	$0.07	$0.14	16K	43.2	OSS
19	ERNIE 4.5 21B A3B Thinking	Baidu	$0.07	$0.28	131K	57.0	OSS
20	Qwen3 235B A22B Instruct 2507	Alibaba Qwen	$0.07	$0.10	262K	48.5	OSS
21	Gemini 2.0 Flash Lite	Google DeepMind	$0.07	$0.30	1.0M	64.2	Closed
22	Gemma 3 27B	Google DeepMind	$0.08	$0.16	131K	42.2	OSS
23	Llama 4 Scout	Meta	$0.08	$0.30	328K	18.9	OSS
24	Qwen3 30B A3B Thinking 2507	Alibaba Qwen	$0.08	$0.40	131K	66.2	OSS
25	Qwen3 32B	Alibaba Qwen	$0.08	$0.24	41K	58.2	OSS
26	MiMo-V2-Flash	Xiaomi	$0.09	$0.29	262K	73.3	OSS
27	Qwen3 30B A3B Instruct 2507	Alibaba Qwen	$0.09	$0.30	262K	55.3	OSS
28	Qwen3 Next 80B A3B Instruct	Alibaba Qwen	$0.09	$1.10	262K	54.4	OSS
29	Qwen3 Next 80B A3B Thinking	Alibaba Qwen	$0.10	$0.78	131K	61.6	OSS
30	Gemini 2.0 Flash	Google DeepMind	$0.10	$0.40	1.0M	48.0	Closed
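
One way to read the table is to set a score floor and take the cheapest model that clears it. The Python sketch below does that over a handful of rows copied from the table. The helper name `cheapest_above` and the tie-break (higher score wins at equal price) are our own assumptions, not how the ranking is built.

```python
# (model, input $/1M, avg score) for a few rows copied from the table above.
ROWS = [
    ("gpt-oss-120b (free)", 0.00, 68.7),
    ("Llama 3.1 8B Instruct", 0.02, 27.4),
    ("gpt-oss-20b", 0.03, 67.4),
    ("Qwen3 8B", 0.05, 56.5),
    ("Gemini 2.0 Flash Lite", 0.07, 64.2),
    ("MiMo-V2-Flash", 0.09, 73.3),
]


def cheapest_above(rows, min_score, include_free=True):
    """Return the cheapest row with avg score >= min_score, or None if none qualify."""
    candidates = [
        r for r in rows
        if r[2] >= min_score and (include_free or r[1] > 0)
    ]
    if not candidates:
        return None
    # Sort key: lowest input price first, then highest score as the tie-break.
    return min(candidates, key=lambda r: (r[1], -r[2]))


print(cheapest_above(ROWS, 60))                      # free tier allowed
print(cheapest_above(ROWS, 60, include_free=False))  # paid endpoints only
print(cheapest_above(ROWS, 70))
```

The `include_free` switch is there because free endpoints often come with rate limits, so a paid fallback is worth knowing even when a free row wins on price.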

Top 3 cheapest reasoning LLMs

Gemma 3 27B (free) clears our reasoning filter (GPQA, AIME, MATH) at the lowest input price. Strong choice for bulk analytical work.

gpt-oss-120b (free) is also $0.00 and posts a higher average score than Gemma 3 27B (free) in the table above (68.7 vs 42.2). Good vendor diversification pick.

gpt-oss-20b (free) is also $0.00 and likewise scores above Gemma 3 27B (free) (66.4 vs 42.2). Good vendor diversification pick.

The price gap · cheapest vs most expensive

Cheapest

Gemma 3 27B (free)

$0.00/M

$ per 1M input tokens

Why the gap

Premium pricing pays for longer thinking budgets, better tool use, and vendor reliability. For many tasks, Gemma 3 27B (free) closes 70 to 90 percent of the GPQA gap at a fraction of the cost.

Most expensive

Gemini 2.0 Flash

$0.10/M

$ per 1M input tokens

Frequently asked questions

Models with explicit reasoning scores on GPQA Diamond, AIME 2024/2025, MATH-500, MMLU-Pro, HLE, DROP, BBH, or ARC-AGI. Reasoning models typically use extended chain-of-thought and burn more tokens on hard problems.
