Are sub-$1 models production-ready?

Yes. They power most high-volume LLM traffic in 2026 (content generation, classification, extraction). For latency-critical consumer apps and complex reasoning, pay more.

How do I pick the best sub-$1 model?

Start with the highest-ranked model in the ranking table (by benchmark score within budget), then test it on your actual prompts. Quality varies widely within the under-$1 tier.
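The "shortlist, then test on your own prompts" step might look like the sketch below. The `generate` callables are stand-ins for whatever client you use to call each model, and exact-match scoring is an assumption that only suits tasks with a single right answer (such as classification). Neither comes from this page.

```python
from typing import Callable

# Hypothetical labelled prompts from your own workload: (prompt, expected answer).
CASES = [
    ("Classify sentiment: 'I love it'", "positive"),
    ("Classify sentiment: 'Terrible support'", "negative"),
    ("Classify sentiment: 'It arrived on Tuesday'", "neutral"),
]

def exact_match_rate(generate: Callable[[str], str], cases) -> float:
    """Fraction of cases where the model's answer equals the expected label."""
    hits = sum(
        1 for prompt, expected in cases
        if generate(prompt).strip().lower() == expected
    )
    return hits / len(cases)

def rank_candidates(candidates: dict[str, Callable[[str], str]], cases):
    """Return (name, score) pairs, best score first."""
    scores = {name: exact_match_rate(fn, cases) for name, fn in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Stub "models" so the sketch runs offline; replace them with real API calls.
def stub_always_positive(prompt: str) -> str:
    return "positive"

def stub_keyword(prompt: str) -> str:
    text = prompt.lower()
    if "love" in text:
        return "Positive"
    if "terrible" in text:
        return "negative"
    return "neutral"

for name, score in rank_candidates(
    {"candidate-a": stub_always_positive, "candidate-b": stub_keyword}, CASES
):
    print(f"{name}: {score:.2f}")
```

A few dozen representative prompts are usually enough to show whether the public benchmark ordering holds for your task. For open-ended output, swap exact match for a scoring method that fits.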

Can I run a chatbot on a sub-$1 model?

Yes. Most consumer chatbot traffic runs on small, cheap models. Escalate to a premium model only when the conversation clearly requires it (complex code, long reasoning). See the chatbot stack.
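One way to express "escalate only when the conversation clearly requires it" is a small router in front of the two models. The sketch below is illustrative only: the model names, the keyword hints, and the length threshold are all hypothetical heuristics, not something this page prescribes.

```python
CHEAP_MODEL = "cheap-model"      # placeholder for a sub-$1 model
PREMIUM_MODEL = "premium-model"  # placeholder for a higher-priced model

# Assumed values; tune them against your own traffic.
MAX_CHEAP_CHARS = 4000
ESCALATION_HINTS = ("traceback", "refactor", "prove", "step by step")

def pick_model(user_message: str) -> str:
    """Route to the premium model only when the message looks demanding."""
    text = user_message.lower()
    is_long = len(text) > MAX_CHEAP_CHARS
    looks_hard = any(hint in text for hint in ESCALATION_HINTS)
    return PREMIUM_MODEL if (is_long or looks_hard) else CHEAP_MODEL

print(pick_model("What is your return policy?"))
print(pick_model("Please refactor this module and explain it step by step."))
```

Keyword rules like these are crude. They are here only to show where the routing decision sits. A production router might use a classifier or let the cheap model flag requests it cannot handle.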

Output prices still matter, right?

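Yes. Many models with sub-$1 input prices bill output at 3 to 5 times the input rate, and some in the ranking table go higher: Palmyra X5 charges 10 times its input rate ($0.60 in, $6.00 out). Always check the output column. For chatty workloads, output cost dominates.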
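A quick calculation shows how output pricing can dominate. The sketch below uses the Qwen3.5 397B A17B prices from the ranking table ($0.39 in, $2.34 out per 1M tokens). The token counts describe a hypothetical chatty request and are not measured data.

```python
def request_cost(in_price: float, out_price: float,
                 in_tokens: int, out_tokens: int) -> tuple[float, float]:
    """Return (input_cost, output_cost) in USD; prices are $ per 1M tokens."""
    return (in_price * in_tokens / 1_000_000,
            out_price * out_tokens / 1_000_000)

# Assumed request shape: short prompt, long reply.
in_cost, out_cost = request_cost(0.39, 2.34, in_tokens=500, out_tokens=1500)
output_share = out_cost / (in_cost + out_cost)

print(f"input ${in_cost:.6f}, output ${out_cost:.6f}, "
      f"output share {output_share:.0%}")
```

With these assumed token counts, output accounts for about 95% of the per-request cost, so the input price alone says little about the bill.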

LLMs under $1 per million tokens

Every model priced below $1 per 1M input tokens. Ranked by benchmark score within budget.

Models: 40

Top quality: 78.4

Max price: $1.00/M

What this page is

This page lists every model with input priced below $1 per million tokens, ranked by benchmark quality. This is the volume tier: most production traffic for chat, classification, extraction, and bulk content runs on sub-$1 models. Scores within this budget range from 53.2 to 78.4, so the ranking matters.

Ranked by quality within budget

Input under $1/M, highest benchmark score first.

#	Model	Provider	In $/1M	Out $/1M	Context	Avg score	Type
1	Qwen3.5 397B A17B	Alibaba Qwen	$0.39	$2.34	262K	78.4	OSS
2	DeepSeek V3.2 Speciale	DeepSeek	$0.40	$1.20	164K	78.2	OSS
3	Step 3.5 Flash	stepfun	$0.10	$0.30	262K	76.9	OSS
4	MiMo-V2-Flash	xiaomi	$0.09	$0.29	262K	73.3	OSS
5	Qwen3.6 Plus	Alibaba Qwen	$0.33	$1.95	1.0M	70.9	OSS
6	Palmyra X5	writer	$0.60	$6.00	1.0M	69.7	Closed
7	MiniMax M2	minimax	$0.26	$1.00	197K	69.5	OSS
8	GLM 4.5	z-ai	$0.60	$2.20	131K	69.2	OSS
9	gpt-oss-120b (free)	OpenAI	$0.00	$0.00	131K	68.7	OSS
10	gpt-oss-20b	OpenAI	$0.03	$0.14	131K	67.4	OSS
11	gpt-oss-20b (free)	OpenAI	$0.00	$0.00	131K	66.4	OSS
12	Qwen3 30B A3B Thinking 2507	Alibaba Qwen	$0.08	$0.40	131K	66.2	OSS
13	Grok 3 Mini Beta	xAI	$0.30	$0.50	131K	64.8	Closed
14	Gemini 2.0 Flash Lite	Google DeepMind	$0.07	$0.30	1.0M	64.2	Closed
15	MiniMax M2.7	minimax	$0.30	$1.20	197K	63.5	OSS
16	Gemma 4 31B	Google DeepMind	$0.13	$0.38	262K	61.6	OSS
17	Qwen3 Next 80B A3B Thinking	Alibaba Qwen	$0.10	$0.78	131K	61.6	OSS
18	GPT-5.1-Codex-Mini	OpenAI	$0.25	$2.00	400K	60.4	Closed
19	LongCat Flash Chat	meituan	$0.20	$0.80	131K	59.8	OSS
20	Gemini 2.5 Flash Lite	Google DeepMind	$0.10	$0.40	1.0M	59.1	Closed
21	DeepSeek V3	DeepSeek	$0.32	$0.89	164K	59.0	OSS
22	Qwen3 Max	Alibaba Qwen	$0.78	$3.90	262K	58.3	OSS
23	Qwen3 32B	Alibaba Qwen	$0.08	$0.24	41K	58.2	OSS
24	R1 0528	DeepSeek	$0.50	$2.15	164K	57.9	OSS
25	Mixtral 8x7B Instruct	Mistral AI	$0.54	$0.54	33K	57.8	OSS
26	GLM 5	z-ai	$0.60	$1.92	203K	57.6	OSS
27	ERNIE 4.5 21B A3B Thinking	baidu	$0.07	$0.28	131K	57.0	OSS
28	Qwen3 8B	Alibaba Qwen	$0.05	$0.40	41K	56.5	OSS
29	Qwen3 235B A22B	Alibaba Qwen	$0.46	$1.82	131K	56.4	OSS
30	Kimi K2 0711	moonshotai	$0.57	$2.30	131K	56.2	OSS
31	GPT-5 Mini	OpenAI	$0.25	$2.00	400K	56.0	Closed
32	Qwen3 235B A22B Thinking 2507	Alibaba Qwen	$0.15	$1.50	131K	55.9	OSS
33	Mistral Small 3.1 24B	Mistral AI	$0.35	$0.56	128K	55.8	OSS
34	Qwen3 30B A3B Instruct 2507	Alibaba Qwen	$0.09	$0.30	262K	55.3	OSS
35	DeepSeek V3 0324	DeepSeek	$0.20	$0.77	164K	55.1	OSS
36	MiniMax M2.5	minimax	$0.15	$1.15	197K	55.1	OSS
37	Qwen3 Next 80B A3B Instruct	Alibaba Qwen	$0.09	$1.10	262K	54.4	OSS
38	Kimi K2 Thinking	moonshotai	$0.60	$2.50	262K	53.3	OSS
39	DeepSeek V3.2 Exp	DeepSeek	$0.27	$0.41	164K	53.2	OSS
40	Qwen2.5 72B Instruct	Alibaba Qwen	$0.36	$0.40	33K	53.2	OSS
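The ranking rule stated for this table (input price under $1/M, highest benchmark score first) can be sketched in a few lines. The `Model` class and the function name are illustrative. The first three sample rows are copied from the table, and the fourth is a made-up over-budget entry included only to show the filter working.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Model:
    name: str
    input_price: float   # $ per 1M input tokens
    output_price: float  # $ per 1M output tokens
    avg_score: float

def rank_within_budget(models, max_input_price: float = 1.00):
    """Keep models priced below the cap, then sort by score, best first."""
    affordable = [m for m in models if m.input_price < max_input_price]
    return sorted(affordable, key=lambda m: m.avg_score, reverse=True)

SAMPLE = [
    Model("Step 3.5 Flash", 0.10, 0.30, 76.9),
    Model("Qwen3 Max", 0.78, 3.90, 58.3),
    Model("Qwen3.5 397B A17B", 0.39, 2.34, 78.4),
    Model("hypothetical-premium", 3.00, 15.00, 90.0),  # made-up row, over budget
]

for rank, m in enumerate(rank_within_budget(SAMPLE), start=1):
    print(rank, m.name, m.avg_score)
```

Only the input price is filtered, so a model with a low input rate and a high output rate still qualifies. This is why the output column deserves a separate look.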

Top 3 best-quality sub-$1 models

Best quality under $1

Qwen3.5 397B A17B

Qwen3.5 397B A17B is the highest-scoring model with input priced below $1/M (78.4), at $0.39 in and $2.34 out per 1M tokens. Excellent for bulk, high-volume workloads.

Runner up

DeepSeek V3.2 Speciale

DeepSeek V3.2 Speciale is the second-highest-scoring model in this tier (78.2), at $0.40 in and $1.20 out per 1M tokens, about half the output price of the top model.

Third place

Step 3.5 Flash

Step 3.5 Flash is the third-highest-scoring model in this tier (76.9), at $0.10 in and $0.30 out per 1M tokens, by far the cheapest of the three.

The price gap · cheapest vs most expensive

Cheapest (paid)

gpt-oss-20b

$0.03/M

$ per 1M input tokens

Why the gap

At the sub-$1 tier, the "most expensive" is still cheap. The tradeoff is raw benchmark quality vs provider reliability. Open-source models dominate the cheap end.

Most expensive

Qwen3 Max

$0.78/M

$ per 1M input tokens

Frequently asked questions

The sub-$1 tier covers most open-source models (DeepSeek V3, Qwen3.5, GLM-4.6, Llama 3.3) plus small proprietary models (Gemini Flash, GPT-4o-mini, Claude Haiku) served by various providers. The cheap end clusters around $0.05 to $0.30 per million tokens.
