
Cheapest 200K context LLMs

Every LLM with a 200,000+ token context window. Ranked by input price per 1M tokens.

Models: 40
Cheapest: $0.00 (free-tier models)
Min context: 200K tokens
What this page is
200K context is the modern default for frontier models. This page lists every priced 200K+ model, cheapest first. For long docs that fit in 200K, this tier offers the best balance of price, recall, and model diversity.

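The ranking below is by input price per 1M tokens, so the per-call cost of a long-document prompt is straightforward to estimate. A minimal sketch (the `input_cost_usd` helper is hypothetical, not part of this page; prices come from the "In $/1M" column):

```python
def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """USD cost of `tokens` input tokens at `price_per_million` $/1M."""
    return tokens / 1_000_000 * price_per_million

# A prompt that fills the whole 200K window, at GPT-5 Nano's listed
# $0.05/1M input rate versus Llama 4 Maverick's $0.15/1M:
print(f"${input_cost_usd(200_000, 0.05):.4f}")  # → $0.0100
print(f"${input_cost_usd(200_000, 0.15):.4f}")  # → $0.0300
```

Note this covers input only; output tokens bill at the separate "Out $/1M" rate.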
200K+ context models, cheapest first.

| # | Vendor | Model | In $/1M | Out $/1M | Type |
|---|--------|-------|---------|----------|------|
| 1 | OpenRouter | Auto Router | — | — | Closed |
| 2 | OpenRouter | Elephant | $0.00 | $0.00 | Closed |
| 3 | OpenRouter | Free Models Router | $0.00 | $0.00 | Closed |
| 4 | Google DeepMind | Gemma 4 26B A4B (free) | $0.00 | $0.00 | OSS |
| 5 | Google DeepMind | Gemma 4 31B (free) | $0.00 | $0.00 | OSS |
| 6 | Google DeepMind | Lyria 3 Clip Preview | $0.00 | $0.00 | Closed |
| 7 | Google DeepMind | Lyria 3 Pro Preview | $0.00 | $0.00 | Closed |
| 8 | NVIDIA | Nemotron 3 Nano 30B A3B (free) | $0.00 | $0.00 | OSS |
| 9 | NVIDIA | Nemotron 3 Super (free) | $0.00 | $0.00 | OSS |
| 10 | Alibaba Qwen | Qwen3 Coder 480B A35B (free) | $0.00 | $0.00 | OSS |
| 11 | Alibaba Qwen | Qwen3 Next 80B A3B Instruct (free) | $0.00 | $0.00 | OSS |
| 12 | Alibaba Qwen | Qwen3.6 Plus (free) | $0.00 | $0.00 | Closed |
| 13 | Alibaba Qwen | Qwen3.6 Plus Preview (free) | $0.00 | $0.00 | OSS |
| 14 | stepfun | Step 3.5 Flash (free) | $0.00 | $0.00 | OSS |
| 15 | OpenAI | GPT-5 Nano | $0.05 | $0.40 | Closed |
| 16 | NVIDIA | Nemotron 3 Nano 30B A3B | $0.05 | $0.20 | OSS |
| 17 | Alibaba Qwen | Qwen3.5-9B | $0.05 | $0.15 | OSS |
| 18 | z-ai | GLM 4.7 Flash | $0.06 | $0.40 | OSS |
| 19 | Amazon | Nova Lite 1.0 | $0.06 | $0.24 | Closed |
| 20 | Alibaba Qwen | Qwen3.5-Flash | $0.07 | $0.26 | OSS |
| 21 | Alibaba Qwen | Qwen3 235B A22B Instruct 2507 | $0.07 | $0.10 | OSS |
| 22 | Google DeepMind | Gemini 2.0 Flash Lite | $0.07 | $0.30 | Closed |
| 23 | ByteDance | Seed 1.6 Flash | $0.07 | $0.30 | Closed |
| 24 | Google DeepMind | Gemma 4 26B A4B | $0.08 | $0.35 | OSS |
| 25 | Meta | Llama 4 Scout | $0.08 | $0.30 | OSS |
| 26 | xiaomi | MiMo-V2-Flash | $0.09 | $0.29 | OSS |
| 27 | Alibaba Qwen | Qwen3 30B A3B Instruct 2507 | $0.09 | $0.30 | OSS |
| 28 | Alibaba Qwen | Qwen3 Next 80B A3B Instruct | $0.09 | $1.10 | OSS |
| 29 | Google DeepMind | Gemini 2.0 Flash | $0.10 | $0.40 | Closed |
| 30 | Google DeepMind | Gemini 2.5 Flash Lite | $0.10 | $0.40 | Closed |
| 31 | Google DeepMind | Gemini 2.5 Flash Lite Preview 09-2025 | $0.10 | $0.40 | Closed |
| 32 | OpenAI | GPT-4.1 Nano | $0.10 | $0.40 | Closed |
| 33 | NVIDIA | Nemotron 3 Super | $0.10 | $0.50 | OSS |
| 34 | ByteDance | Seed-2.0-Mini | $0.10 | $0.40 | Closed |
| 35 | stepfun | Step 3.5 Flash | $0.10 | $0.30 | OSS |
| 36 | Google DeepMind | Gemma 4 31B | $0.13 | $0.38 | OSS |
| 37 | Meta | Llama 4 Maverick | $0.15 | $0.60 | OSS |
| 38 | Mistral AI | Ministral 3 8B 2512 | $0.15 | $0.15 | OSS |
| 39 | Mistral AI | Mistral Small 4 | $0.15 | $0.60 | OSS |
| 40 | Alibaba Qwen | Qwen3 Coder Next | $0.15 | $0.80 | OSS |

(— = no fixed price; the Auto Router bills at the rate of whichever model it routes to.)
Cheapest
$0.00/M (several free-tier models)
$ per 1M input tokens
Why the gap

At the 200K tier, premium pricing mostly buys better recall deep into the context window and stronger overall reasoning. For retrieval-heavy RAG workloads, cheap models often match premium ones.

Most expensive
Llama 4 Maverick
$0.15/M
$ per 1M input tokens
Is 200K context enough?
Yes. 200K handles a full novel, a medium-sized codebase, or 100+ pages of PDFs. Reach for a 1M-token model only when you truly need whole-repo or multi-book context.
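The fit claim above can be sanity-checked from word counts. A rough sketch, assuming the common ~1.3 tokens-per-English-word heuristic (actual ratios vary by tokenizer and language; `fits_in_context` is an illustrative helper, not an API from this page):

```python
TOKENS_PER_WORD = 1.3  # assumed heuristic for English prose

def fits_in_context(word_count: int, context_tokens: int = 200_000) -> bool:
    """Rough check: does a document of `word_count` words fit the window?"""
    return word_count * TOKENS_PER_WORD <= context_tokens

print(fits_in_context(120_000))  # typical full-length novel → True
print(fits_in_context(200_000))  # very long epic → False, needs a 1M model
```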