Cheapest 32K context LLMs

Every LLM with at least 32K token context. Ranked by input price per 1M tokens.

Models: 50
Cheapest: $-1000000.00
Min context: 32K tokens
What this page is
A 32K-token context window is the practical floor for modern LLM use. This tier covers the broadest set of priced models, including the lowest-priced ones. It suits chat, short-document RAG, classification, and any workload that does not need a very large window.

32K+ context models, cheapest first.

| # | Provider | Model | In $/1M | Out $/1M | Type |
|---|----------|-------|---------|----------|------|
| 1 | openrouter | Auto Router | $-1000000.00 | $-1000000.00 | Closed |
| 2 | openrouter | Body Builder (beta) | $-1000000.00 | $-1000000.00 | Closed |
| 3 | openrouter | Elephant | $0.00 | $0.00 | Closed |
| 4 | openrouter | Free Models Router | $0.00 | $0.00 | Closed |
| 5 | Google DeepMind | Gemma 3 12B (free) | $0.00 | $0.00 | OSS |
| 6 | Google DeepMind | Gemma 3 27B (free) | $0.00 | $0.00 | OSS |
| 7 | Google DeepMind | Gemma 3 4B (free) | $0.00 | $0.00 | OSS |
| 8 | Google DeepMind | Gemma 4 26B A4B (free) | $0.00 | $0.00 | OSS |
| 9 | Google DeepMind | Gemma 4 31B (free) | $0.00 | $0.00 | OSS |
| 10 | z-ai | GLM 4.5 Air (free) | $0.00 | $0.00 | OSS |
| 11 | OpenAI | gpt-oss-120b (free) | $0.00 | $0.00 | OSS |
| 12 | OpenAI | gpt-oss-20b (free) | $0.00 | $0.00 | OSS |
| 13 | nousresearch | Hermes 3 405B Instruct (free) | $0.00 | $0.00 | OSS |
| 14 | liquid | LFM2.5-1.2B-Instruct (free) | $0.00 | $0.00 | OSS |
| 15 | liquid | LFM2.5-1.2B-Thinking (free) | $0.00 | $0.00 | OSS |
| 16 | Meta | Llama 3.2 3B Instruct (free) | $0.00 | $0.00 | OSS |
| 17 | Meta | Llama 3.3 70B Instruct (free) | $0.00 | $0.00 | OSS |
| 18 | Google DeepMind | Lyria 3 Clip Preview | $0.00 | $0.00 | Closed |
| 19 | Google DeepMind | Lyria 3 Pro Preview | $0.00 | $0.00 | Closed |
| 20 | minimax | MiniMax M2.5 (free) | $0.00 | $0.00 | OSS |
| 21 | Mistral AI | Mistral Small 3.1 24B (free) | $0.00 | $0.00 | OSS |
| 22 | NVIDIA | Nemotron 3 Nano 30B A3B (free) | $0.00 | $0.00 | OSS |
| 23 | NVIDIA | Nemotron 3 Super (free) | $0.00 | $0.00 | OSS |
| 24 | NVIDIA | Nemotron Nano 12B 2 VL (free) | $0.00 | $0.00 | OSS |
| 25 | NVIDIA | Nemotron Nano 9B V2 (free) | $0.00 | $0.00 | OSS |
| 26 | Alibaba Qwen | Qwen3 4B (free) | $0.00 | $0.00 | OSS |
| 27 | Alibaba Qwen | Qwen3 Coder 480B A35B (free) | $0.00 | $0.00 | OSS |
| 28 | Alibaba Qwen | Qwen3 Next 80B A3B Instruct (free) | $0.00 | $0.00 | OSS |
| 29 | Alibaba Qwen | Qwen3.6 Plus (free) | $0.00 | $0.00 | Closed |
| 30 | Alibaba Qwen | Qwen3.6 Plus Preview (free) | $0.00 | $0.00 | OSS |
| 31 | stepfun | Step 3.5 Flash (free) | $0.00 | $0.00 | OSS |
| 32 | arcee-ai | Trinity Large Preview (free) | $0.00 | $0.00 | OSS |
| 33 | arcee-ai | Trinity Mini (free) | $0.00 | $0.00 | OSS |
| 34 | cognitivecomputations | Uncensored (free) | $0.00 | $0.00 | OSS |
| 35 | liquid | LFM2-2.6B | $0.01 | $0.02 | OSS |
| 36 | liquid | LFM2-8B-A1B | $0.01 | $0.02 | OSS |
| 37 | ibm-granite | Granite 4.0 Micro | $0.02 | $0.11 | OSS |
| 38 | Google DeepMind | Gemma 3n 4B | $0.02 | $0.04 | OSS |
| 39 | Mistral AI | Mistral Nemo | $0.02 | $0.04 | OSS |
| 40 | Meta | Llama 3.2 1B Instruct | $0.03 | $0.20 | OSS |
| 41 | OpenAI | gpt-oss-20b | $0.03 | $0.14 | OSS |
| 42 | liquid | LFM2-24B-A2B | $0.03 | $0.12 | OSS |
| 43 | Alibaba Qwen | Qwen2.5 Coder 7B Instruct | $0.03 | $0.09 | OSS |
| 44 | Alibaba Qwen | Qwen-Turbo | $0.03 | $0.13 | OSS |
| 45 | Amazon | Nova Micro 1.0 | $0.04 | $0.14 | Closed |
| 46 | Cohere | Command R7B (12-2024) | $0.04 | $0.15 | Closed |
| 47 | OpenAI | gpt-oss-120b | $0.04 | $0.19 | OSS |
| 48 | Google DeepMind | Gemma 3 12B | $0.04 | $0.13 | OSS |
| 49 | Google DeepMind | Gemma 3 4B | $0.04 | $0.08 | OSS |
| 50 | NVIDIA | Nemotron Nano 9B V2 | $0.04 | $0.16 | OSS |
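
The ranking rule used for this list (keep models with at least 32K context, then sort by input price) can be sketched in a few lines. This is a minimal illustration, not the page's actual implementation. The `Model` record, its field names, and the context sizes below are assumptions made up for the example; only the prices are taken from the list. It also assumes that a negative price such as $-1000000.00 is a placeholder rather than a real rate, which the page does not state, so those rows are set aside instead of being ranked first.

```python
from dataclasses import dataclass

MIN_CONTEXT = 32_000  # assumption: "32K" is read as 32,000 tokens


@dataclass
class Model:
    name: str
    context: int      # context window in tokens (illustrative values only)
    in_price: float   # USD per 1M input tokens
    out_price: float  # USD per 1M output tokens


def rank_cheapest(models, min_context=MIN_CONTEXT):
    """Return (ranked, placeholders).

    ranked: models meeting the context floor, sorted by input price.
    placeholders: eligible models with a negative price, assumed here
    to be sentinel values rather than real rates.
    """
    eligible = [m for m in models if m.context >= min_context]
    placeholders = [m for m in eligible if m.in_price < 0]
    priced = [m for m in eligible if m.in_price >= 0]
    # Ties on price are broken by name so the order is deterministic.
    ranked = sorted(priced, key=lambda m: (m.in_price, m.name.lower()))
    return ranked, placeholders


models = [
    Model("Auto Router", 2_000_000, -1_000_000.00, -1_000_000.00),
    Model("Gemma 3 12B", 131_072, 0.04, 0.13),
    Model("LFM2-2.6B", 32_768, 0.01, 0.02),
    Model("Tiny 8K model (hypothetical)", 8_192, 0.005, 0.01),
]
ranked, placeholders = rank_cheapest(models)
```

With this sample data, the hypothetical 8K model is dropped by the context filter, Auto Router lands in `placeholders`, and the remaining two models are returned cheapest first.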
Cheapest: Auto Router, $-1000000.00 per 1M input tokens
Why the gap

At this tier, price reflects raw model quality more than context size. The cheapest 32K model is often a small open-source model; the most expensive is usually a frontier model that is deliberately priced at the same rate across context window sizes.

Most expensive (of the 50 listed): Gemma 3 12B, $0.04 per 1M input tokens
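
Since prices here are quoted per 1M tokens, the cost of a single request is each token count divided by one million, multiplied by the matching rate. A small sketch of that arithmetic follows. The function name and the token counts are hypothetical; the rates are the $0.04 input and $0.19 output prices listed for gpt-oss-120b.

```python
def request_cost(input_tokens, output_tokens, in_price_per_m, out_price_per_m):
    """USD cost of one request, given prices quoted per 1M tokens."""
    input_cost = (input_tokens / 1_000_000) * in_price_per_m
    output_cost = (output_tokens / 1_000_000) * out_price_per_m
    return input_cost + output_cost


# Hypothetical request: 12,000 input tokens and 500 output tokens.
cost = request_cost(12_000, 500, in_price_per_m=0.04, out_price_per_m=0.19)
print(f"${cost:.6f}")
```

At these rates the request costs roughly $0.000575, most of it from input tokens, which is why this page ranks by input price.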
For most applications, 32K is enough. Chat histories, RAG retrievals, and single-document QA rarely exceed 16K tokens, so 32K leaves plenty of headroom.
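
The headroom claim can be checked for a given workload by adding up the parts of a prompt and subtracting them from the window. The sketch below is illustrative only: the function, its parameter names, and the token counts are assumptions, and "32K" is taken as 32,000 tokens (some models define it as 32,768).

```python
CONTEXT_WINDOW = 32_000  # assumption: "32K" is read as 32,000 tokens


def headroom(system_tokens, history_tokens, retrieved_tokens, reserved_output,
             window=CONTEXT_WINDOW):
    """Tokens left in the window after the prompt parts and the space
    reserved for the model's reply are accounted for."""
    used = system_tokens + history_tokens + retrieved_tokens + reserved_output
    return window - used


# Hypothetical RAG chat turn: system prompt, prior turns, retrieved passages,
# and space reserved for the answer.
left = headroom(system_tokens=500, history_tokens=6_000,
                retrieved_tokens=8_000, reserved_output=1_500)
print(left)  # 16000
```

A negative result would mean the prompt has to be trimmed, for example by dropping older chat turns or retrieving fewer passages.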