
Cheapest 32K context LLMs

Every LLM with a context window of at least 32K tokens, ranked by input price per 1M tokens.

Models: 50
Cheapest: $-1000000.00
Min context: 32K tokens
What this page is
A 32K-token context window is the practical floor for modern LLM use. This tier covers the broadest set of priced models, many of them free or priced at a few cents per 1M tokens. It suits chat, short-document RAG, classification, and any workload that does not need a very large window.

32K+ context models, cheapest first.

| # | Provider | Model | In $/1M | Out $/1M | Type |
|---|----------|-------|---------|----------|------|
| 1 | openrouter | Auto Router | $-1000000.00 | $-1000000.00 | Closed |
| 2 | openrouter | Body Builder (beta) | $-1000000.00 | $-1000000.00 | Closed |
| 3 | openrouter | Pareto Code Router | $-1000000.00 | $-1000000.00 | Closed |
| 4 | openrouter | Elephant | $0.00 | $0.00 | Closed |
| 5 | openrouter | Free Models Router | $0.00 | $0.00 | Closed |
| 6 | Google DeepMind | Gemma 3 12B (free) | $0.00 | $0.00 | OSS |
| 7 | Google DeepMind | Gemma 3 27B (free) | $0.00 | $0.00 | OSS |
| 8 | Google DeepMind | Gemma 3 4B (free) | $0.00 | $0.00 | OSS |
| 9 | Google DeepMind | Gemma 4 26B A4B (free) | $0.00 | $0.00 | OSS |
| 10 | Google DeepMind | Gemma 4 31B (free) | $0.00 | $0.00 | OSS |
| 11 | z-ai | GLM 4.5 Air (free) | $0.00 | $0.00 | OSS |
| 12 | OpenAI | gpt-oss-120b (free) | $0.00 | $0.00 | OSS |
| 13 | OpenAI | gpt-oss-20b (free) | $0.00 | $0.00 | OSS |
| 14 | nousresearch | Hermes 3 405B Instruct (free) | $0.00 | $0.00 | OSS |
| 15 | tencent | Hy3 preview (free) | $0.00 | $0.00 | Closed |
| 16 | liquid | LFM2.5-1.2B-Instruct (free) | $0.00 | $0.00 | OSS |
| 17 | liquid | LFM2.5-1.2B-Thinking (free) | $0.00 | $0.00 | OSS |
| 18 | Meta | Llama 3.2 3B Instruct (free) | $0.00 | $0.00 | OSS |
| 19 | Meta | Llama 3.3 70B Instruct (free) | $0.00 | $0.00 | OSS |
| 20 | Meta | Llama Guard 4 12B (free) | $0.00 | $0.00 | Closed |
| 21 | Google DeepMind | Lyria 3 Clip Preview | $0.00 | $0.00 | Closed |
| 22 | Google DeepMind | Lyria 3 Pro Preview | $0.00 | $0.00 | Closed |
| 23 | minimax | MiniMax M2.5 (free) | $0.00 | $0.00 | OSS |
| 24 | Mistral AI | Mistral Small 3.1 24B (free) | $0.00 | $0.00 | OSS |
| 25 | NVIDIA | Nemotron 3 Nano 30B A3B (free) | $0.00 | $0.00 | OSS |
| 26 | NVIDIA | Nemotron 3 Nano Omni (free) | $0.00 | $0.00 | Closed |
| 27 | NVIDIA | Nemotron 3 Super (free) | $0.00 | $0.00 | OSS |
| 28 | NVIDIA | Nemotron Nano 12B 2 VL (free) | $0.00 | $0.00 | OSS |
| 29 | NVIDIA | Nemotron Nano 9B V2 (free) | $0.00 | $0.00 | OSS |
| 30 | openrouter | Owl Alpha | $0.00 | $0.00 | Closed |
| 31 | baidu | Qianfan-OCR-Fast (free) | $0.00 | $0.00 | Closed |
| 32 | Alibaba Qwen | Qwen3 4B (free) | $0.00 | $0.00 | OSS |
| 33 | Alibaba Qwen | Qwen3 Coder 480B A35B (free) | $0.00 | $0.00 | OSS |
| 34 | Alibaba Qwen | Qwen3 Next 80B A3B Instruct (free) | $0.00 | $0.00 | OSS |
| 35 | Alibaba Qwen | Qwen3.6 Plus (free) | $0.00 | $0.00 | Closed |
| 36 | Alibaba Qwen | Qwen3.6 Plus Preview (free) | $0.00 | $0.00 | OSS |
| 37 | stepfun | Step 3.5 Flash (free) | $0.00 | $0.00 | OSS |
| 38 | arcee-ai | Trinity Large Preview (free) | $0.00 | $0.00 | OSS |
| 39 | arcee-ai | Trinity Mini (free) | $0.00 | $0.00 | OSS |
| 40 | cognitivecomputations | Uncensored (free) | $0.00 | $0.00 | OSS |
| 41 | liquid | LFM2-2.6B | $0.01 | $0.02 | OSS |
| 42 | liquid | LFM2-8B-A1B | $0.01 | $0.02 | OSS |
| 43 | ibm-granite | Granite 4.0 Micro | $0.02 | $0.11 | OSS |
| 44 | Mistral AI | Mistral Nemo | $0.02 | $0.03 | OSS |
| 45 | Meta | Llama 3.2 1B Instruct | $0.03 | $0.20 | OSS |
| 46 | OpenAI | gpt-oss-20b | $0.03 | $0.14 | OSS |
| 47 | liquid | LFM2-24B-A2B | $0.03 | $0.12 | OSS |
| 48 | Alibaba Qwen | Qwen2.5 Coder 7B Instruct | $0.03 | $0.09 | OSS |
| 49 | Alibaba Qwen | Qwen-Turbo | $0.03 | $0.13 | OSS |
| 50 | Amazon | Nova Micro 1.0 | $0.04 | $0.14 | Closed |
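
Because the table quotes prices per 1M tokens, the cost of a single request follows from simple arithmetic. The sketch below is a minimal illustration rather than any provider's billing logic. The function name `request_cost_usd` and the token counts are hypothetical; the prices are taken from the gpt-oss-20b row (#46).

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     in_price_per_m: float, out_price_per_m: float) -> float:
    """Estimate the cost in USD of one request from per-1M-token prices."""
    total = input_tokens * in_price_per_m + output_tokens * out_price_per_m
    return total / 1_000_000


# Prices from row 46 (gpt-oss-20b): $0.03 in, $0.14 out per 1M tokens.
# The token counts are made up for illustration.
cost = request_cost_usd(input_tokens=20_000, output_tokens=1_000,
                        in_price_per_m=0.03, out_price_per_m=0.14)
print(f"${cost:.6f}")  # prints $0.000740
```

Real bills may differ because of caching discounts, minimum charges, or provider-specific token counting, none of which this sketch models.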
Cheapest: Auto Router, $-1000000.00 per 1M input tokens
Why the gap

At this tier, price reflects raw model quality more than context size. The cheapest 32K model is often a small open-source model; the most expensive is usually a frontier model deliberately priced the same regardless of context window.

Most expensive: Nova Micro 1.0, $0.04 per 1M input tokens
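
The "cheapest first" ordering, and the cheapest and most expensive picks, can be reproduced with a plain sort on input price. The sketch below uses a handful of rows copied from the table. Treating negative prices as placeholders rather than real prices is an assumption on our part that the page does not state, so it is exposed as the hypothetical `skip_negative` flag.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    in_price: float   # USD per 1M input tokens
    out_price: float  # USD per 1M output tokens


# A few rows copied from the table; not the full list.
MODELS = [
    Model("Nova Micro 1.0", 0.04, 0.14),
    Model("Auto Router", -1_000_000.00, -1_000_000.00),
    Model("Mistral Nemo", 0.02, 0.03),
    Model("Gemma 3 12B (free)", 0.00, 0.00),
    Model("LFM2-2.6B", 0.01, 0.02),
]


def rank_by_input_price(models, skip_negative=True):
    """Sort cheapest first, breaking ties by name.

    Assumption: a negative price is a placeholder, not a real price,
    so such rows are dropped unless skip_negative is False.
    """
    rows = [m for m in models if not (skip_negative and m.in_price < 0)]
    return sorted(rows, key=lambda m: (m.in_price, m.name))


ranked = rank_by_input_price(MODELS)
print("cheapest:", ranked[0].name)
print("most expensive:", ranked[-1].name)
```

With `skip_negative=False`, the sort matches the page as shown, with Auto Router in first place.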
For most applications, a 32K-token context window is enough. Chat histories, RAG retrievals, and single-document QA rarely exceed 16K tokens, so 32K leaves plenty of headroom.
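
The headroom claim can be checked with a small budget calculation. This is a rough sketch under stated assumptions: 32K is taken as 32,768 tokens (some providers count it as 32,000), and the helper names `fits_in_context` and `headroom` are hypothetical. Token counts must come from the tokenizer of the model you actually use.

```python
CONTEXT_WINDOW = 32 * 1024  # assumption: 32K means 32,768 tokens


def fits_in_context(prompt_tokens: int, reserved_output_tokens: int,
                    window: int = CONTEXT_WINDOW) -> bool:
    """True if the prompt plus the space reserved for the reply fits."""
    return prompt_tokens + reserved_output_tokens <= window


def headroom(prompt_tokens: int, reserved_output_tokens: int,
             window: int = CONTEXT_WINDOW) -> int:
    """Tokens left over after the prompt and the reserved reply space."""
    return window - prompt_tokens - reserved_output_tokens


# A 16K-token prompt with 2K tokens reserved for the answer.
print(fits_in_context(16_000, 2_000))  # prints True
print(headroom(16_000, 2_000))         # prints 14768
```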