Cheapest 32K context LLMs
Every LLM with at least 32K token context. Ranked by input price per 1M tokens.
Models: 50
Cheapest: $0.00
Min context: 32K tokens
What this page is
32K context is the floor for modern LLM use. This tier captures the broadest set of priced models at some of the lowest rates. It is ideal for chat, short-doc RAG, classification, and any workload that does not need a huge window.
Ranked by input price
32K+ context models, cheapest first.
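The ranking above can be reproduced with a simple filter-and-sort: keep models whose context window meets the 32K floor, then order them by input price ascending. A minimal sketch, with an illustrative catalog (the model names and prices below are hypothetical, not quotes from this page):

```python
# Hypothetical catalog entries; names and prices are illustrative only.
models = [
    {"name": "Model A", "context_tokens": 8_192,   "input_price_per_m": 0.10},
    {"name": "Model B", "context_tokens": 131_072, "input_price_per_m": 0.00},
    {"name": "Model C", "context_tokens": 32_768,  "input_price_per_m": 0.04},
]

MIN_CONTEXT = 32_000  # the page's 32K floor

# Keep only models meeting the context floor, then rank cheapest-first
# by input price per 1M tokens.
ranked = sorted(
    (m for m in models if m["context_tokens"] >= MIN_CONTEXT),
    key=lambda m: m["input_price_per_m"],
)

print([m["name"] for m in ranked])  # Model A is filtered out (8K context)
```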
Top 3 cheapest 32K+ context LLMs
Cheapest 32K
Auto Router
input
$0.00/M
output
$0.00/M
Auto Router at $0.00/M input with 2.0M context · fine for most chat and RAG loads.
Runner up
Body Builder (beta)
input
$0.00/M
output
$0.00/M
Body Builder (beta) at $0.00/M input with 128K context · fine for most chat and RAG loads.
Third
Elephant
input
$0.00/M
output
$0.00/M
Elephant at $0.00/M input with 262K context · fine for most chat and RAG loads.
The price gap · cheapest vs most expensive
Cheapest
Auto Router
$0.00/M
$ per 1M input tokens
Why the gap
At this tier, price tracks raw model quality more than context size. The cheapest 32K models are often small open-source models or free routed tiers; the most expensive tend to be larger models whose providers bill the same rate regardless of window size.
Most expensive
Gemma 3 12B
$0.04/M
$ per 1M input tokens
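The "$ per 1M input tokens" unit converts to a dollar cost by scaling token volume. A quick sketch of the arithmetic, using the page's $0.04/M top rate against a free model:

```python
def input_cost(tokens: int, price_per_million: float) -> float:
    """Dollar cost of sending `tokens` input tokens at a $-per-1M-token rate."""
    return tokens / 1_000_000 * price_per_million

# 10M input tokens at $0.04/M vs. a $0.00/M (free) model.
expensive = input_cost(10_000_000, 0.04)  # $0.40
free = input_cost(10_000_000, 0.00)       # $0.00

print(expensive, free)
```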
Frequently asked questions
Is 32K context enough for most workloads?
For most applications, yes. Chat histories, RAG retrievals, and single-document QA rarely exceed 16K tokens, so 32K leaves plenty of headroom.
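A rough way to check whether a payload fits under the 32K ceiling is the common ~4-characters-per-token heuristic for English text. This is an estimate only; real tokenizers vary, so use the provider's own tokenizer for exact counts:

```python
def fits_in_context(text: str, context_tokens: int = 32_000,
                    chars_per_token: float = 4.0) -> bool:
    """Rough fit check using the ~4-chars-per-token heuristic.

    Heuristic only: actual token counts depend on the model's tokenizer.
    """
    estimated_tokens = len(text) / chars_per_token
    return estimated_tokens <= context_tokens

doc = "word " * 5_000  # ~25K characters, roughly 6K estimated tokens
print(fits_in_context(doc))  # well under the 32K floor
```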