What is the cheapest vision LLM in 2026?

Free Models Router at $0.00 per 1M input tokens is the cheapest vision model in our dataset. Note: for very large batches of images, per-image pricing quirks can flip the ranking.

Which vision model is best for OCR?

Claude Sonnet and Gemini 2.5 Pro lead on handwriting and dense documents. For simple printed-page OCR, Qwen VL and InternVL match premium models at a fraction of the price. Test on your actual documents.

Can I use a text-only LLM by sending base64 images?

No. Vision requires a multimodal model with an image encoder. If the model is not tagged multimodal here, do not send it images. Use a separate OCR service upstream.

How do I cut vision costs on a real app?

Downscale images to the minimum resolution that preserves signal, cache frequently-analyzed images via prompt caching, and batch requests. Switching from high to low detail on OpenAI cuts cost by roughly 4x per image.

Use case · Vision

Cheapest vision LLMs

The cheapest multimodal models that accept images. Ranked by input token price, with notes on per-image billing and OCR quality.

Models30

Cheapest$0.00

TypeMultimodal

All pricing Pricing home

What this page is

This page lists every multimodal model (vision capable) with priced API access, ranked cheapest first. Note that per-image billing can diverge from per-token billing: a single high-res image may cost hundreds of tokens depending on the provider. Use the input price below as a baseline, then check the model detail page for image-specific pricing notes. Ideal for OCR, document processing, chart reading, and visual QA.

Ranked by input price

Multimodal models only, cheapest first.

#	Model	Provider	In $/1M	Out $/1M	Context	avg score	Type
1	Free Models Router	openrouter	$0.00	$0.00	200K	0.0	Closed
2	Gemma 3 12B (free)	Google DeepMind	$0.00	$0.00	33K	0.0	OSS
3	Gemma 3 27B (free)	Google DeepMind	$0.00	$0.00	131K	42.2	OSS
4	Gemma 3 4B (free)	Google DeepMind	$0.00	$0.00	33K	0.0	OSS
5	Gemma 4 26B A4B (free)	Google DeepMind	$0.00	$0.00	262K	0.0	OSS
6	Gemma 4 31B (free)	Google DeepMind	$0.00	$0.00	262K	0.0	OSS
7	Llama Guard 4 12B (free)	Meta	$0.00	$0.00	164K	0.0	Closed
8	Mistral Small 3.1 24B (free)	Mistral AI	$0.00	$0.00	128K	0.0	OSS
9	Nemotron 3 Nano Omni (free)	NVIDIA	$0.00	$0.00	256K	0.0	Closed
10	Nemotron Nano 12B 2 VL (free)	NVIDIA	$0.00	$0.00	128K	0.0	OSS
11	Qianfan-OCR-Fast (free)	baidu	$0.00	$0.00	66K	0.0	Closed
12	Qwen3.6 Plus (free)	Alibaba Qwen	$0.00	$0.00	1.0M	0.0	Closed
13	Gemma 3 12B	Google DeepMind	$0.04	$0.13	131K	0.0	OSS
14	Gemma 3 4B	Google DeepMind	$0.04	$0.08	131K	0.0	OSS
15	GPT-5 Nano	OpenAI	$0.05	$0.40	400K	45.3	Closed
16	Gemma 4 26B A4B	Google DeepMind	$0.06	$0.33	262K	0.0	OSS
17	Nova Lite 1.0	Amazon	$0.06	$0.24	300K	0.0	Closed
18	Qwen3.5-Flash	Alibaba Qwen	$0.07	$0.26	1.0M	0.0	OSS
19	Gemini 2.0 Flash Lite	Google DeepMind	$0.07	$0.30	1.0M	64.2	Closed
20	Mistral Small 3.2 24B	Mistral AI	$0.07	$0.20	128K	0.0	OSS
21	Seed 1.6 Flash	ByteDance	$0.07	$0.30	262K	0.0	Closed
22	Gemma 3 27B	Google DeepMind	$0.08	$0.16	131K	42.2	OSS
23	Llama 4 Scout	Meta	$0.08	$0.30	328K	18.9	OSS
24	Qwen3 VL 8B Instruct	Alibaba Qwen	$0.08	$0.50	131K	0.0	OSS
25	Gemini 2.0 Flash	Google DeepMind	$0.10	$0.40	1.0M	48.0	Closed
26	Gemini 2.5 Flash Lite	Google DeepMind	$0.10	$0.40	1.0M	59.1	Closed
27	Gemini 2.5 Flash Lite Preview 09-2025	Google DeepMind	$0.10	$0.40	1.0M	0.0	Closed
28	GPT-4.1 Nano	OpenAI	$0.10	$0.40	1.0M	35.2	Closed
29	Ministral 3 3B 2512	Mistral AI	$0.10	$0.10	131K	0.0	OSS
30	Qwen3.5-9B	Alibaba Qwen	$0.10	$0.15	262K	0.0	OSS

Top 3 cheapest vision LLMs

Cheapest vision model

Free Models Router accepts images at $0.00 per 1M input tokens, the lowest in our ranking. Good for batch OCR, chart reading, and product image analysis.

Gemma 3 12B (free) keeps vision input cheap while maintaining solid multimodal benchmark scores. Drop-in for most document workloads.

Best quality per dollar

Gemma 3 27B (free) keeps vision input cheap while maintaining solid multimodal benchmark scores. Drop-in for most document workloads.

The price gap · cheapest vs most expensive

Cheapest

Free Models Router

$0.00/M

$ per 1M input tokens

Why the gap

Premium vision models pay for higher resolution encoders, better chart and table reading, and longer context for multi-page documents. For batch OCR, the cheap end is often good enough.

Most expensive

Gemini 2.0 Flash

$0.10/M

$ per 1M input tokens

Frequently asked questions

Most providers bill images as a fixed token count based on resolution. OpenAI bills ~255 tokens per low-res image, 765+ for high-res. Anthropic bills by image dimensions. Google bills flat per image. Always check the provider docs before budgeting.

Cheapest vision LLMs

Ranked by input price

Top 3 cheapest vision LLMs

The price gap · cheapest vs most expensive

Frequently asked questions

See also

Related pricing

Stacks

Compare