Cheapest vision LLMs
The cheapest multimodal models that accept images. Ranked by input token price, with notes on per-image billing and OCR quality.
Ranked by input price
Multimodal models only, cheapest first.
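As a rough illustration of how a ranking like this can be built, the sketch below filters a catalogue to image-capable models and sorts them by input price. The `Model` class, the `rank_vision_models` helper, and every entry in `catalogue` are hypothetical placeholders, not real models or real prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price_per_million: float  # USD per 1M input tokens
    accepts_images: bool


def rank_vision_models(models):
    """Return only image-capable models, cheapest input price first."""
    vision = [m for m in models if m.accepts_images]
    # Ties on price are broken alphabetically so the order is stable.
    return sorted(vision, key=lambda m: (m.input_price_per_million, m.name))


# Placeholder catalogue: names and prices are invented for illustration.
catalogue = [
    Model("example-premium-vision", 5.00, True),
    Model("example-text-only", 0.10, False),
    Model("example-free-vision", 0.00, True),
    Model("example-budget-vision", 0.25, True),
]

for rank, model in enumerate(rank_vision_models(catalogue), start=1):
    print(f"{rank}. {model.name}: ${model.input_price_per_million:.2f} per 1M input tokens")
```

Text-only models are dropped before sorting, so a cheap model that cannot read images never outranks a vision model.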
Top 3 cheapest vision LLMs
Free Models Router accepts images at $0.00 per 1M input tokens, the lowest in our ranking. Good for batch OCR, chart reading, and product image analysis.
Gemma 3 12B (free) keeps vision input cheap while maintaining solid multimodal benchmark scores. Drop-in for most document workloads.
Gemma 3 27B (free) keeps vision input cheap while maintaining solid multimodal benchmark scores. Drop-in for most document workloads.
The price gap · cheapest vs most expensive
Premium vision models charge more for higher-resolution encoders, better chart and table reading, and longer context for multi-page documents. For batch OCR, the cheap end is often good enough.
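To get a feel for the gap on a batch OCR job, a back-of-the-envelope estimate can help. The sketch below assumes token-based billing and a fixed number of input tokens per image. Both are simplifications: some providers bill per image instead, and token counts vary with resolution. The `batch_input_cost_usd` helper, the 1,000 tokens per image, and the $5.00 premium price are hypothetical values chosen only to show the arithmetic.

```python
def batch_input_cost_usd(num_images, tokens_per_image, price_per_million_usd):
    """Estimate the input cost of a batch, assuming per-token billing."""
    total_tokens = num_images * tokens_per_image
    return total_tokens * price_per_million_usd / 1_000_000


images = 10_000
tokens_per_image = 1_000  # assumption; real counts depend on model and resolution

free_cost = batch_input_cost_usd(images, tokens_per_image, 0.00)
premium_cost = batch_input_cost_usd(images, tokens_per_image, 5.00)  # hypothetical price

print(f"free tier:    ${free_cost:.2f}")
print(f"premium tier: ${premium_cost:.2f}")
```

Under these assumptions the batch is 10 million input tokens, so the hypothetical premium model costs $50.00 in input against $0.00 at the free end. Output tokens are not included.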