Best AI Models for Vision

AI models ranked by vision and multimodal benchmarks. Compare MMMU, VideoMME, and visual reasoning scores.

Models: 27
Providers: 4
Open source: 1
Median input price: $1.75 per 1M tokens
Score bands: 90+ Gold, 80-89, 70-79, 60-69, <60.

Scores in % unless noted. Avg = unweighted mean across tested benchmarks.
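As a rough illustration of the Avg rule and the score bands, the sketch below averages only the benchmarks a model was actually tested on and maps the result to a band. The function names, the sample scores, and the handling of fractional scores at band edges (for example, 89.5 falling into 80-89) are assumptions made for illustration, not the site's actual implementation.

```python
from statistics import mean
from typing import Optional


def unweighted_avg(scores: dict[str, Optional[float]]) -> Optional[float]:
    """Unweighted mean over tested benchmarks; None marks an untested benchmark."""
    tested = [s for s in scores.values() if s is not None]
    # A model with no tested benchmarks has no average.
    return mean(tested) if tested else None


def score_band(score: float) -> str:
    """Map a percentage score to a legend band (edge handling is an assumption)."""
    if score >= 90:
        return "90+"
    if score >= 80:
        return "80-89"
    if score >= 70:
        return "70-79"
    if score >= 60:
        return "60-69"
    return "<60"


# Hypothetical scores, not taken from the leaderboard.
example = {"MMMU": 80.0, "VideoMME": 70.0, "OtherBench": None}
avg = unweighted_avg(example)
print(avg, score_band(avg))
```

Because untested benchmarks are skipped rather than counted as zero, two models with the same Avg may have been evaluated on different benchmark sets.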

Models ranked by visual understanding across MMMU, VideoMME, and other multimodal benchmarks. These tests measure image comprehension, visual reasoning, and video understanding.

Which AI model is best for vision tasks?

Vision capabilities vary across models, so the best choice depends on the task. The leaderboard above ranks multimodal models by MMMU, VideoMME, and other visual benchmarks.

What is MMMU?

MMMU (Massive Multi-discipline Multimodal Understanding) tests models on college-level questions that require both image understanding and domain knowledge, spanning 30 subjects across six disciplines.