Best lists

Data updated · May 6, 2026

Best AI Models by Benchmark Data

Decision pages for choosing models by task. Each page uses public benchmark scores, listed prices, context windows, and coverage confidence instead of vague claims.

Evidence first

Every ranking links back to model pages, benchmark pages, and visible scoring rules.

No fake testing

Pages use published benchmark data. Missing data is shown as missing, not guessed.

Scoped claims

A coding ranking draws only on coding benchmarks. A math ranking draws only on math benchmarks.

20+ ranked

Coding models ranked from published coding benchmark scores, listed prices, and model metadata tracked by BenchGecko.

Current leader: GPT-5 Chat (OpenAI)

20+ ranked

Open-weight AI models ranked from available benchmark data, coverage confidence, pricing metadata, and listed license signals.

Current leader: R1 0528 (DeepSeek)

20+ ranked

Reasoning models ranked from public benchmark scores across GPQA Diamond, BBH, ARC-AGI, SimpleBench, and related tests.

Current leader: GPT-5.4 Pro (OpenAI)

20+ ranked

Math models ranked from public benchmark scores across GSM8K, MATH-level tests, AIME-style tasks, and FrontierMath where available.

Current leader: GPT-5.4 Pro (OpenAI)

20+ ranked

Multimodal models ranked from public benchmark scores across video, image, chart, and visual reasoning tests where available.

Current leader: Qwen2.5 72B Instruct (Alibaba Qwen)

How these pages stay strict

Each ranking starts from a real decision: coding, reasoning, math, multimodal work, or open-weight deployment.

Every page has a visible method, caveat, model table, benchmark links, and data freshness label.

Unsupported claims are avoided. Rankings are shortlists from available evidence, not universal promises.
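
To illustrate the "missing is shown as missing" rule and the idea of coverage confidence, here is a minimal sketch in Python. It is not BenchGecko's actual scoring rule: the names `ModelScores`, `summarize`, `shortlist`, the unweighted mean, and the `min_coverage` threshold are all assumptions made for illustration. Unpublished scores stay as `None` and are never imputed; coverage is simply the fraction of requested benchmarks that have a published score.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelScores:
    # Hypothetical record: benchmark name -> published score, or None if unpublished.
    name: str
    scores: dict[str, Optional[float]]


def summarize(model: ModelScores, benchmarks: list[str]) -> tuple[Optional[float], float]:
    """Return (mean of published scores, coverage) for the given benchmarks.

    Missing scores are skipped, not guessed. If nothing is published,
    the mean is None rather than zero.
    """
    present = [model.scores[b] for b in benchmarks if model.scores.get(b) is not None]
    coverage = len(present) / len(benchmarks)
    mean = sum(present) / len(present) if present else None
    return mean, coverage


def shortlist(models: list[ModelScores], benchmarks: list[str],
              min_coverage: float = 0.5) -> list[tuple[str, float, float]]:
    """Rank models by mean published score, keeping coverage visible per row.

    The min_coverage cutoff is an assumed rule for this sketch: models with
    too little published data are left out instead of being ranked on guesses.
    """
    rows = []
    for m in models:
        mean, coverage = summarize(m, benchmarks)
        if mean is not None and coverage >= min_coverage:
            rows.append((m.name, mean, coverage))
    return sorted(rows, key=lambda r: r[1], reverse=True)
```

Each returned row carries its coverage alongside the score, so a model ranked on half the benchmarks can be read as a lower-confidence entry rather than being treated the same as a fully covered one.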