Beta
Live · 11 categories · 971 models tracked

Models by Category.
Ranked.

Every model category we track · one leaderboard per task. Pick the skill you care about and see which models dominate it.

84 models

AI models ranked by coding benchmarks. Compare HumanEval+, SWE-bench Verified, Aider Polyglot, and more across all providers.

Top 3
1. GPT-5 Chat · 81.9
2. Claude Mythos Preview · 81.8
3. Gemini 2.5 Pro Preview 05-06 · 76.9
133 models

AI models ranked by reasoning benchmarks. Compare GPQA Diamond, ARC-AGI, BBH, and other reasoning tests across all providers.

Top 3
1. Claude Mythos Preview · 81.8
2. DeepSeek-V2 (MoE-236B, May 2024) · 76.5
3. phi-3-small 7.4B · 67.4
120 models

AI models ranked by math benchmarks. Compare MATH-500, GSM8K, and competition-level math scores across all providers.

Top 3
1. Claude Instant · 78.0
2. GPT-5.4 Pro · 66.7
3. Qwen-14B · 60.7
125 models

AI models ranked by knowledge benchmarks. Compare MMLU-Pro, GPQA Diamond, SimpleQA, and other knowledge tests.

Top 3
1. Claude Mythos Preview · 81.8
2. Claude Instant · 78.0
3. DeepSeek-V2 (MoE-236B, May 2024) · 76.5
27 models

AI models ranked by vision and multimodal benchmarks. Compare MMMU, VideoMME, and visual reasoning scores.

Top 3
1. Gemini 3 Pro · 60.5
2. o1 · 56.4
3. Gemini 2.5 Pro · 56.2
142 models

Open-source AI models ranked by benchmark score. Compare Llama, Mistral, DeepSeek, Qwen, and other open-weight models.

Top 3
1. Qwen3.5 397B A17B · 78.4
2. DeepSeek V3.2 Speciale · 78.2
3. Step 3.5 Flash · 76.9
92 models

Cheapest AI models ranked by score. All models with input pricing under $1 per million tokens, sorted by benchmark performance.

Top 3
1. Qwen3.5 397B A17B · 78.4
2. DeepSeek V3.2 Speciale · 78.2
3. Step 3.5 Flash · 76.9
130 models

AI models under $5 per million tokens ranked by benchmark score. The sweet spot of price and performance.

Top 3
1. GPT-5 Chat · 81.9
2. Qwen3.5 397B A17B · 78.4
3. DeepSeek V3.2 Speciale · 78.2
25 models

AI models with 1 million+ token context windows ranked by score. Compare Gemini, Claude, and other long-context models.

Top 3
1. Claude Mythos Preview · 81.8
2. Gemini 2.5 Pro Preview 05-06 · 76.9
3. Qwen3.6 Plus · 70.9
84 models

Small AI models (under 10B parameters) ranked by benchmark score. Lightweight models you can run locally.

Top 3
1. Gemini 2.5 Pro Preview 05-06 · 76.9
2. o4 Mini High · 72.0
3. MiniMax M2 · 69.5
241 models

The best AI model from each provider, ranked by benchmark score. Compare the flagships from OpenAI, Anthropic, Google, Meta, and more.

Top 3
1. GPT-5 Chat · 81.9
2. Claude Mythos Preview · 81.8
3. Qwen3.5 397B A17B · 78.4