HELM · MMLU-Pro
The Frontier
Best score over time · one chart, every benchmark
Distribution
Where models cluster
Correlated benchmarks
Pearson r · original research
Benchmarks that track with HELM · MMLU-Pro
Pearson correlation across models scored on both benchmarks. Values closer to 1 mean that scores on one benchmark more strongly predict scores on the other.
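As a rough illustration of how such a correlation could be computed, the sketch below restricts two score tables to the models present in both and then applies the standard Pearson formula. The model names, the scores, and the helper names `shared_scores` and `pearson_r` are hypothetical and chosen for this example; this is not BenchGecko's actual pipeline or data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


def shared_scores(bench_a, bench_b):
    """Keep only models scored on both benchmarks, in a fixed order."""
    models = sorted(bench_a.keys() & bench_b.keys())
    return [bench_a[m] for m in models], [bench_b[m] for m in models]


# Hypothetical scores for illustration only; not BenchGecko data.
bench_a = {"model-a": 60.0, "model-b": 70.0, "model-c": 80.0, "model-d": 90.0}
bench_b = {"model-a": 30.0, "model-b": 45.0, "model-c": 50.0, "model-e": 99.0}

xs, ys = shared_scores(bench_a, bench_b)
r = pearson_r(xs, ys)
print(f"{len(xs)} shared models, r = {r:.2f}")
```

Only models scored on both benchmarks enter the calculation, so the number of shared models matters as much as the coefficient itself: a high r over a handful of models is weaker evidence than the same r over dozens.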
Full rankings
34 models tested · sorted by score
| # | Model | Score |
|---|---|---|
| 1 | Gemini 3 Pro | 90.3 |
| 2 | Gemini 2.5 Pro | 86.3 |
| 3 | GPT-5 Chat | 86.3 |
| 4 | o3 | 85.9 |
| 5 | Grok 4 | 85.1 |
| 6 | | 83.5 |
| 7 | | 82.0 |
| 8 | | 81.9 |
| 9 | | 81.1 |
| 10 | | 80.4 |
| 11 | | 79.9 |
| 12 | | 79.5 |
| 13 | | 79.3 |
| 14 | | 78.8 |
| 15 | | 78.6 |
| 16 | | 78.4 |
| 17 | | 78.3 |
| 18 | | 77.8 |
| 19 | | 77.7 |
| 20 | | 74.0 |
| 21 | | 73.7 |
| 22 | | 73.7 |
| 23 | | 72.3 |
| 24 | | 72.0 |
| 25 | | 71.3 |
| 26 | | 67.8 |
| 27 | | 63.9 |
| 28 | | 61.0 |
| 29 | | 60.5 |
| 30 | | 60.3 |
| 31 | | 59.9 |
| 32 | | 57.9 |
| 33 | | 55.0 |
| 34 | | 53.7 |
Frequently asked
Pulled from the HELM · MMLU-Pro dataset · updated daily
What does HELM · MMLU-Pro measure?
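HELM · MMLU-Pro is a knowledge benchmark in the BenchGecko catalog. MMLU-Pro is a harder variant of MMLU: multiple-choice questions across academic and professional subjects, with up to ten answer options instead of four. 34 AI models have been tested on it. Scores range from 53.7 to 90.3 out of 100.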
Which model leads on HELM · MMLU-Pro?
Gemini 3 Pro from Google DeepMind leads HELM · MMLU-Pro with a score of 90.3. The median score across 34 tested models is 78.0.
Is HELM · MMLU-Pro saturated?
No · the top score is 90.3 out of 100 (90%). There is still meaningful room for improvement on HELM · MMLU-Pro.
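The summary figures quoted in these answers (range, median, remaining headroom) can be rechecked from the rankings table. The sketch below copies the 34 scores from that table and recomputes them with Python's standard library. The variable names are chosen for this example, and "headroom" simply means 100 minus the top score; it is not an official saturation metric.

```python
from statistics import median

# Scores copied from the rankings table, in rank order (34 entries).
scores = [
    90.3, 86.3, 86.3, 85.9, 85.1, 83.5, 82.0, 81.9, 81.1, 80.4,
    79.9, 79.5, 79.3, 78.8, 78.6, 78.4, 78.3, 77.8, 77.7, 74.0,
    73.7, 73.7, 72.3, 72.0, 71.3, 67.8, 63.9, 61.0, 60.5, 60.3,
    59.9, 57.9, 55.0, 53.7,
]

top = max(scores)
bottom = min(scores)

# With an even number of models, the median is the midpoint of the
# 17th and 18th scores (78.3 and 77.8).
mid = median(scores)

# Assumption: headroom is measured against a maximum possible score of 100.
headroom = 100 - top

print(f"models: {len(scores)}")
print(f"range: {bottom} to {top}")
print(f"median: {mid:.2f}")
print(f"headroom: {headroom:.1f} points")
```

The recomputed median is the midpoint of 78.3 and 77.8, which agrees with the reported 78.0 up to rounding at one decimal place.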
Does HELM · MMLU-Pro predict performance on other benchmarks?
Yes · HELM · MMLU-Pro scores have a Pearson correlation of 0.94 with HELM · GPQA scores across 34 shared models. Models that do well on HELM · MMLU-Pro tend to do well on HELM · GPQA.
How often is HELM · MMLU-Pro data refreshed?
BenchGecko pulls updates daily. New model scores on HELM · MMLU-Pro appear as soon as they are published by Epoch AI or the model provider.
Top on HELM · MMLU-Pro
Gemini 3 Pro · 90.3
Gemini 2.5 Pro · 86.3
GPT-5 Chat · 86.3
o3 · 85.9
Grok 4 · 85.1
More knowledge benchmarks
Same category · related evaluations