SimpleBench
SimpleBench tests fundamental reasoning capabilities with straightforward problems designed to expose gaps in basic logical and spatial thinking.
The Frontier
Best score over time · one chart, every benchmark
Full rankings
52 models tested · sorted by score
| # | Model | Score |
|---|---|---|
| 1 | Gemini 3.1 Pro Preview | 75.5 |
| 2 | Gemini 3 Pro | 71.7 |
| 3 | GPT-5.4 Pro | 68.9 |
| 4 | Claude Opus 4.6 | 61.1 |
| 5 | Gemini 2.5 Pro | 54.9 |
| 6 |  | 54.4 |
| 7 |  | 53.9 |
| 8 |  | 53.3 |
| 9 |  | 52.6 |
| 10 |  | 52.0 |
| 11 |  | 50.6 |
| 12 |  | 48.9 |
| 13 |  | 48.0 |
| 14 |  | 45.2 |
| 15 |  | 43.8 |
| 16 |  | 43.8 |
| 17 |  | 43.7 |
| 18 |  | 37.2 |
| 19 |  | 36.2 |
| 20 |  | 35.7 |
| 21 |  | 35.0 |
| 22 |  | 34.6 |
| 23 |  | 30.0 |
| 24 |  | 29.4 |
| 25 |  | 29.0 |
| 26 |  | 28.1 |
| 27 |  | 28.0 |
| 28 |  | 26.4 |
| 29 |  | 23.3 |
| 30 |  | 21.4 |
| 31 |  | 17.3 |
| 32 |  | 17.2 |
| 33 |  | 17.1 |
| 34 |  | 16.8 |
| 35 |  | 13.2 |
| 36 |  | 13.0 |
| 37 |  | 12.5 |
| 38 |  | 12.4 |
| 39 |  | 11.6 |
| 40 |  | 10.1 |
| 41 |  | 8.2 |
| 42 |  | 7.6 |
| 43 |  | 7.4 |
| 44 |  | 7.2 |
| 45 |  | 7.0 |
| 46 |  | 7.0 |
| 47 |  | 6.5 |
| 48 |  | 3.9 |
| 49 |  | 2.7 |
| 50 |  | 1.7 |
| 51 |  | 1.4 |
| 52 |  | 1.4 |
Score distribution
Where models cluster
Correlated benchmarks
Pearson r · original research
Benchmarks that track with SimpleBench
Pearson correlation across models scored on both benchmarks. Values closer to 1 indicate a stronger positive linear relationship between the two sets of scores.
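As a rough illustration of how a correlation like this can be computed, the sketch below restricts two score tables to the models they share and then applies the standard Pearson formula. The helper names (`shared_scores`, `pearson_r`), the model names, and the scores are all hypothetical; this is not BenchGecko's actual pipeline or data.

```python
import math


def shared_scores(bench_a, bench_b):
    """Return paired score lists for models present in both benchmarks."""
    shared = sorted(bench_a.keys() & bench_b.keys())
    return [bench_a[m] for m in shared], [bench_b[m] for m in shared]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for illustration only, not taken from any leaderboard.
bench_a = {"model-a": 10.0, "model-b": 20.0, "model-c": 30.0, "model-d": 55.0}
bench_b = {"model-a": 1000.0, "model-b": 1100.0, "model-c": 1200.0, "model-e": 900.0}

xs, ys = shared_scores(bench_a, bench_b)
r = pearson_r(xs, ys)
print(f"{len(xs)} shared models, r = {r:.2f}")
```

Only models scored on both benchmarks enter the calculation, so the sample behind a reported coefficient can be much smaller than either leaderboard.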
Frequently asked
About SimpleBench
What does SimpleBench measure?
SimpleBench tests fundamental reasoning capabilities with straightforward problems designed to expose gaps in basic logical and spatial thinking. 52 AI models have been tested on it. Scores range from 1.4 to 75.5 out of 100.
Which model leads on SimpleBench?
Gemini 3.1 Pro Preview from Google DeepMind leads SimpleBench with a score of 75.5. The median score across 52 tested models is 28.1.
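The summary figures quoted here can be checked against the rankings table. The sketch below copies the 52 scores from that table and recomputes the count, range, and median with Python's standard library. With an even number of models, `statistics.median` averages the two middle values (28.0 and 28.1), giving 28.05. How the page arrives at its displayed 28.1 (rounding, or taking the upper middle value) is an assumption, not something the page states.

```python
import statistics

# Scores copied from the full rankings table, highest first.
scores = [
    75.5, 71.7, 68.9, 61.1, 54.9, 54.4, 53.9, 53.3, 52.6, 52.0,
    50.6, 48.9, 48.0, 45.2, 43.8, 43.8, 43.7, 37.2, 36.2, 35.7,
    35.0, 34.6, 30.0, 29.4, 29.0, 28.1, 28.0, 26.4, 23.3, 21.4,
    17.3, 17.2, 17.1, 16.8, 13.2, 13.0, 12.5, 12.4, 11.6, 10.1,
    8.2, 7.6, 7.4, 7.2, 7.0, 7.0, 6.5, 3.9, 2.7, 1.7,
    1.4, 1.4,
]

count = len(scores)
lowest = min(scores)
highest = max(scores)
# For an even-sized sample this is the mean of the two middle values.
median = statistics.median(scores)

print(f"models: {count}")
print(f"range: {lowest} to {highest}")
print(f"median: {median:.2f}")
```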
Is SimpleBench saturated?
No · the top score is 75.5 out of 100 (about 76%), so there is still meaningful room for improvement on SimpleBench.
Does SimpleBench predict performance on other benchmarks?
Yes · SimpleBench scores have a Pearson correlation of 0.95 with Chatbot Arena Elo · Overall across the 25 models scored on both. Models that do well on SimpleBench tend to do well on Chatbot Arena Elo · Overall.
How often is SimpleBench data refreshed?
BenchGecko pulls updates daily. New model scores on SimpleBench appear as soon as they are published by Epoch AI or the model provider.
- Category: Reasoning
- Max score: 100
- Models: 52
- Updated: 2026-03-05
Top on SimpleBench
- Gemini 3.1 Pro Preview · 75.5
- Gemini 3 Pro · 71.7
- GPT-5.4 Pro · 68.9
- Claude Opus 4.6 · 61.1
- Gemini 2.5 Pro · 54.9
More reasoning benchmarks
Same category · related evaluations