GeoBench
GeoBench tests geographic knowledge and spatial reasoning across countries, landmarks, coordinates, and geopolitical understanding.
Full rankings
26 models tested · sorted by score
| # | Model | Score |
|---|---|---|
| 1 | Gemini 3 Flash Preview | 88.0 |
| 2 | Gemini 3 Pro | 84.0 |
| 3 | Gemini 2.5 Pro | 81.0 |
| 4 | GPT-5 | 81.0 |
| 5 | o1 | 80.0 |
| 6 | | 77.0 |
| 7 | | 76.0 |
| 8 | | 75.0 |
| 9 | | 74.0 |
| 10 | | 73.0 |
| 11 | | 72.0 |
| 12 | | 71.0 |
| 13 | | 68.0 |
| 14 | | 64.0 |
| 15 | | 64.0 |
| 16 | | 64.0 |
| 17 | | 62.0 |
| 18 | | 62.0 |
| 19 | | 52.0 |
| 20 | | 52.0 |
| 21 | | 52.0 |
| 22 | | 52.0 |
| 23 | | 49.0 |
| 24 | | 45.0 |
| 25 | | 37.0 |
| 26 | | 34.0 |
Correlated benchmarks
Pearson r · original research
Benchmarks that track with GeoBench
Pearson correlation across models scored on both benchmarks. Closer to 1 = strongly predictive.
Frequently asked
About GeoBench
What does GeoBench measure?
GeoBench tests geographic knowledge and spatial reasoning across countries, landmarks, coordinates, and geopolitical understanding. 26 AI models have been tested on it, with scores ranging from 34.0 to 88.0 out of 100.
Which model leads on GeoBench?
Gemini 3 Flash Preview from Google DeepMind leads GeoBench with a score of 88.0. The median score across 26 tested models is 66.0.
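The headline figures quoted above (top score, range, median) can be rechecked from the score column of the rankings table. The following is a minimal sketch: the `scores` list is copied from that table, while the `summarize` helper is a hypothetical name introduced for illustration and is not part of any BenchGecko tooling.

```python
from statistics import median

# Scores copied from the "Full rankings" table, in rank order (26 models).
scores = [
    88.0, 84.0, 81.0, 81.0, 80.0, 77.0, 76.0, 75.0, 74.0, 73.0,
    72.0, 71.0, 68.0, 64.0, 64.0, 64.0, 62.0, 62.0, 52.0, 52.0,
    52.0, 52.0, 49.0, 45.0, 37.0, 34.0,
]


def summarize(values):
    """Return count, min, max, and median for a list of scores.

    Hypothetical helper for illustration only.
    """
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        # With an even number of values, median() averages the two middle ones.
        "median": median(values),
    }


summary = summarize(scores)
print(summary)
```

With 26 scores, the median is the mean of the 13th and 14th ranked values (68.0 and 64.0), which matches the 66.0 reported here.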
Is GeoBench saturated?
No. The top score is 88.0 out of 100 (88%), so there is still meaningful room for improvement on GeoBench.
Does GeoBench predict performance on other benchmarks?
Yes, on the data available so far. GeoBench scores correlate with Artificial Analysis · Agentic Index at Pearson r = 0.96 across the 5 models scored on both benchmarks. Within that small sample, models that do well on GeoBench tend to do well on Artificial Analysis · Agentic Index.
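For readers who want to see how a figure like this is derived, here is a minimal sketch of a Pearson correlation over models scored on both benchmarks. The two score lists are made-up placeholder values, not the actual data behind the 0.96 figure, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Placeholder scores for five hypothetical shared models (NOT real data).
geobench_scores = [88.0, 81.0, 75.0, 64.0, 52.0]
other_scores = [70.0, 66.0, 58.0, 49.0, 41.0]

r = pearson_r(geobench_scores, other_scores)
print(round(r, 2))
```

With only five paired points, a single additional model can shift the coefficient noticeably, so a value computed this way is best read as an indication rather than a firm prediction.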
How often is GeoBench data refreshed?
BenchGecko pulls updates daily, so new model scores on GeoBench appear within a day of being published by Epoch AI or the model provider.
- Category: Knowledge
- Max score: 100
- Models: 26
- Updated: 2025-12-17
Top on GeoBench
- Gemini 3 Flash Preview · 88.0
- Gemini 3 Pro · 84.0
- Gemini 2.5 Pro · 81.0
- GPT-5 · 81.0
- o1 · 80.0
More knowledge benchmarks
Same category · related evaluations