ARC-AGI
ARC-AGI · the original Abstraction and Reasoning Corpus, testing whether AI can solve novel visual pattern recognition tasks without memorization.
The Frontier
Best score over time · one chart, every benchmark
Distribution
Where models cluster
Correlated benchmarks
Pearson r · original research
Benchmarks that track with ARC-AGI
Pearson correlation across models scored on both benchmarks. Values closer to 1 mean that a model's score on one benchmark strongly predicts its score on the other.
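A minimal sketch of how a correlation like this could be computed, assuming each benchmark is a mapping from model name to score and that only models present in both are compared. The model names and scores below are hypothetical placeholders, not BenchGecko data, and `pearson_r` and `correlate_shared` are illustrative helpers rather than the site's actual pipeline.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


def correlate_shared(bench_a, bench_b):
    """Correlate two benchmarks over the models scored on both.

    Returns (r, number_of_shared_models).
    """
    shared = sorted(bench_a.keys() & bench_b.keys())
    r = pearson_r([bench_a[m] for m in shared], [bench_b[m] for m in shared])
    return r, len(shared)


# Hypothetical scores, for illustration only.
arc_scores = {"model_a": 90.0, "model_b": 60.0, "model_c": 30.0, "model_d": 10.0}
other_scores = {"model_a": 80.0, "model_b": 55.0, "model_c": 35.0, "model_e": 70.0}

r, n_shared = correlate_shared(arc_scores, other_scores)
print(f"r = {r:.2f} across {n_shared} shared models")
```

With few shared models, a single outlier can move r substantially, so the number of shared models is worth reporting alongside the coefficient.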
Full rankings
48 models tested · sorted by score
| # | Model | Score |
|---|---|---|
| 1 | Gemini 3.1 Pro Preview | 98.0 |
| 2 | | 94.5 |
| 3 | | 94.0 |
| 4 | | 93.7 |
| 5 | | 90.5 |
| 6 | | 86.5 |
| 7 | | 86.2 |
| 8 | | 80.0 |
| 9 | | 75.0 |
| 10 | | 72.8 |
| 11 | | 70.2 |
| 12 | | 66.7 |
| 13 | | 65.7 |
| 14 | | 65.3 |
| 15 | | 63.7 |
| 16 | | 63.7 |
| 17 | | 60.8 |
| 18 | | 59.3 |
| 19 | | 58.7 |
| 20 | | 57.0 |
| 21 | | 54.3 |
| 22 | | 48.5 |
| 23 | | 47.7 |
| 24 | | 44.7 |
| 25 | | 41.0 |
| 26 | | 40.0 |
| 27 | | 35.7 |
| 28 | | 34.5 |
| 29 | | 32.3 |
| 30 | | 30.7 |
| 31 | | 28.6 |
| 32 | | 21.5 |
| 33 | | 21.2 |
| 34 | | 20.7 |
| 35 | | 18.0 |
| 36 | | 16.5 |
| 37 | | 15.8 |
| 38 | | 14.0 |
| 39 | | 11.0 |
| 40 | | 10.3 |
| 41 | | 5.5 |
| 42 | | 5.5 |
| 43 | Magistral Small 1.1 | 5.0 |
| 44 | | 4.5 |
| 45 | | 4.4 |
| 46 | | 3.5 |
| 47 | | 0.5 |
| 48 | | 0.1 |
Frequently asked
Pulled from the ARC-AGI dataset · updated daily
What does ARC-AGI measure?
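ARC-AGI, the original Abstraction and Reasoning Corpus, measures whether AI can solve novel visual pattern recognition tasks without relying on memorization. It is listed as a reasoning benchmark in the BenchGecko catalog. 48 AI models have been tested on it, with scores ranging from 0.1 to 98.0 out of 100.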
Which model leads on ARC-AGI?
Gemini 3.1 Pro Preview from Google DeepMind leads ARC-AGI with a score of 98.0. The median score across 48 tested models is 42.8.
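As a rough check, the median can be recomputed from the 48 scores in the rankings table. The sketch below assumes the table is complete and uses the standard midpoint rule for an even number of values. It yields approximately 42.85, consistent with the 42.8 reported above up to rounding.

```python
from statistics import median

# The 48 scores from the rankings table, highest to lowest.
scores = [
    98.0, 94.5, 94.0, 93.7, 90.5, 86.5, 86.2, 80.0, 75.0, 72.8,
    70.2, 66.7, 65.7, 65.3, 63.7, 63.7, 60.8, 59.3, 58.7, 57.0,
    54.3, 48.5, 47.7, 44.7, 41.0, 40.0, 35.7, 34.5, 32.3, 30.7,
    28.6, 21.5, 21.2, 20.7, 18.0, 16.5, 15.8, 14.0, 11.0, 10.3,
    5.5, 5.5, 5.0, 4.5, 4.4, 3.5, 0.5, 0.1,
]

# With an even count, the median is the mean of the 24th and 25th values.
median_score = median(scores)
print(f"{len(scores)} models, median {median_score:.2f}")
```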
Is ARC-AGI saturated?
Yes · the top model on ARC-AGI has reached 98.0 out of 100, within 5% of the theoretical ceiling. This benchmark is approaching saturation and may be replaced by a harder successor.
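The saturation rule described here can be sketched as a simple threshold check. The function name `is_near_ceiling` is hypothetical, and reading "within 5%" as 5% of the 100-point ceiling is an assumption, not BenchGecko's documented definition.

```python
def is_near_ceiling(top_score, ceiling=100.0, margin=0.05):
    """Return True when top_score lies within `margin` of the ceiling.

    `margin` is a fraction of the ceiling (0.05 means 5 points out of 100).
    This interpretation is an assumption made for illustration.
    """
    if not 0 <= margin <= 1:
        raise ValueError("margin must be a fraction between 0 and 1")
    return top_score >= ceiling * (1.0 - margin)


print(is_near_ceiling(98.0))  # True: 98.0 is above the 95.0 threshold
```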
Does ARC-AGI predict performance on other benchmarks?
Yes · ARC-AGI scores have a Pearson correlation of 0.94 with Cybench scores across the 13 models scored on both benchmarks. Models that do well on ARC-AGI tend to do well on Cybench.
How often is ARC-AGI data refreshed?
BenchGecko pulls updates daily. New model scores on ARC-AGI appear as soon as they are published by Epoch AI or the model provider.
More reasoning benchmarks
Same category · related evaluations