Aider · Code Editing
[Chart: The Frontier · best score over time]
[Chart: Distribution · where models cluster]
Correlated benchmarks
Pearson r · original research
Benchmarks that track with Aider · Code Editing
Pearson correlation computed across models scored on both benchmarks. Values closer to 1 mean scores on one benchmark more strongly predict scores on the other.
Full rankings
27 models tested · sorted by score
| # | Model | Score |
|---|---|---|
| 1 | Claude 3.5 Sonnet | 84.2 |
| 2 | o1 | 84.2 |
| 3 | o1-preview | 79.7 |
| 4 | GPT-4o (2024-05-13) | 72.9 |
| 5 | GPT-4o (2024-08-06) | 71.4 |
| 6 | | 71.4 |
| 7 | | 71.4 |
| 8 | | 70.7 |
| 9 | | 69.2 |
| 10 | | 66.2 |
| 11 | | 65.4 |
| 12 | | 65.4 |
| 13 | | 65.4 |
| 14 | | 60.2 |
| 15 | | 59.4 |
| 16 | | 58.6 |
| 17 | | 58.6 |
| 18 | | 57.9 |
| 19 | | 57.1 |
| 20 | | 55.6 |
| 21 | | 55.6 |
| 22 | | 50.4 |
| 23 | | 44.4 |
| 24 | | 38.3 |
| 25 | | 37.6 |
| 26 | | 31.6 |
| 27 | | 14.3 |
Frequently asked
Pulled from the Aider · Code Editing dataset · updated daily
What does Aider · Code Editing measure?
Aider · Code Editing is listed as a knowledge benchmark in the BenchGecko catalog. It evaluates how well a model can edit code to complete programming exercises. 27 AI models have been tested on it, with scores ranging from 14.3 to 84.2 out of 100.
Which model leads on Aider · Code Editing?
Claude 3.5 Sonnet from Anthropic and o1 share the top score of 84.2 on Aider · Code Editing. The median score across the 27 tested models is 60.2.
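The range and median quoted in these answers can be checked against the rankings table. The following is a minimal sketch, assuming the 27 scores are copied by hand from the table. The `summarize` helper is a name made up for this illustration and is not part of any BenchGecko tooling.

```python
from statistics import median

# Scores copied from the "Full rankings" table (27 entries, highest first).
scores = [
    84.2, 84.2, 79.7, 72.9, 71.4, 71.4, 71.4, 70.7, 69.2,
    66.2, 65.4, 65.4, 65.4, 60.2, 59.4, 58.6, 58.6, 57.9,
    57.1, 55.6, 55.6, 50.4, 44.4, 38.3, 37.6, 31.6, 14.3,
]


def summarize(values):
    """Return count, min, max, and median for a list of scores.

    Hypothetical helper written for this example only.
    """
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "median": median(values),
    }


summary = summarize(scores)
print(summary)
```

With an odd number of entries, the median is simply the middle (14th) score in sorted order, which is why it matches a value that appears in the table.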
Is Aider · Code Editing saturated?
No · the top score is 84.2 out of 100 (84%). There is still meaningful room for improvement on Aider · Code Editing.
Does Aider · Code Editing predict performance on other benchmarks?
Yes · Aider · Code Editing scores have a Pearson correlation of 0.94 with The Agent Company scores across 6 shared models. Models that do well on Aider · Code Editing tend to do well on The Agent Company.
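A correlation of this kind is computed only over models that appear on both benchmarks. The following is a rough sketch of that calculation, not BenchGecko's actual pipeline. The function names `shared_scores` and `pearson_r` are assumptions for illustration, and the model names and scores in the example dictionaries are made-up placeholders, not real benchmark results.

```python
from math import sqrt


def shared_scores(bench_a, bench_b):
    """Pair up scores for models present in both {model: score} dicts."""
    common = sorted(set(bench_a) & set(bench_b))
    return [bench_a[m] for m in common], [bench_b[m] for m in common]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Placeholder data: hypothetical models and scores, for illustration only.
bench_a = {"model_a": 80.0, "model_b": 70.0, "model_c": 55.0, "model_d": 40.0}
bench_b = {"model_a": 30.0, "model_b": 24.0, "model_c": 20.0, "model_e": 10.0}

xs, ys = shared_scores(bench_a, bench_b)
print(len(xs), round(pearson_r(xs, ys), 3))
```

Only the three models present in both placeholder dictionaries enter the calculation. The same filtering explains why a benchmark with 27 tested models can end up with just 6 shared models in a pairwise comparison.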
How often is Aider · Code Editing data refreshed?
BenchGecko pulls updates daily. New model scores on Aider · Code Editing appear as soon as they are published by Epoch AI or the model provider.
Top on Aider · Code Editing
Claude 3.5 Sonnet · 84.2
o1 · 84.2
o1-preview · 79.7
GPT-4o (2024-05-13) · 72.9
GPT-4o (2024-08-06) · 71.4
More knowledge benchmarks
Same category · related evaluations