Aider · Code Editing
Full rankings
27 models tested · sorted by score
| # | Model | Score |
|---|---|---|
| 1 | Claude 3.5 Sonnet | 84.2 |
| 2 | o1 | 84.2 |
| 3 | o1-preview | 79.7 |
| 4 | GPT-4o (2024-05-13) | 72.9 |
| 5 | GPT-4o (2024-08-06) | 71.4 |
| 6 | | 71.4 |
| 7 | | 71.4 |
| 8 | | 70.7 |
| 9 | | 69.2 |
| 10 | | 66.2 |
| 11 | | 65.4 |
| 12 | | 65.4 |
| 13 | | 65.4 |
| 14 | | 60.2 |
| 15 | | 59.4 |
| 16 | | 58.6 |
| 17 | | 58.6 |
| 18 | | 57.9 |
| 19 | | 57.1 |
| 20 | | 55.6 |
| 21 | | 55.6 |
| 22 | | 50.4 |
| 23 | | 44.4 |
| 24 | | 38.3 |
| 25 | | 37.6 |
| 26 | | 31.6 |
| 27 | | 14.3 |
Score distribution
Where models cluster
Correlated benchmarks
Pearson r · original research
Benchmarks that track with Aider · Code Editing
Pearson correlation across models scored on both benchmarks. Closer to 1 = strongly predictive.
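As a rough illustration of the statistic described above, the sketch below computes Pearson r for two lists of scores from models evaluated on both benchmarks. The function name `pearson_r` and the sample numbers are made up for illustration; they are not BenchGecko data or code.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalised because the 1/n factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six models tested on both benchmarks (illustrative only).
benchmark_a = [80.0, 72.0, 65.0, 58.0, 44.0, 30.0]
benchmark_b = [30.0, 27.0, 21.0, 20.0, 12.0, 9.0]

print(round(pearson_r(benchmark_a, benchmark_b), 2))
```

Only models scored on both benchmarks can enter the calculation, so a coefficient based on a handful of shared models is sensitive to each individual data point.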
Frequently asked questions
About Aider · Code Editing
What does Aider · Code Editing measure?
Aider · Code Editing is a knowledge benchmark in the BenchGecko catalog. 27 AI models have been tested on it. Scores range from 14.3 to 84.2 out of 100.
Which model leads on Aider · Code Editing?
Claude 3.5 Sonnet from Anthropic leads Aider · Code Editing with a score of 84.2. The median score across 27 tested models is 60.2.
Is Aider · Code Editing saturated?
No · the top score is 84.2 out of 100 (84%). There is still meaningful room for improvement on Aider · Code Editing.
Does Aider · Code Editing predict performance on other benchmarks?
Yes · Aider · Code Editing scores correlate 0.94 with The Agent Company across 6 shared models. Models that do well on Aider · Code Editing tend to do well on The Agent Company.
How often is Aider · Code Editing data refreshed?
BenchGecko pulls updates daily. New model scores on Aider · Code Editing appear as soon as they are published by Epoch AI or the model provider.
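The summary figures quoted in the answers above (score range, median, and the top score as a share of the maximum) can be reproduced from the ranking table. This is a minimal sketch using only the scores listed on this page; the variable names are illustrative.

```python
from statistics import median

# Scores copied from the ranking table on this page (27 models, highest first).
scores = [
    84.2, 84.2, 79.7, 72.9, 71.4, 71.4, 71.4, 70.7, 69.2,
    66.2, 65.4, 65.4, 65.4, 60.2, 59.4, 58.6, 58.6, 57.9,
    57.1, 55.6, 55.6, 50.4, 44.4, 38.3, 37.6, 31.6, 14.3,
]
MAX_SCORE = 100  # maximum possible score, as listed in the page metadata

summary = {
    "models": len(scores),
    "lowest": min(scores),
    "highest": max(scores),
    # With 27 values the median is the 14th score in sorted order.
    "median": median(scores),
    # Top score as a whole-number percentage of the maximum.
    "top_pct_of_max": round(100 * max(scores) / MAX_SCORE),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```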
- Category: Knowledge
- Max score: 100
- Models: 27
- Updated: 2025-04-15
Top on Aider · Code Editing
- Claude 3.5 Sonnet · 84.2
- o1 · 84.2
- o1-preview · 79.7
- GPT-4o (2024-05-13) · 72.9
- GPT-4o (2024-08-06) · 71.4