Aider polyglot
Aider Polyglot measures how well AI models can edit code across multiple programming languages using the Aider coding assistant framework.
The Frontier
Best score over time · one chart, every benchmark
Full rankings
53 models tested · sorted by score
Score distribution
Where models cluster
Correlated benchmarks
Pearson r · original research
Benchmarks that track with Aider polyglot
Pearson correlation, computed across the models scored on both benchmarks. The closer the value is to 1, the more closely the other benchmark's scores track Aider polyglot scores.
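As a rough illustration of the statistic named above, here is a minimal sketch of a Pearson correlation between two benchmarks, restricted to models scored on both. The function name `pearson_r` and the score lists are hypothetical, made up for this example; they are not taken from the leaderboard, and this is not necessarily how the site computes its figures.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each score from its benchmark's mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for four models tested on both benchmarks (made-up numbers).
benchmark_a = [10.0, 20.0, 30.0, 40.0]
benchmark_b = [12.0, 24.0, 36.0, 48.0]

r = pearson_r(benchmark_a, benchmark_b)
print(f"Pearson r = {r:.2f}")
```

Each position in the two lists must refer to the same model, so models missing from either benchmark have to be dropped before calling the function. With only a handful of shared models, a single outlier can move the coefficient noticeably.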
Frequently asked
About Aider polyglot
What does Aider polyglot measure?
Aider Polyglot measures how well AI models can edit code across multiple programming languages using the Aider coding assistant framework. A total of 53 AI models have been tested on it, with scores ranging from 3.6 to 88.0 out of 100.
Which model leads on Aider polyglot?
GPT-5 from OpenAI leads Aider polyglot with a score of 88.0. The median score across 53 tested models is 52.4.
Is Aider polyglot saturated?
No. The top score is 88.0 out of 100 (88%), so there is still meaningful room for improvement on Aider polyglot.
Does Aider polyglot predict performance on other benchmarks?
Yes. Aider polyglot scores have a Pearson correlation of 0.96 with OpenCompass · MMLU-Pro across 8 shared models. Models that do well on Aider polyglot tend to do well on OpenCompass · MMLU-Pro.
How often is Aider polyglot data refreshed?
BenchGecko pulls updates daily. New model scores on Aider polyglot appear as soon as they are published by Epoch AI or the model provider.
- Category: Code
- Max score: 100
- Models: 53
- Updated: 2025-12-01
Top on Aider polyglot
- GPT-5 · 88.0
- GPT-5 Chat · 88.0
- o3 Pro · 84.9
- Gemini 2.5 Pro · 83.1
- Gemini 2.5 Pro Preview 06-05 · 83.1
More code benchmarks
Same category · related evaluations