Aider polyglot

Aider Polyglot measures how well AI models can edit code across multiple programming languages using the Aider coding assistant framework.

Models Tested: 55
Top Score: 88.0
Average Score: 44.0
Rank  Organization     Score
1     OpenAI           88.0
2     OpenAI           88.0
3     OpenAI           84.9
4     Google DeepMind  83.1
5     OpenAI           81.3
6     xAI              79.6
7     DeepSeek         74.2
8     DeepSeek         74.2
9     Anthropic        72.0
10    OpenAI           72.0
11    DeepSeek         71.4
12    Anthropic        64.9
13    Anthropic        64.9
14    OpenAI           61.7
15    Anthropic        61.3
16    OpenAI           60.4
17    Alibaba Qwen     59.6
18    Alibaba          59.6
19    moonshotai       59.1
20    moonshotai       59.1
21    DeepSeek         56.9
22    xAI              53.3
23    xAI              53.3
24    OpenAI           52.4
25    xAI              49.3
26    xAI              49.3
27    DeepSeek         48.4
28    Google DeepMind  47.1
29    OpenAI           44.9
30    OpenAI           41.8
31    OpenAI           41.8
32    OpenAI           41.8
33    OpenAI           41.8
34    Google DeepMind  38.2
35    Google           35.6
36    OpenAI           32.9
37    OpenAI           32.4
38    Anthropic        28.0
39    OpenAI           23.1
40    OpenAI           23.1
41    OpenAI           23.1
42    OpenAI           23.1
43    Alibaba          21.8
44    OpenAI           18.2
45    Google           18.2
46    Meta             15.6
47    OpenAI           8.9
48    Google DeepMind  4.9
49    Google DeepMind  4.9
50    Google DeepMind  4.9
51    Google DeepMind  4.9
52    Google DeepMind  4.9
53    Google DeepMind  4.9
54    OpenAI           3.6
55    OpenAI           3.6
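
The summary figures at the top of the page (55 models tested, top score 88.0, average score 44.0) can be reproduced from the scores listed above. The following is a minimal sketch in Python, assuming the average is a plain arithmetic mean rounded to one decimal place; the page does not state how it is computed, and the variable names are illustrative only.

```python
from statistics import mean

# Scores copied from the leaderboard above, in rank order (ranks 1 to 55).
scores = [
    88.0, 88.0, 84.9, 83.1, 81.3, 79.6, 74.2, 74.2, 72.0, 72.0,
    71.4, 64.9, 64.9, 61.7, 61.3, 60.4, 59.6, 59.6, 59.1, 59.1,
    56.9, 53.3, 53.3, 52.4, 49.3, 49.3, 48.4, 47.1, 44.9, 41.8,
    41.8, 41.8, 41.8, 38.2, 35.6, 32.9, 32.4, 28.0, 23.1, 23.1,
    23.1, 23.1, 21.8, 18.2, 18.2, 15.6, 8.9, 4.9, 4.9, 4.9,
    4.9, 4.9, 4.9, 3.6, 3.6,
]

models_tested = len(scores)             # number of leaderboard entries
top_score = max(scores)                 # highest score on the board
average_score = round(mean(scores), 1)  # assumed: unweighted mean, one decimal

print(models_tested, top_score, average_score)  # prints: 55 88.0 44.0
```

Under that assumption the computed values agree with the summary shown on the page.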