Benchmark · Knowledge

Chatbot Arena Elo · Overall

Updated: 2026-04-07
Models tested: 113
Top score: 1502.8 · Claude Opus 4.6 (Fast)
Median: 1363.2 · min 970.9
Top-5 spread: σ 9.9 · wide open
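The page does not say how the top-5 spread is computed. As a rough sketch, assuming σ is the population standard deviation of the five highest scores in the leaderboard, the figure can be reproduced as follows (the function name `top_k_spread` is made up for this illustration):

```python
from statistics import pstdev

# Five highest scores, copied from the leaderboard on this page.
top5 = [1502.8, 1496.6, 1492.6, 1486.2, 1473.9]


def top_k_spread(scores, k=5):
    """Population standard deviation of the k highest scores.

    Assumption: the page's sigma is a population (not sample) standard
    deviation; the page itself does not state which one it uses.
    """
    top = sorted(scores, reverse=True)[:k]
    return pstdev(top)


spread = top_k_spread(top5)
print(f"top-5 spread: {spread:.1f}")
```

Under this assumption the result rounds to the 9.9 shown above. A sample standard deviation (`statistics.stdev`) would come out closer to 11.0, which suggests the population form is the one in use, though that is an inference rather than something the page confirms.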

Best score over time · one chart, every benchmark

[Chart: Chatbot Arena Elo · Overall · 102 models · frontier running max. X-axis: release date, Apr 2024 to Apr 2026. Y-axis: score, 0 to 1600. Source: benchgecko.ai/benchmark/arena-elo-overall]
The frontier on Chatbot Arena Elo · Overall rose from 1345.1 to 1502.8 in 23 months (+157.7 points). The latest leader is Claude Opus 4.6 (Fast) from Anthropic.
Pink dots = frontier records · 9 total
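A "frontier running max" can be read as the best score seen so far when models are ordered by release date. The sketch below illustrates that idea only. The function name `frontier_records` is hypothetical, the first two dates are taken from the GPT-4o version labels in the leaderboard, and the third date is a placeholder because the page does not list release dates.

```python
from datetime import date


def frontier_records(releases):
    """Return the (release_date, score) pairs that set a new running maximum."""
    records = []
    best = float("-inf")
    for released, score in sorted(releases):
        if score > best:
            best = score
            records.append((released, score))
    return records


sample = [
    (date(2024, 5, 13), 1345.1),  # GPT-4o (2024-05-13)
    (date(2024, 8, 6), 1334.3),   # GPT-4o (2024-08-06), below the frontier
    (date(2026, 4, 1), 1502.8),   # placeholder date for the current leader
]

records = frontier_records(sample)
gain = records[-1][1] - records[0][1]
print(f"{len(records)} records, frontier gain {gain:.1f}")
```

With these three entries only the first and last set a record, and the gain between them matches the +157.7 points quoted above.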

Where models cluster

Score distribution · score bucket (0 to 1600) vs. models · median 1363.2 · benchgecko.ai

Score bucket | Models
0–160 | 0
160–320 | 0
320–480 | 0
480–640 | 0
640–800 | 0
800–960 | 0
960–1120 | 2
1120–1280 | 14
1280–1440 | 82
1440–1600 | 15
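The distribution above uses ten equal buckets of width 160 across the 0 to 1600 axis. A minimal sketch of that bucketing follows, assuming each bucket includes its lower bound and excludes its upper bound (the page does not say how boundary values are handled); `bucket_label` is a name invented for this example.

```python
from collections import Counter

BUCKET_WIDTH = 160  # read off the bucket labels on this page
AXIS_MAX = 1600     # upper end of the chart axis


def bucket_label(score):
    """Map a score to its bucket label, e.g. 1363.2 -> '1280-1440'."""
    last_index = AXIS_MAX // BUCKET_WIDTH - 1
    # Clamp so a score exactly at the axis maximum lands in the last bucket.
    index = min(int(score // BUCKET_WIDTH), last_index)
    low = index * BUCKET_WIDTH
    return f"{low}-{low + BUCKET_WIDTH}"


# A few scores copied from the leaderboard: leader, median, rank 98, minimum.
scores = [1502.8, 1363.2, 1275.5, 970.9]
counts = Counter(bucket_label(s) for s in scores)
for label, n in counts.items():
    print(label, n)
```

Running the same function over all 113 leaderboard scores should give the counts listed above, provided the boundary assumption holds.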

Pearson r · original research

Correlation analysis

Benchmarks that track with Chatbot Arena Elo · Overall

Pearson correlation across models scored on both benchmarks. Values closer to 1 indicate a stronger positive linear relationship.
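For readers who want to check a correlation themselves, here is a small sketch of Pearson's r over paired scores. The input lists are placeholder numbers chosen for illustration, not real benchmark results, and `pearson_r` is not a BenchGecko function.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Placeholder values only: Elo-like ratings paired with a percentage score.
arena = [1200.0, 1300.0, 1400.0, 1500.0]
other = [40.0, 55.0, 80.0, 90.0]
r = pearson_r(arena, other)
print(f"r = {r:.2f}")
```

Only models that appear on both benchmarks can be paired, so the number of shared models limits how much weight a given r deserves.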

113 models tested · sorted by score

# | Provider | Model | Score
1 | Anthropic | Claude Opus 4.6 (Fast) | 1502.8
2 | Anthropic | Claude Opus 4.6 | 1496.6
3 | Google DeepMind | Gemini 3.1 Pro Preview | 1492.6
4 | Google DeepMind | Gemini 3 Pro | 1486.2
5 | Google DeepMind | Gemini 3 Flash Preview | 1473.9
6 | Anthropic | Claude Opus 4.5 | 1467.7
7 | z-ai | GLM 5.1 | 1467.4
8 | OpenAI | GPT-5.4 | 1465.8
9 | Anthropic | Claude Sonnet 4.6 | 1462.2
10 | z-ai | GLM 5 | 1455.6
11 | Google DeepMind | Gemma 4 31B (free) | 1451.2
12 | Google DeepMind | Gemini 2.5 Pro | 1448.2
13 | Alibaba Qwen | Qwen3.5 397B A17B | 1447.7
14 | xiaomi | MiMo-V2-Pro | 1445.0
15 | z-ai | GLM 4.7 | 1442.7
16 | OpenAI | GPT-5.2 | 1439.5
17 | OpenAI | GPT-5.1 | 1438.5
18 | Google DeepMind | Gemma 4 26B A4B (free) | 1437.9
19 | Google DeepMind | Gemini 3.1 Flash Lite Preview | 1435.5
20 | OpenAI | GPT-5 Chat | 1426.0
21 | z-ai | GLM 4.6 | 1425.8
22 | DeepSeek | DeepSeek V3.2 | 1424.4
23 | DeepSeek | DeepSeek V3.2 Exp | 1422.8
24 | Alibaba Qwen | Qwen3 235B A22B Instruct 2507 | 1422.6
25 | DeepSeek | R1 0528 | 1421.7
26 | DeepSeek | DeepSeek V3.1 | 1417.9
27 | Alibaba Qwen | Qwen3.5-122B-A10B | 1416.8
28 | Alibaba Qwen | Qwen3 VL 235B A22B Instruct | 1415.8
29 | DeepSeek | DeepSeek V3.1 Terminus | 1415.7
30 | Google DeepMind | Gemini 2.5 Flash | 1411.0
31 | z-ai | GLM 4.5 | 1410.9
32 | minimax | MiniMax M2.5 | 1404.4
33 | Alibaba Qwen | Qwen3.5-27B | 1403.9
34 | minimax | MiniMax M2.7 | 1403.0
35 | Alibaba Qwen | Qwen3 Next 80B A3B Instruct | 1401.6
36 | meituan | LongCat Flash Chat | 1401.1
37 | Alibaba Qwen | Qwen3.5-Flash | 1400.6
38 | Alibaba Qwen | Qwen3 235B A22B Thinking 2507 | 1399.8
39 | DeepSeek | R1 | 1397.5
40 | Alibaba Qwen | Qwen3.5-35B-A3B | 1396.8
41 | Alibaba Qwen | Qwen3 VL 235B A22B Thinking | 1395.5
42 | DeepSeek | DeepSeek V3 0324 | 1394.6
43 | xiaomi | MiMo-V2-Flash | 1392.1
44 | stepfun | Step 3.5 Flash | 1391.4
45 | OpenAI | o1-preview | 1387.7
46 | Alibaba Qwen | Qwen3 30B A3B Instruct 2507 | 1383.1
47 | z-ai | GLM 4.6V | 1377.9
48 | arcee-ai | Trinity Large Preview (free) | 1375.1
49 | Alibaba Qwen | Qwen3 235B A22B | 1374.4
50 | Alibaba Qwen | Qwen2.5-Max | 1374.2
51 | z-ai | GLM 4.5 Air | 1372.8
52 | Anthropic | Claude 3.5 Sonnet | 1371.4
53 | Alibaba Qwen | Qwen3 Next 80B A3B Thinking | 1369.0
54 | z-ai | GLM 4.7 Flash | 1368.7
55 | Google DeepMind | Gemma 3 27B | 1365.1
56 | minimax | MiniMax M1 | 1363.3
57 | OpenAI | o3 Mini High | 1363.2
58 | Google DeepMind | Gemini 2.0 Flash | 1360.0
59 | DeepSeek | DeepSeek V3 | 1358.2
60 | xAI | Grok 3 Mini Beta | 1357.4
61 | prime-intellect | INTELLECT-3 | 1356.2
62 | OpenAI | gpt-oss-120b | 1353.8
63 | z-ai | GLM 4.5V | 1353.3
64 | OpenAI | o3 Mini | 1347.5
65 | Alibaba Qwen | Qwen3 32B | 1347.0
66 | inception | Mercury 2 | 1347.0
67 | NVIDIA | Llama 3.1 Nemotron Ultra 253B v1 | 1346.8
68 | minimax | MiniMax M2 | 1346.6
69 | OpenAI | GPT-4o (2024-05-13) | 1345.1
70 | Google DeepMind | Gemma 3 12B | 1341.4
71 | Amazon | Nova 2 Lite | 1337.7
72 | OpenAI | o1-mini | 1336.6
73 | Alibaba Qwen | QwQ 32B | 1335.7
74 | OpenAI | GPT-4o (2024-08-06) | 1334.3
75 | allenai | Olmo 3.1 32B Instruct | 1330.9
76 | Alibaba Qwen | Qwen3 30B A3B | 1327.3
77 | allenai | Molmo2 8B | 1326.3
78 | Google DeepMind | Gemini 1.5 Pro (Feb 2024) | 1322.5
79 | Meta | Llama 4 Scout 17B 16E Instruct | 1321.9
80 | Google DeepMind | Gemma 3n 4B | 1318.0
81 | Meta | Llama 3.3 70B Instruct | 1318.0
82 | NVIDIA | NVIDIA Nemotron 3 Nano 30B A3B BF16 | 1317.8
83 | OpenAI | gpt-oss-20b | 1317.7
84 | OpenAI | GPT-4o-mini (2024-07-18) | 1317.2
85 | Mistral AI | Mistral Large 2407 | 1313.3
86 | OpenAI | GPT-4 Turbo (older v1106) | 1312.0
87 | inception | Mercury | 1306.1
88 | allenai | Olmo 3 32B Think | 1305.3
89 | Mistral AI | Mistral Large 2411 | 1304.7
90 | Google DeepMind | Gemma 3 4B | 1302.8
91 | Alibaba Qwen | Qwen2.5 72B Instruct | 1302.3
92 | NVIDIA | Llama 3.1 Nemotron 70B Instruct | 1298.5
93 | Meta | Llama 3.1 70B Instruct | 1292.8
94 | Google DeepMind | Gemma 2 27B | 1287.6
95 | OpenAI | GPT-4 (older v0314) | 1285.8
96 | allenai | Olmo 3.1 32B Think | 1285.6
97 | Google DeepMind | Gemini 1.5 Flash (May 2024) | 1285.1
98 | Cohere | Command R+ (08-2024) | 1275.5
99 | Meta | Llama 3 70B Instruct | 1275.1
100 | Mistral AI | Mistral Small 3 | 1273.5
101 | Alibaba Qwen | Qwen2.5 Coder 32B Instruct | 1269.9
102 | Google DeepMind | Gemma 2 9B | 1265.0
103 | Microsoft | Phi 4 | 1255.4
104 | allenai | Olmo 2 32B Instruct | 1251.2
105 | Cohere | Command R (08-2024) | 1249.1
106 | Meta | Llama 3 8B Instruct | 1222.2
107 | Meta | Llama 3.1 8B Instruct | 1211.0
108 | Google DeepMind | Gemma 2 2b It | 1198.5
109 | Meta | Llama 3.2 3B Instruct | 1165.7
110 | Mistral AI | Mistral 7B Instruct V0.2 | 1148.3
111 | Microsoft | Phi 3 Mini 4k Instruct | 1127.2
112 | Meta | Llama 3.2 1B Instruct | 1110.2
113 | Meta | LLaMA-13B | 970.9

Pulled from the Chatbot Arena Elo · Overall dataset · updated daily

What does Chatbot Arena Elo · Overall measure?

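Chatbot Arena Elo · Overall is listed as a knowledge benchmark in the BenchGecko catalog. Chatbot Arena ratings are derived from pairwise human preference votes between model responses, so a score reflects relative standing rather than a percentage of correct answers. 113 AI models have been tested on it. Scores range from 970.9 to 1502.8, plotted on a 0 to 1600 axis.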

Which model leads on Chatbot Arena Elo · Overall?

Claude Opus 4.6 (Fast) from Anthropic leads Chatbot Arena Elo · Overall with a score of 1502.8. The median score across 113 tested models is 1363.2.
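To give the gap between the leader and the median some intuition, a rating difference can be converted into an expected head-to-head win rate. The sketch below assumes the conventional Elo formula with base 10 and a 400-point scale. The page does not confirm that its scores follow this exact convention, and ties are ignored, so treat the result as indicative only.

```python
def expected_win_rate(rating_a, rating_b, scale=400.0):
    """Expected score of A against B under the standard Elo formula.

    Assumption: base-10 logistic curve with a 400-point scale, which is
    the usual Elo convention but is not stated on this page.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


leader = 1502.8        # Claude Opus 4.6 (Fast)
median_score = 1363.2  # median across the 113 tested models

p = expected_win_rate(leader, median_score)
print(f"expected win rate of leader vs. median model: {p:.2f}")
```

Under these assumptions the 139.6-point gap corresponds to the leader being preferred in roughly 69% of comparisons against a median-rated model.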

Is Chatbot Arena Elo · Overall saturated?

No · the top score is 1502.8, or 94% of the 1600 chart ceiling. Elo is a relative rating with no fixed maximum, so the scale itself does not cap further improvement on Chatbot Arena Elo · Overall.

Does Chatbot Arena Elo · Overall predict performance on other benchmarks?

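Yes, with a caveat · Chatbot Arena Elo · Overall scores correlate 0.98 (Pearson r) with GSM8K across 7 shared models. Models that do well on Chatbot Arena Elo · Overall tend to do well on GSM8K, but 7 models is a small sample, so the estimate carries wide uncertainty.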
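One way to gauge how much a correlation from 7 models can be trusted is an approximate confidence interval via the Fisher z-transform. This is a textbook approximation that assumes roughly bivariate-normal data. It is offered as a sketch and is not part of BenchGecko's published method, and the function name `fisher_interval` is invented here.

```python
from math import atanh, sqrt, tanh


def fisher_interval(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for Pearson r (Fisher z-transform).

    Assumptions: n > 3 and approximately bivariate-normal data.
    """
    if n <= 3:
        raise ValueError("need more than 3 paired observations")
    z = atanh(r)
    half_width = z_crit / sqrt(n - 3)
    return tanh(z - half_width), tanh(z + half_width)


low, high = fisher_interval(0.98, 7)
print(f"95% interval for r: {low:.2f} to {high:.3f}")
```

For r = 0.98 and n = 7 the lower bound falls to roughly 0.87, which still indicates a strong relationship but a noticeably less precise one than the headline figure suggests.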

How often is Chatbot Arena Elo · Overall data refreshed?

BenchGecko pulls updates daily. New model scores on Chatbot Arena Elo · Overall appear as soon as they are published by Epoch AI or the model provider.
