Benchmark · Knowledge

Chatbot Arena Elo · Overall

Updated 2026-04-07

Models tested · 113

Top score · 1502.8 · Claude Opus 4.6 (Fast)

Median · 1363.2 · min 970.9

Top-5 spread · σ 9.9 · wide open
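The top-5 spread can be checked against the rankings table further down this page. This is a minimal sketch, assuming σ is the population standard deviation of the five highest scores. The page does not say which estimator it uses, and the variable names are illustrative.

```python
from statistics import pstdev

# Five highest scores from the rankings table on this page.
top5 = [1502.8, 1496.6, 1492.6, 1486.2, 1473.9]

# Population standard deviation (divides by n, not n - 1).
spread = pstdev(top5)
print(round(spread, 1))  # 9.9
```

Under that assumption the result matches the σ 9.9 shown here. The sample estimator (`statistics.stdev`) would give a larger value, about 11.0.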

The Frontier

Best score over time · one chart, every benchmark

The frontier on Chatbot Arena Elo · Overall rose from 1345.1 to 1502.8 over 23 months (+157.7 points). The latest leader is Claude Opus 4.6 (Fast) from Anthropic.

Frontier records (pink dots on the chart) · 9 total
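A frontier record is a result that beats every earlier score. This is a minimal sketch of how such records could be extracted as a running maximum over dated results. The function name, dates, model names, and scores are all hypothetical, and the page's actual method is not documented here.

```python
from datetime import date

def frontier_records(results):
    """Return the (date, model, score) entries that set a new best score."""
    records = []
    best = float("-inf")
    # Walk results in chronological order and keep strict improvements only.
    for day, model, score in sorted(results, key=lambda r: r[0]):
        if score > best:
            best = score
            records.append((day, model, score))
    return records

# Hypothetical release dates and scores, for illustration only.
sample = [
    (date(2024, 1, 10), "model-a", 1300.0),
    (date(2024, 3, 5), "model-b", 1290.0),
    (date(2024, 6, 1), "model-c", 1350.0),
    (date(2024, 9, 20), "model-d", 1350.0),
    (date(2025, 2, 14), "model-e", 1410.0),
]
records = frontier_records(sample)
print([model for _, model, _ in records])
```

A tie with the current best (as with `model-d`) does not count as a new record in this sketch.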

Distribution

Where models cluster

Correlated benchmarks

Pearson r · original research

Correlation analysis

Benchmarks that track with Chatbot Arena Elo · Overall

Pearson correlation across models scored on both benchmarks. Values closer to 1 indicate a stronger positive linear relationship between the two sets of scores.

GPQA Diamond · Reasoning
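The Pearson coefficient described above can be computed from paired scores for the models that appear on both benchmarks. This is a minimal sketch using only the standard library. The helper name and the two score lists are hypothetical and are not taken from either benchmark.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical paired scores for four models, for illustration only.
arena = [1300.0, 1350.0, 1400.0, 1450.0]
other = [40.0, 55.0, 62.0, 80.0]
r = pearson_r(arena, other)
print(round(r, 2))
```

Only models scored on both benchmarks enter the calculation, so the coefficient describes the shared subset and not the full leaderboard.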

Full rankings

113 models tested · sorted by score

#	Model	Provider	Score	Price
1	Claude Opus 4.6 (Fast)	Anthropic	1502.8	$30.00
2	Claude Opus 4.6	Anthropic	1496.6	$5.00
3	Gemini 3.1 Pro Preview	Google DeepMind	1492.6	$2.00
4	Gemini 3 Pro	Google DeepMind	1486.2	—
5	Gemini 3 Flash Preview	Google DeepMind	1473.9	$0.50
6	Claude Opus 4.5	Anthropic	1467.7	$5.00
7	GLM 5.1	z-ai	1467.4	$0.95
8	GPT-5.4	OpenAI	1465.8	$2.50
9	Claude Sonnet 4.6	Anthropic	1462.2	$3.00
10	GLM 5	z-ai	1455.6	$0.72
11	Gemma 4 31B (free)	Google DeepMind	1451.2	$0.00
12	Gemini 2.5 Pro	Google DeepMind	1448.2	$1.25
13	Qwen3.5 397B A17B	Alibaba Qwen	1447.7	$0.39
14	MiMo-V2-Pro	xiaomi	1445.0	$1.00
15	GLM 4.7	z-ai	1442.7	$0.39
16	GPT-5.2	OpenAI	1439.5	$1.75
17	GPT-5.1	OpenAI	1438.5	$1.25
18	Gemma 4 26B A4B (free)	Google DeepMind	1437.9	$0.00
19	Gemini 3.1 Flash Lite Preview	Google DeepMind	1435.5	$0.25
20	GPT-5 Chat	OpenAI	1426.0	$1.25
21	GLM 4.6	z-ai	1425.8	$0.39
22	DeepSeek V3.2	DeepSeek	1424.4	$0.26
23	DeepSeek V3.2 Exp	DeepSeek	1422.8	$0.27
24	Qwen3 235B A22B Instruct 2507	Alibaba Qwen	1422.6	$0.07
25	R1 0528	DeepSeek	1421.7	$0.50
26	DeepSeek V3.1	DeepSeek	1417.9	$0.15
27	Qwen3.5-122B-A10B	Alibaba Qwen	1416.8	$0.26
28	Qwen3 VL 235B A22B Instruct	Alibaba Qwen	1415.8	$0.20
29	DeepSeek V3.1 Terminus	DeepSeek	1415.7	$0.21
30	Gemini 2.5 Flash	Google DeepMind	1411.0	$0.30
31	GLM 4.5	z-ai	1410.9	$0.60
32	MiniMax M2.5	minimax	1404.4	$0.12
33	Qwen3.5-27B	Alibaba Qwen	1403.9	$0.20
34	MiniMax M2.7	minimax	1403.0	$0.30
35	Qwen3 Next 80B A3B Instruct	Alibaba Qwen	1401.6	$0.09
36	LongCat Flash Chat	meituan	1401.1	$0.20
37	Qwen3.5-Flash	Alibaba Qwen	1400.6	$0.07
38	Qwen3 235B A22B Thinking 2507	Alibaba Qwen	1399.8	$0.15
39	R1	DeepSeek	1397.5	$0.70
40	Qwen3.5-35B-A3B	Alibaba Qwen	1396.8	$0.16
41	Qwen3 VL 235B A22B Thinking	Alibaba Qwen	1395.5	$0.26
42	DeepSeek V3 0324	DeepSeek	1394.6	$0.20
43	MiMo-V2-Flash	xiaomi	1392.1	$0.09
44	Step 3.5 Flash	stepfun	1391.4	$0.10
45	o1-preview	OpenAI	1387.7	—
46	Qwen3 30B A3B Instruct 2507	Alibaba Qwen	1383.1	$0.09
47	GLM 4.6V	z-ai	1377.9	$0.30
48	Trinity Large Preview (free)	arcee-ai	1375.1	$0.00
49	Qwen3 235B A22B	Alibaba Qwen	1374.4	$0.46
50	Qwen2.5-Max	Alibaba Qwen	1374.2	—
51	GLM 4.5 Air	z-ai	1372.8	$0.13
52	Claude 3.5 Sonnet	Anthropic	1371.4	—
53	Qwen3 Next 80B A3B Thinking	Alibaba Qwen	1369.0	$0.10
54	GLM 4.7 Flash	z-ai	1368.7	$0.06
55	Gemma 3 27B	Google DeepMind	1365.1	$0.08
56	MiniMax M1	minimax	1363.3	$0.40
57	o3 Mini High	OpenAI	1363.2	$1.10
58	Gemini 2.0 Flash	Google DeepMind	1360.0	$0.10
59	DeepSeek V3	DeepSeek	1358.2	$0.32
60	Grok 3 Mini Beta	xAI	1357.4	$0.30
61	INTELLECT-3	prime-intellect	1356.2	$0.20
62	gpt-oss-120b	OpenAI	1353.8	$0.04
63	GLM 4.5V	z-ai	1353.3	$0.60
64	o3 Mini	OpenAI	1347.5	$1.10
65	Qwen3 32B	Alibaba Qwen	1347.0	$0.08
66	Mercury 2	inception	1347.0	$0.25
67	Llama 3.1 Nemotron Ultra 253B v1	NVIDIA	1346.8	$0.60
68	MiniMax M2	minimax	1346.6	$0.26
69	GPT-4o (2024-05-13)	OpenAI	1345.1	$5.00
70	Gemma 3 12B	Google DeepMind	1341.4	$0.04
71	Nova 2 Lite	Amazon	1337.7	$0.30
72	o1-mini	OpenAI	1336.6	—
73	QwQ 32B	Alibaba Qwen	1335.7	$0.15
74	GPT-4o (2024-08-06)	OpenAI	1334.3	$2.50
75	Olmo 3.1 32B Instruct	allenai	1330.9	$0.20
76	Qwen3 30B A3B	Alibaba Qwen	1327.3	$0.08
77	Molmo2 8B	allenai	1326.3	—
78	Gemini 1.5 Pro (Feb 2024)	Google DeepMind	1322.5	—
79	Llama 4 Scout 17B 16E Instruct	Meta	1321.9	—
80	Gemma 3n 4B	Google DeepMind	1318.0	$0.02
81	Llama 3.3 70B Instruct	Meta	1318.0	$0.10
82	NVIDIA Nemotron 3 Nano 30B A3B BF16	NVIDIA	1317.8	—
83	gpt-oss-20b	OpenAI	1317.7	$0.03
84	GPT-4o-mini (2024-07-18)	OpenAI	1317.2	$0.15
85	Mistral Large 2407	Mistral AI	1313.3	$2.00
86	GPT-4 Turbo (older v1106)	OpenAI	1312.0	$10.00
87	Mercury	inception	1306.1	$0.25
88	Olmo 3 32B Think	allenai	1305.3	$0.15
89	Mistral Large 2411	Mistral AI	1304.7	$2.00
90	Gemma 3 4B	Google DeepMind	1302.8	$0.04
91	Qwen2.5 72B Instruct	Alibaba Qwen	1302.3	$0.12
92	Llama 3.1 Nemotron 70B Instruct	NVIDIA	1298.5	$1.20
93	Llama 3.1 70B Instruct	Meta	1292.8	$0.40
94	Gemma 2 27B	Google DeepMind	1287.6	$0.65
95	GPT-4 (older v0314)	OpenAI	1285.8	$30.00
96	Olmo 3.1 32B Think	allenai	1285.6	$0.15
97	Gemini 1.5 Flash (May 2024)	Google DeepMind	1285.1	—
98	Command R+ (08-2024)	Cohere	1275.5	$2.50
99	Llama 3 70B Instruct	Meta	1275.1	$0.51
100	Mistral Small 3	Mistral AI	1273.5	$0.05
101	Qwen2.5 Coder 32B Instruct	Alibaba Qwen	1269.9	$0.66
102	Gemma 2 9B	Google DeepMind	1265.0	$0.03
103	Phi 4	Microsoft	1255.4	$0.07
104	Olmo 2 32B Instruct	allenai	1251.2	$0.05
105	Command R (08-2024)	Cohere	1249.1	$0.15
106	Llama 3 8B Instruct	Meta	1222.2	$0.03
107	Llama 3.1 8B Instruct	Meta	1211.0	$0.02
108	Gemma 2 2b It	Google DeepMind	1198.5	—
109	Llama 3.2 3B Instruct	Meta	1165.7	$0.05
110	Mistral 7B Instruct V0.2	Mistral AI	1148.3	—
111	Phi 3 Mini 4k Instruct	Microsoft	1127.2	—
112	Llama 3.2 1B Instruct	Meta	1110.2	$0.03
113	LLaMA-13B	Meta	970.9	—

Frequently asked

Pulled from the Chatbot Arena Elo · Overall dataset · updated daily

What does Chatbot Arena Elo · Overall measure?

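Chatbot Arena Elo · Overall rates models by human preference: ratings are derived from head-to-head comparisons in which users vote for the better of two anonymous responses. BenchGecko files it under the knowledge category. 113 AI models have been tested on it, with scores ranging from 970.9 to 1502.8. Elo is a relative scale with no fixed maximum; this page uses 1600 as its reference ceiling.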
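An Elo gap is usually read as an expected head-to-head win rate, not as a fraction of a maximum. This is a minimal sketch, assuming the conventional Elo logistic curve with base 10 and a scale of 400. This page does not state the exact rating model behind these scores, and the function name is illustrative.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Expected win probability of A over B under the standard Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))

# Top score and median reported on this page.
top, median = 1502.8, 1363.2
p = expected_score(top, median)
print(f"{p:.2f}")  # about 0.69
```

Under that assumption, the 139.6-point gap between the leader and the median model corresponds to the leader being preferred in roughly 69% of direct comparisons.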

Which model leads on Chatbot Arena Elo · Overall?

Claude Opus 4.6 (Fast) from Anthropic leads Chatbot Arena Elo · Overall with a score of 1502.8. The median score across 113 tested models is 1363.2.

Is Chatbot Arena Elo · Overall saturated?

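No · the top score is 1502.8, which is 94% of the 1600 reference ceiling used on this page. Because Elo ratings are relative and have no hard upper bound, the leading score on Chatbot Arena Elo · Overall can keep rising as stronger models enter the pool.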

Does Chatbot Arena Elo · Overall predict performance on other benchmarks?

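Yes · Chatbot Arena Elo · Overall scores correlate 0.98 (Pearson r) with GSM8K across 7 shared models. Within that group, models that do well on Chatbot Arena Elo · Overall tend to do well on GSM8K. With only 7 models in the overlap, treat the figure as indicative rather than precise.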
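To gauge how much weight a correlation over 7 models can bear, one common approach is a confidence interval from the Fisher z-transformation. This is a minimal sketch, assuming the paired scores are roughly bivariate normal and using a 95% level. The helper name is illustrative and this is not something the page itself reports.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson r from n pairs."""
    if n <= 3:
        raise ValueError("need more than 3 pairs")
    z = math.atanh(r)                # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)      # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# r and n as reported in the answer above.
low, high = fisher_ci(0.98, 7)
print(f"{low:.2f} to {high:.3f}")
```

Under these assumptions the interval runs from roughly 0.87 to just under 1.0, so the relationship looks strong but the point estimate is loosely pinned down.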

How often is Chatbot Arena Elo · Overall data refreshed?

BenchGecko pulls updates daily. New model scores on Chatbot Arena Elo · Overall appear as soon as they are published by Epoch AI or the model provider.