Rank	Model · Provider	Score	Price
1	GPT-5.4· OpenAI	69.4	$2.50
2	Claude Opus 4.6 (Fast)· Anthropic	67.6	$30.00
3	GLM 5.1· z-ai	67.0	$0.95
4	GLM 5 Turbo· z-ai	63.1	$1.20
5	Claude Sonnet 4.6· Anthropic	63.0	$3.00
6	MiMo-V2-Pro· xiaomi	62.8	$1.00
7	GPT-5.3-Codex· OpenAI	62.2	$1.75
8	Muse Spark· Unknown	62.0	—
9	Qwen3.6 Plus· Alibaba Qwen	61.7	$0.33
10	MiniMax M2.7· minimax	61.5	$0.30
11	GLM 5V Turbo· z-ai	61.1	$1.20
12	Gemini 3.1 Pro Preview· Google DeepMind	59.1	$2.00
13	Kimi K2.5· moonshotai	58.9	$0.38
14	MiMo-V2-Omni· xiaomi	58.6	$0.40
15	Qwen3.5 397B A17B· Alibaba Qwen	55.8	$0.39
16	GPT-5.4 Mini· OpenAI	55.7	$0.75
17	Qwen3.5-27B· Alibaba Qwen	54.6	$0.20
18	Qwen3.5-122B-A10B· Alibaba Qwen	53.0	$0.26
19	DeepSeek V3.2· DeepSeek	52.9	$0.26
20	Step 3.5 Flash· stepfun	52.0	$0.10
21	Qwen3 Max Thinking· Alibaba Qwen	50.1	$0.78
22	Gemini 3 Flash Preview· Google DeepMind	49.7	$0.50
23	Grok 4.1 Fast· xAI	49.3	$0.20
24	GPT-5.4 Nano· OpenAI	49.3	$0.20
25	MiMo-V2-Flash· xiaomi	48.8	$0.09
26	Gemini 3 Pro· Google DeepMind	45.0	—
27	Qwen3.5-35B-A3B· Alibaba Qwen	44.1	$0.16
28	Trinity Large Thinking· arcee-ai	42.6	$0.22
29	Qwen3 Coder Next· Alibaba Qwen	42.1	$0.15
30	Gemma 4 31B (free)· Google DeepMind	40.9	$0.00
31	Mercury 2· inception	39.7	$0.25
32	gpt-oss-120b (free)· OpenAI	37.9	$0.00
33	Qwen3.5-9B· Alibaba Qwen	37.4	$0.05
34	o3· OpenAI	36.1	$2.00
35	Grok Code Fast 1· xAI	35.6	$0.20
36	Solar Pro 3· upstage	34.9	$0.15
37	Gemini 2.5 Pro· Google DeepMind	32.7	$1.25
38	Qwen3.5 4B· Alibaba	32.5	—
39	Gemma 4 26B A4B (free)· Google DeepMind	32.1	$0.00
40	gpt-oss-20b (free)· OpenAI	27.6	$0.00
41	Gemini 3.1 Flash Lite Preview· Google DeepMind	25.7	$0.25
42	Mistral Medium 3.1· Mistral AI	25.3	$0.40
43	Qwen3 Next 80B A3B Instruct (free)· Alibaba Qwen	23.6	$0.00
44	Mistral Small 4· Mistral AI	23.4	$0.15
45	Qwen3.5 2B· Alibaba	23.0	—
46	R1 0528· DeepSeek	20.8	$0.50
47	INTELLECT-3· prime-intellect	19.8	$0.20
48	Qwen3 Coder 480B A35B (free)· Alibaba Qwen	18.3	$0.00
49	Qwen3.5 0.8B· Alibaba	15.9	—
50	Qwen3 Next 80B A3B Instruct· Alibaba Qwen	14.2	$0.09
51	Gemini 2.5 Flash Lite· Google DeepMind	11.7	$0.10
52	NVIDIA Nemotron Nano 9B V2· NVIDIA	9.4	—
53	Llama 4 Maverick· Meta	7.2	$0.15
54	Nanbeige4.1 3B· Nanbeige	7.2	—
55	LFM2.5-1.2B-Thinking (free)· liquid	6.5	$0.00
56	Llama 4 Scout· Meta	5.2	$0.08
57	Command A· Cohere	5.1	$2.50
58	Granite 4.0 Micro· ibm-granite	4.2	$0.02
59	Llama 3.1 Nemotron Ultra 253B v1· NVIDIA	3.8	$0.60
60	LFM2-24B-A2B· liquid	3.7	$0.03
61	LFM2.5-1.2B-Instruct (free)· liquid	3.6	$0.00
62	Phi 4 Mini Instruct· Microsoft	2.7	—

Frequently asked

Pulled from the Artificial Analysis · Agentic Index dataset · updated daily

What does Artificial Analysis · Agentic Index measure?

Artificial Analysis · Agentic Index is a knowledge benchmark in the BenchGecko catalog. 62 AI models have been tested on it, with scores ranging from 2.7 to 69.4.

Which model leads on Artificial Analysis · Agentic Index?

GPT-5.4 from OpenAI leads Artificial Analysis · Agentic Index with a score of 69.4. The median score across 62 tested models is 38.8.
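
The leader and median quoted above can be reproduced from the table. The sketch below copies the 62 scores from the leaderboard in rank order; the variable names are illustrative and not part of any BenchGecko API.

```python
from statistics import median

# Scores copied from the leaderboard table, in rank order (62 models).
scores = [
    69.4, 67.6, 67.0, 63.1, 63.0, 62.8, 62.2, 62.0, 61.7, 61.5,
    61.1, 59.1, 58.9, 58.6, 55.8, 55.7, 54.6, 53.0, 52.9, 52.0,
    50.1, 49.7, 49.3, 49.3, 48.8, 45.0, 44.1, 42.6, 42.1, 40.9,
    39.7, 37.9, 37.4, 36.1, 35.6, 34.9, 32.7, 32.5, 32.1, 27.6,
    25.7, 25.3, 23.6, 23.4, 23.0, 20.8, 19.8, 18.3, 15.9, 14.2,
    11.7, 9.4, 7.2, 7.2, 6.5, 5.2, 5.1, 4.2, 3.8, 3.7,
    3.6, 2.7,
]

top_score = max(scores)
low_score = min(scores)

# With an even number of entries, the median is the mean of the
# two middle values (ranks 31 and 32: 39.7 and 37.9).
median_score = round(median(scores), 1)

print(len(scores), top_score, low_score, median_score)
```

With 62 entries the median falls between ranks 31 and 32, which is why it does not match any single model's score.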

Is Artificial Analysis · Agentic Index saturated?

Not clearly · the top model on Artificial Analysis · Agentic Index has reached 69.4. The stated ceiling of 60 is lower than that score, so the "within 5% of the theoretical ceiling" figure cannot be checked against the table, and the saturation claim depends on the benchmark's actual maximum.
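
A minimal sketch of the "within 5% of the ceiling" test, assuming saturation means the gap between the top score and the maximum is at most 5% of the maximum. The function names, the 5% default, and the 100-point ceiling in the usage lines are assumptions for illustration; BenchGecko's actual rule is not documented here.

```python
def headroom_fraction(top_score: float, ceiling: float) -> float:
    """Return the remaining headroom as a fraction of the ceiling."""
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    return (ceiling - top_score) / ceiling


def is_saturated(top_score: float, ceiling: float, threshold: float = 0.05) -> bool:
    """True when the top score is within `threshold` of the ceiling."""
    gap = headroom_fraction(top_score, ceiling)
    if gap < 0:
        # A score above the ceiling means the ceiling itself is wrong.
        raise ValueError("top score exceeds the stated ceiling")
    return gap <= threshold


# A ceiling of 60 gives negative headroom for a top score of 69.4.
gap_at_60 = headroom_fraction(69.4, 60)

# Hypothetical 100-point ceiling, used only to show the check.
saturated_at_100 = is_saturated(69.4, 100)
```

A negative headroom signals that the score and the ceiling are on different scales, so the check raises an error instead of returning an answer.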

Does Artificial Analysis · Agentic Index predict performance on other benchmarks?

Yes · Artificial Analysis · Agentic Index scores correlate 0.96 with GeoBench across 5 shared models. Models that do well on Artificial Analysis · Agentic Index tend to do well on GeoBench, though five models is a small sample and the estimate may shift as more shared results appear.
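
A correlation like the one above is typically a Pearson coefficient computed over the models that appear on both benchmarks. The sketch below assumes that method. The page does not say which coefficient or which five models were used, so the `pearson` helper and the sample values are hypothetical, not benchmark data.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up scores for five hypothetical shared models (not real results).
benchmark_a = [10.0, 20.0, 30.0, 40.0, 50.0]
benchmark_b = [12.0, 19.0, 33.0, 38.0, 52.0]

r = pearson(benchmark_a, benchmark_b)
print(f"r = {r:.2f}")
```

With only five points, one model changing position can move the coefficient noticeably, which is worth keeping in mind when reading a figure such as 0.96.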

How often is Artificial Analysis · Agentic Index data refreshed?

BenchGecko pulls updates daily. New model scores on Artificial Analysis · Agentic Index appear as soon as they are published by Artificial Analysis or the model provider.