#	Model	Provider	Score	Price
1	GLM 5	z-ai	86.2	$0.72
2	Step 3.5 Flash	stepfun	83.9	$0.10
3	GLM 4.7	z-ai	83.8	$0.39
4	Qwen3.5 397B A17B	Alibaba Qwen	83.0	$0.39
5	DeepSeek V3.2 Speciale	DeepSeek	80.9	$0.40
6	Kimi K2.5	moonshotai	80.6	$0.38
7	gpt-oss-120b (free)	OpenAI	78.4	$0.00
8	GLM 4.6	z-ai	78.2	$0.39
9	Kimi K2 Thinking	moonshotai	77.1	$0.60
10	DeepSeek V3.2	DeepSeek	75.4	$0.26
11	MiniMax M2	minimax	74.0	$0.26
12	MiniMax M2.5	minimax	73.6	$0.12
13	MiMo-V2-Flash	xiaomi	71.8	$0.09
14	Gemini 2.5 Pro	Google DeepMind	71.3	$1.25
15	Qwen3 235B A22B Thinking 2507	Alibaba Qwen	70.6	$0.15
16	gpt-oss-20b (free)	OpenAI	68.4	$0.00
17	Qwen3 Next 80B A3B Thinking	Alibaba Qwen	66.3	$0.10
18	GLM 4.5	z-ai	65.0	$0.60
19	R1 0528	DeepSeek	61.0	$0.50
20	Qwen3 30B A3B Thinking 2507	Alibaba Qwen	59.4	$0.08
21	Qwen3 32B	Alibaba Qwen	57.6	$0.08
22	Qwen3 Next 80B A3B Instruct	Alibaba Qwen	54.8	$0.09
23	Qwen3 4B Thinking 2507	Alibaba	51.6	—
24	Qwen3 8B	Alibaba Qwen	50.1	$0.05
25	LongCat Flash Chat	meituan	48.2	$0.20
26	Claude Sonnet 4	Anthropic	47.5	$3.00
27	ERNIE 4.5 21B A3B Thinking	baidu	47.0	$0.07
28	Hunyuan A13B Instruct	tencent	43.9	$0.14
29	Qwen3 235B A22B Instruct 2507	Alibaba Qwen	43.0	$0.07
30	Qwen3 30B A3B Instruct 2507	Alibaba Qwen	38.7	$0.09
31	Qwen3 4B Instruct 2507	Alibaba	33.5	—
32	Gemma 3 27B	Google DeepMind	30.8	$0.08

Frequently asked

Pulled from the OpenCompass · LiveCodeBenchV6 dataset · updated daily

What does OpenCompass · LiveCodeBenchV6 measure?

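OpenCompass · LiveCodeBenchV6 reports LiveCodeBench (version 6) results as evaluated by OpenCompass. LiveCodeBench is a code-generation benchmark, although the BenchGecko catalog files this entry under knowledge benchmarks. 32 AI models have been tested on it, with scores ranging from 30.8 to 86.2 out of 100.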

Which model leads on OpenCompass · LiveCodeBenchV6?

GLM 5 from z-ai leads OpenCompass · LiveCodeBenchV6 with a score of 86.2. The median score across 32 tested models is 67.3.
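
The summary figures quoted in these answers (model count, score range, leader, median) can be re-derived from the leaderboard above. The following is a minimal sketch in Python, assuming the 32 scores are copied by hand from the table; the variable names are illustrative and not part of any BenchGecko API.

```python
from statistics import median

# Scores copied from the leaderboard above, in rank order (1 to 32).
scores = [
    86.2, 83.9, 83.8, 83.0, 80.9, 80.6, 78.4, 78.2,
    77.1, 75.4, 74.0, 73.6, 71.8, 71.3, 70.6, 68.4,
    66.3, 65.0, 61.0, 59.4, 57.6, 54.8, 51.6, 50.1,
    48.2, 47.5, 47.0, 43.9, 43.0, 38.7, 33.5, 30.8,
]

n_models = len(scores)
top, bottom = max(scores), min(scores)

# With an even number of models, the median is the mean of the two
# middle values (ranks 16 and 17 here: 68.4 and 66.3).
mid = median(scores)

print(f"models={n_models} range={bottom}-{top} median={mid:.2f}")
```

The midpoint of ranks 16 and 17 is roughly 67.35, in line with the 67.3 quoted above; the last digit depends on how the page rounds.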

Is OpenCompass · LiveCodeBenchV6 saturated?

No · the top score is 86.2 out of 100, which leaves 13.8 points of headroom on OpenCompass · LiveCodeBenchV6.

Does OpenCompass · LiveCodeBenchV6 predict performance on other benchmarks?

Tentatively yes · OpenCompass · LiveCodeBenchV6 scores correlate 0.96 with Fiction.LiveBench, but across only 6 shared models. On that small sample, models that do well on OpenCompass · LiveCodeBenchV6 tend to do well on Fiction.LiveBench.
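
For readers who want to check a figure like this against their own data, here is a minimal sketch of a correlation coefficient in Python. It assumes the quoted value is a Pearson correlation, which the page does not state. The function name `pearson` and the input lists are hypothetical, and the numbers are placeholders, not real benchmark scores.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Placeholder inputs only: these are NOT real benchmark scores.
# In practice, pass the scores of the models shared by both benchmarks,
# listed in the same model order.
bench_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
bench_b = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]
r = pearson(bench_a, bench_b)
print(f"r = {r:.3f}")
```

With only six paired points, a single outlier can move the coefficient substantially, so a high value on such a sample is better read as a hint than as a prediction.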

How often is OpenCompass · LiveCodeBenchV6 data refreshed?

BenchGecko pulls updates daily. New model scores on OpenCompass · LiveCodeBenchV6 appear as soon as they are published by OpenCompass or the model provider.