OpenCompass · LiveCodeBenchV6
The Frontier
Best score over time · one chart, every benchmark
Distribution
Where models cluster
Correlated benchmarks
Pearson r · original research
Benchmarks that track with OpenCompass · LiveCodeBenchV6
Pearson correlation across models scored on both benchmarks. Closer to 1 = strongly predictive.
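As a rough illustration of how a figure like this can be computed, the sketch below restricts two score tables to the models present in both and then applies the standard Pearson formula. The model names and scores are invented for the example, and the helper names `shared_scores` and `pearson_r` are hypothetical, not part of any BenchGecko tooling.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)

def shared_scores(bench_a, bench_b):
    """Keep only models scored on both benchmarks, in matching order."""
    shared = sorted(set(bench_a) & set(bench_b))
    return [bench_a[m] for m in shared], [bench_b[m] for m in shared]

# Hypothetical scores, invented for illustration only.
bench_a = {"model-a": 80.0, "model-b": 70.0, "model-c": 60.0, "model-d": 50.0}
bench_b = {"model-a": 75.0, "model-b": 68.0, "model-c": 52.0, "model-e": 40.0}

xs, ys = shared_scores(bench_a, bench_b)
r = pearson_r(xs, ys)
print(f"r = {r:.2f} across {len(xs)} shared models")
```

Only models scored on both benchmarks enter the calculation, so the number of shared models matters: with a handful of pairs, a single outlier can move r substantially.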
Full rankings
32 models tested · sorted by score
| # | Model | Score |
|---|---|---|
| 1 | GLM 5 | 86.2 |
| 2 | Step 3.5 Flash | 83.9 |
| 3 | GLM 4.7 | 83.8 |
| 4 | Qwen3.5 397B A17B | 83.0 |
| 5 | DeepSeek V3.2 Speciale | 80.9 |
| 6 | | 80.6 |
| 7 | | 78.4 |
| 8 | | 78.2 |
| 9 | | 77.1 |
| 10 | | 75.4 |
| 11 | | 74.0 |
| 12 | | 73.6 |
| 13 | | 71.8 |
| 14 | | 71.3 |
| 15 | | 70.6 |
| 16 | | 68.4 |
| 17 | | 66.3 |
| 18 | | 65.0 |
| 19 | | 61.0 |
| 20 | | 59.4 |
| 21 | | 57.6 |
| 22 | | 54.8 |
| 23 | | 51.6 |
| 24 | | 50.1 |
| 25 | | 48.2 |
| 26 | | 47.5 |
| 27 | | 47.0 |
| 28 | | 43.9 |
| 29 | | 43.0 |
| 30 | | 38.7 |
| 31 | | 33.5 |
| 32 | | 30.8 |
Frequently asked
Pulled from the OpenCompass · LiveCodeBenchV6 dataset · updated daily
What does OpenCompass · LiveCodeBenchV6 measure?
OpenCompass · LiveCodeBenchV6 is a knowledge benchmark in the BenchGecko catalog. 32 AI models have been tested on it. Scores range from 30.8 to 86.2 out of 100.
Which model leads on OpenCompass · LiveCodeBenchV6?
GLM 5 from z-ai leads OpenCompass · LiveCodeBenchV6 with a score of 86.2. The median score across 32 tested models is 67.3.
Is OpenCompass · LiveCodeBenchV6 saturated?
No · the top score is 86.2 out of 100 (86%). There is still meaningful room for improvement on OpenCompass · LiveCodeBenchV6.
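The summary figures quoted in these answers can be rechecked from the rankings table. The sketch below copies the 32 scores from that table and recomputes the range, the median, and the remaining headroom; the variable names are illustrative only. With an even number of entries the median is the midpoint of the 16th and 17th scores (68.4 and 66.3), about 67.35, which agrees with the reported 67.3 to within rounding.

```python
from statistics import median

# Scores copied from the rankings table, highest first.
scores = [
    86.2, 83.9, 83.8, 83.0, 80.9, 80.6, 78.4, 78.2,
    77.1, 75.4, 74.0, 73.6, 71.8, 71.3, 70.6, 68.4,
    66.3, 65.0, 61.0, 59.4, 57.6, 54.8, 51.6, 50.1,
    48.2, 47.5, 47.0, 43.9, 43.0, 38.7, 33.5, 30.8,
]

top, low = max(scores), min(scores)
mid = median(scores)      # mean of the two middle values for 32 entries
headroom = 100 - top      # points left before a perfect score

print(f"models: {len(scores)}")
print(f"range: {low} to {top}")
print(f"median: {mid:.2f}")
print(f"headroom: {headroom:.1f} points")
```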
Does OpenCompass · LiveCodeBenchV6 predict performance on other benchmarks?
Yes · OpenCompass · LiveCodeBenchV6 scores have a Pearson correlation of 0.96 with Fiction.LiveBench across the 6 models scored on both. Within that small shared set, models that do well on OpenCompass · LiveCodeBenchV6 tend to do well on Fiction.LiveBench.
How often is OpenCompass · LiveCodeBenchV6 data refreshed?
BenchGecko pulls updates daily. New model scores on OpenCompass · LiveCodeBenchV6 appear as soon as they are published by Epoch AI or the model provider.
Top on OpenCompass · LiveCodeBenchV6
GLM 5 · 86.2
Step 3.5 Flash · 83.9
GLM 4.7 · 83.8
Qwen3.5 397B A17B · 83.0
DeepSeek V3.2 Speciale · 80.9
More knowledge benchmarks
Same category · related evaluations