LiveBench · Data Analysis
The Frontier · best score over time (chart)
Distribution · where models cluster (chart)
Correlated benchmarks
Pearson r · original research
Benchmarks that track with LiveBench · Data Analysis
Pearson correlation computed across the models scored on both benchmarks. The closer the value is to 1, the more strongly a model's score on one benchmark predicts its score on the other.
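As a rough illustration of how a correlation like this could be computed, the sketch below restricts two score tables to the models they share and then applies the standard Pearson formula. The model names and scores are made-up placeholders, not BenchGecko data, and `pearson_r` is a hypothetical helper rather than the site's actual pipeline.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores, keyed by model name (illustrative values only).
bench_a = {"model-a": 70.0, "model-b": 55.0, "model-c": 40.0, "model-d": 62.0}
bench_b = {"model-a": 68.0, "model-b": 50.0, "model-c": 45.0, "model-e": 30.0}

# Only models scored on both benchmarks enter the calculation.
shared = sorted(bench_a.keys() & bench_b.keys())
r = pearson_r([bench_a[m] for m in shared], [bench_b[m] for m in shared])
print(f"{len(shared)} shared models, r = {r:.2f}")
```

Restricting to shared models first matters: a model scored on only one benchmark has no pair to contribute, so the count of shared models is the effective sample size behind the coefficient.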
Full rankings
29 models tested · sorted by score
| # | Model | Score |
|---|---|---|
| 1 | GPT-5.2-Codex | 78.2 |
| 2 | Qwen3.6 Plus | 69.9 |
| 3 | GLM 5 | 67.9 |
| 4 | GLM 5.1 | 63.2 |
| 5 | GPT-5.1-Codex | 60.8 |
| 6 | | 58.8 |
| 7 | | 56.3 |
| 8 | | 55.2 |
| 9 | | 54.9 |
| 10 | | 54.1 |
| 11 | | 53.6 |
| 12 | | 52.3 |
| 13 | | 52.2 |
| 14 | | 52.0 |
| 15 | | 49.8 |
| 16 | | 49.7 |
| 17 | | 49.6 |
| 18 | | 49.6 |
| 19 | | 49.2 |
| 20 | | 47.4 |
| 21 | | 46.4 |
| 22 | | 45.0 |
| 23 | | 44.7 |
| 24 | | 44.3 |
| 25 | | 44.3 |
| 26 | | 39.1 |
| 27 | | 39.1 |
| 28 | | 38.8 |
| 29 | | 21.2 |
Frequently asked
Pulled from the LiveBench · Data Analysis dataset · updated daily
What does LiveBench · Data Analysis measure?
LiveBench · Data Analysis is a knowledge benchmark in the BenchGecko catalog. 29 AI models have been tested on it. Scores range from 21.2 to 78.2 out of 100.
Which model leads on LiveBench · Data Analysis?
GPT-5.2-Codex from OpenAI leads LiveBench · Data Analysis with a score of 78.2. The median score across 29 tested models is 49.8.
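The summary figures quoted in these answers can be checked against the Full rankings table. The snippet below is a minimal sketch that copies the 29 scores from that table and recomputes the top score, the lowest score, and the median with Python's standard library; the variable names are illustrative.

```python
from statistics import median

# The 29 scores from the Full rankings table, in rank order.
scores = [
    78.2, 69.9, 67.9, 63.2, 60.8, 58.8, 56.3, 55.2, 54.9, 54.1,
    53.6, 52.3, 52.2, 52.0, 49.8, 49.7, 49.6, 49.6, 49.2, 47.4,
    46.4, 45.0, 44.7, 44.3, 44.3, 39.1, 39.1, 38.8, 21.2,
]

top_score = max(scores)
low_score = min(scores)
# With an odd number of entries, the median is the middle (15th) score.
median_score = median(scores)

print(f"models: {len(scores)}")
print(f"range: {low_score} to {top_score}")
print(f"median: {median_score}")
```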
Is LiveBench · Data Analysis saturated?
No · the top score is 78.2 out of 100 (78%). There is still meaningful room for improvement on LiveBench · Data Analysis.
Does LiveBench · Data Analysis predict performance on other benchmarks?
Yes · LiveBench · Data Analysis scores have a Pearson correlation of 0.84 with LiveBench · Overall scores across 29 shared models. Models that do well on LiveBench · Data Analysis tend to do well on LiveBench · Overall.
How often is LiveBench · Data Analysis data refreshed?
BenchGecko pulls updates daily. New model scores on LiveBench · Data Analysis appear as soon as they are published by Epoch AI or the model provider.
Top on LiveBench · Data Analysis
GPT-5.2-Codex · 78.2
Qwen3.6 Plus · 69.9
GLM 5 · 67.9
GLM 5.1 · 63.2
GPT-5.1-Codex · 60.8
More knowledge benchmarks
Same category · related evaluations