LiveBench · Overall

Updated 2026-04-07
Models tested: 29
Top score: 74.3 (GPT-5.2-Codex)
Median: 55.2 (min 32.4)
Top-5 spread: σ 1.8 (settled)
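
As a rough illustration of how headline figures like these can be derived from a list of scores, here is a minimal sketch. The function name `summarize`, the use of the population standard deviation for the top-5 spread, and the sample scores are all assumptions; the page does not state how BenchGecko computes σ or what threshold earns the "settled" label.

```python
from statistics import median, pstdev


def summarize(scores, top_n=5):
    """Return count, top, median, min, and the spread of the top_n scores."""
    ranked = sorted(scores, reverse=True)
    return {
        "models": len(ranked),
        "top": ranked[0],
        "median": median(ranked),
        "min": ranked[-1],
        # Assumption: population standard deviation. The page does not say
        # whether its sigma is the population or the sample form.
        "top_spread": pstdev(ranked[:top_n]),
    }


# Made-up scores for illustration only, not the LiveBench data.
example = [70.0, 68.0, 66.0, 64.0, 62.0, 50.0, 40.0]
stats = summarize(example)
print(stats)
```

A small spread among the top five means the leaders are close together, which is presumably what the "settled" label refers to.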

Best score over time · one chart, every benchmark

[Chart: LiveBench · Overall · 29 models · frontier running max. Y-axis: score (0 to 100, higher is better). X-axis: release date, Jul 25 to Apr 26. Source: benchgecko.ai/benchmark/livebench-overall]
Frontier on LiveBench · Overall rose from 48.8 to 74.3 in 6 months · +25.5 points · latest leader GPT-5.2-Codex from OpenAI.
Pink dots = frontier records · 7 total
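
A "frontier running max" can be read as the best score seen so far when models are ordered by release date, with a frontier record being any model that beats every earlier release. The sketch below shows one way to compute that. The function name `frontier_records`, the tuple layout, and the sample models and scores are hypothetical, and treating ties as non-records is an assumption.

```python
from datetime import date


def frontier_records(results):
    """Return the (release_date, model, score) entries that set a new best.

    results: iterable of (release_date, model_name, score) tuples.
    """
    records = []
    best = float("-inf")
    for released, model, score in sorted(results, key=lambda r: r[0]):
        if score > best:  # strictly better than every earlier release
            best = score
            records.append((released, model, score))
    return records


# Hypothetical models and scores, for illustration only.
sample = [
    (date(2025, 7, 1), "model-a", 50.0),
    (date(2025, 9, 1), "model-b", 45.0),
    (date(2025, 11, 1), "model-c", 60.0),
    (date(2026, 2, 1), "model-d", 60.0),
    (date(2026, 4, 1), "model-e", 72.5),
]
records = frontier_records(sample)
# Frontier gain: latest record minus the first record.
gain = records[-1][2] - records[0][2]
print([name for _, name, _ in records], gain)
```

The gain is computed the same way as the +25.5 points quoted above: the latest frontier score minus the earliest one.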

Where models cluster

Score distribution · models per score bucket (0 to 100) · median 55.2
30–40: 3
40–50: 8
50–60: 6
60–70: 8
70–80: 4
All other buckets: 0
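
For readers who want to reproduce a distribution like this from raw scores, here is a minimal bucketing sketch. It assumes half-open buckets of width 10 (a score of exactly 40 falls in 40–50) with a score of 100 folded into the last bucket; the page does not state its boundary convention, and the function name `bucket_counts` and the sample scores are illustrative only.

```python
from collections import Counter


def bucket_counts(scores, width=10, top=100):
    """Count scores per [lo, lo + width) bucket, keyed by lower bound.

    A score equal to `top` is placed in the last bucket.
    """
    counts = Counter()
    for s in scores:
        lo = min(int(s // width) * width, top - width)
        counts[lo] += 1
    # Include empty buckets so the result always covers the full range.
    return {lo: counts.get(lo, 0) for lo in range(0, top, width)}


# Made-up scores for illustration only.
hist = bucket_counts([31.0, 40.0, 47.5, 55.0, 58.0, 71.0, 100.0])
print(hist)
```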

Pearson r · original research

29 models tested · sorted by score

Pulled from the LiveBench · Overall dataset · updated daily

What does LiveBench · Overall measure?

LiveBench · Overall is a knowledge benchmark in the BenchGecko catalog. 29 AI models have been tested on it. Scores range from 32.4 to 74.3 out of 100.

Which model leads on LiveBench · Overall?

GPT-5.2-Codex from OpenAI leads LiveBench · Overall with a score of 74.3. The median score across 29 tested models is 55.2.

Is LiveBench · Overall saturated?

No · the top score is 74.3 out of 100 (74%). There is still meaningful room for improvement on LiveBench · Overall.

Does LiveBench · Overall predict performance on other benchmarks?

Yes · LiveBench · Overall scores have a Pearson correlation of r = 0.92 with LiveBench · Reasoning across 29 shared models, so models that do well on LiveBench · Overall tend to do well on LiveBench · Reasoning.
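
A correlation of this kind can be computed from paired scores of the models that appear on both benchmarks. The sketch below implements the standard Pearson r formula, on the assumption that this is the coefficient being quoted (the page's "Pearson r" section heading suggests so). The function name `pearson_r` and the paired scores are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up paired scores for four hypothetical shared models.
overall = [40.0, 50.0, 60.0, 70.0]
reasoning = [35.0, 50.0, 58.0, 75.0]
r = pearson_r(overall, reasoning)
print(round(r, 3))
```

Values near 1 mean the two benchmarks rank models similarly; a high r does not by itself show that one score causes the other.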

How often is LiveBench · Overall data refreshed?

BenchGecko pulls updates daily. New model scores on LiveBench · Overall appear at the next daily pull after they are published by Epoch AI or the model provider.

Same category · related evaluations