LiveBench · Coding

Updated 2026-04-07
Models tested: 29
Top score: 83.6 (GPT-5.2-Codex)
Median: 69.9 (min 54.1)
Top-5 spread: σ 3.1 (competitive)

Best score over time · one chart, every benchmark

[Chart: LiveBench · Coding, frontier running max across 29 models. Y-axis: score, 0 to 100, higher is better. X-axis: release date, Jul 25 to Apr 26. Source: benchgecko.ai/benchmark/livebench-coding]
The frontier on LiveBench · Coding rose from 69.6 to 83.6 in 6 months, a gain of 14.0 points. The latest leader is GPT-5.2-Codex from OpenAI.
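A "frontier running max" can be read as the best score seen so far when models are ordered by release date. The sketch below shows one way to compute the record-setting entries. The function name, the placeholder model names, the intermediate scores, and all dates are hypothetical; only the 69.6 starting score, the 83.6 final score, and the GPT-5.2-Codex name come from this page.

```python
from datetime import date


def frontier_records(results):
    """Return the (release_date, model, score) entries that set a new best score.

    `results` is an iterable of (release_date, model, score) tuples.
    """
    records = []
    best = float("-inf")
    # Walk models in release order; a record is any score strictly above the best so far.
    for release_date, model, score in sorted(results, key=lambda r: r[0]):
        if score > best:
            best = score
            records.append((release_date, model, score))
    return records


# Hypothetical sample: dates, placeholder names, and middle scores are invented.
results = [
    (date(2025, 10, 1), "model-a", 69.6),
    (date(2025, 11, 1), "model-b", 66.0),
    (date(2025, 12, 1), "model-c", 74.2),
    (date(2026, 1, 15), "model-d", 73.0),
    (date(2026, 2, 10), "model-e", 79.5),
    (date(2026, 4, 1), "GPT-5.2-Codex", 83.6),
]

records = frontier_records(results)
gain = round(records[-1][2] - records[0][2], 1)
print(len(records), "records, gain of", gain, "points")
```

Scores that fall below the current best (66.0 and 73.0 in the sample) do not move the frontier, which is why a chart of this kind only steps upward.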
Pink dots mark frontier records · 4 total

Where models cluster

Score distribution (models per score bucket, 0 to 100 scale)
50–60: 2
60–70: 13
70–80: 12
80–90: 2
All other buckets: 0
Median: 69.9
Source: benchgecko.ai
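The distribution groups scores into ten-point buckets. A minimal sketch of that grouping follows, assuming half-open buckets (a score of exactly 70.0 falls in 70-80) with a perfect score folded into the last bucket. The page does not state how boundaries are handled, so that rule and the function names are assumptions.

```python
from collections import Counter


def bucket_label(score, width=10, max_score=100):
    """Return the label of the bucket that contains `score`."""
    # Assumption: buckets are [lo, lo + width); the maximum score joins the last bucket.
    lo = min(int(score // width) * width, max_score - width)
    return f"{lo}-{lo + width}"


def score_histogram(scores, width=10):
    """Count how many scores fall in each bucket."""
    return Counter(bucket_label(s, width) for s in scores)


# The minimum, median, and top score reported on this page.
for s in (54.1, 69.9, 83.6):
    print(s, "->", bucket_label(s))
```

Under these assumptions the reported median of 69.9 lands in the 60-70 bucket, the most populated one.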

Pearson r · original research

29 models tested · sorted by score

Pulled from the LiveBench · Coding dataset · updated daily

What does LiveBench · Coding measure?

LiveBench · Coding is a knowledge benchmark in the BenchGecko catalog. 29 AI models have been tested on it. Scores range from 54.1 to 83.6 out of 100.

Which model leads on LiveBench · Coding?

GPT-5.2-Codex from OpenAI leads LiveBench · Coding with a score of 83.6. The median score across 29 tested models is 69.9.

Is LiveBench · Coding saturated?

No · the top score is 83.6 out of 100 (84%). There is still meaningful room for improvement on LiveBench · Coding.
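The 84% figure is the top score divided by the maximum possible score, rounded to a whole percent. A short sketch of that arithmetic, with a hypothetical helper name:

```python
def headroom(top_score, max_score=100.0):
    """Return (fraction_of_max, points_remaining) for a benchmark's top score."""
    fraction = top_score / max_score
    return fraction, max_score - top_score


fraction, remaining = headroom(83.6)
print(f"{fraction:.0%} of maximum, {remaining:.1f} points of headroom")
```

The page does not say what fraction would count as saturated, so no threshold is encoded here.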

Does LiveBench · Coding predict performance on other benchmarks?

Yes · LiveBench · Coding scores have a Pearson correlation of r = 0.74 with Chatbot Arena Elo · Overall across the 15 models tested on both. Models that do well on LiveBench · Coding tend to do well on Chatbot Arena Elo · Overall.
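For readers who want to check a correlation like this themselves, here is a minimal sketch of Pearson's r over paired scores for the models two benchmarks share. The sample values are invented for illustration and are not BenchGecko data; the function name is an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired scores for five shared models (not real data).
coding_scores = [62.0, 68.5, 71.0, 77.3, 83.6]
arena_elo = [1310, 1365, 1340, 1420, 1455]
print(round(pearson_r(coding_scores, arena_elo), 2))
```

With only 15 shared models, a value such as 0.74 carries a wide uncertainty band, so it is better read as a tendency than as a precise predictor. Python 3.10 and later also ship `statistics.correlation`, which computes the same quantity.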

How often is LiveBench · Coding data refreshed?

BenchGecko pulls updates daily. New model scores on LiveBench · Coding appear as soon as they are published by Epoch AI or the model provider.

Same category · related evaluations