LiveBench · Language

Updated 2026-04-07
Models tested: 29
Top score: 77.5 (GLM 5)
Median: 65.6 (min 28.7)
Top-5 spread: σ 1.9 (settled)

Best score over time · one chart, every benchmark

[Chart: LiveBench · Language · 29 models · frontier running max. Y-axis: score, 0 to 100. X-axis: release date, Jul 25 to Apr 26. Source: benchgecko.ai/benchmark/livebench-language]
Frontier on LiveBench · Language rose from 66.1 to 77.5 in 7 months · +11.4 points · latest leader GLM 5 from z-ai.
Pink dots = frontier records · 4 total
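
A frontier "running max" like the one charted here can be computed by walking models in release order and keeping every score that beats the best seen so far. The sketch below is only an illustration: the function name, model names, release dates, and intermediate scores are placeholders, not BenchGecko data. Only the first and last frontier scores (66.1 and 77.5) come from this page.

```python
from datetime import date

def frontier_records(results):
    """Return the (release_date, model, score) entries that set a new running max."""
    records = []
    best = float("-inf")
    # Walk models in release order; a record is a score strictly above the best so far.
    for released, model, score in sorted(results, key=lambda r: r[0]):
        if score > best:
            best = score
            records.append((released, model, score))
    return records

# Placeholder data (hypothetical names, dates, and intermediate scores).
results = [
    (date(2025, 9, 1), "model-a", 66.1),
    (date(2025, 10, 15), "model-b", 61.0),
    (date(2025, 11, 20), "model-c", 70.2),
    (date(2026, 1, 10), "model-d", 69.0),
    (date(2026, 2, 5), "model-e", 74.8),
    (date(2026, 4, 1), "model-f", 77.5),
]

records = frontier_records(results)
for released, model, score in records:
    print(released.isoformat(), model, score)
```

With this placeholder input, four entries set a new record, and the models that scored below the running best are skipped.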

Where models cluster

Score distribution · models per score bucket (0 to 100) · median 65.6 · benchgecko.ai

Bucket    Models
0–10      0
10–20     0
20–30     1
30–40     1
40–50     5
50–60     3
60–70     13
70–80     6
80–90     0
90–100    0
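
Bucketing scores into ten-point bins, as in this distribution, might look like the sketch below. The function name and the edge rule (each bucket includes its lower bound, and a score of exactly 100 falls in the last bucket) are assumptions; the page does not say how edges are handled. The three sample scores are the min, median, and top score quoted on this page.

```python
def bucket_counts(scores, width=10, top=100):
    """Count scores per bucket of the given width over the range 0..top."""
    n_buckets = top // width
    counts = [0] * n_buckets
    for s in scores:
        # Assumed edge rule: lower bound inclusive; clamp a score of `top` into the last bucket.
        idx = min(int(s // width), n_buckets - 1)
        counts[idx] += 1
    return counts

# Min, median, and top score from this page.
sample = [28.7, 65.6, 77.5]
counts = bucket_counts(sample)
for i, c in enumerate(counts):
    print(f"{i * 10}-{(i + 1) * 10}: {c}")
```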

Pearson r · original research

29 models tested · sorted by score

Pulled from the LiveBench · Language dataset · updated daily

What does LiveBench · Language measure?

LiveBench · Language is a knowledge benchmark in the BenchGecko catalog. 29 AI models have been tested on it. Scores range from 28.7 to 77.5 out of 100.

Which model leads on LiveBench · Language?

GLM 5 from z-ai leads LiveBench · Language with a score of 77.5. The median score across 29 tested models is 65.6.

Is LiveBench · Language saturated?

No · the top score is 77.5 out of 100 (78%). There is still meaningful room for improvement on LiveBench · Language.

Does LiveBench · Language predict performance on other benchmarks?

Yes · LiveBench · Language scores have a Pearson correlation of r = 0.88 with LiveBench · Mathematics scores across 29 shared models. Models that do well on LiveBench · Language tend to do well on LiveBench · Mathematics.
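
For readers who want to reproduce a correlation figure like this one, here is a minimal sketch of Pearson's r over paired scores. The function name and the example score lists are hypothetical placeholders, not the 29 shared models, and the sketch assumes each position in the two lists refers to the same model.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of paired scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and of squared deviations around each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)

# Placeholder paired scores (hypothetical), one entry per shared model.
language = [50.0, 60.0, 70.0, 80.0]
mathematics = [40.0, 55.0, 65.0, 85.0]
r = pearson_r(language, mathematics)
print(round(r, 2))
```

A value near 1 means the two benchmarks rank models similarly. It does not by itself show that one benchmark causes or fully predicts the other.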

How often is LiveBench · Language data refreshed?

BenchGecko pulls updates daily. New model scores on LiveBench · Language appear after the next daily pull once Epoch AI or the model provider publishes them.
