Benchmark · Knowledge · Settled

LiveBench · Agentic Coding

Updated: 2026-04-07
Models tested: 29
Top score: 56.7 (GPT-5.1-Codex-Max)
Median: 36.7 (min 3.3)
Top-5 spread: σ 1.1 (Settled)

Best score over time · one chart, every benchmark

[Chart: LiveBench · Agentic Coding · 29 models · frontier running max. Y-axis: score, 0 to 100, higher is better. X-axis: release date, Jul 25 to Apr 26. Source: benchgecko.ai/benchmark/livebench-agentic-coding]
Frontier on LiveBench · Agentic Coding rose from 13.3 to 56.7 in 5 months · +43.3 points · latest leader GPT-5.1-Codex-Max from OpenAI.
Pink dots = frontier records · 7 total
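
The "frontier running max" in the chart can be read as the best score seen so far, ordered by release date, with a frontier record being any model that strictly beats the previous best. The sketch below shows one way to compute that under those assumptions. The function name `frontier_records`, the tuple layout, and every entry in `sample` are hypothetical placeholders for illustration, not actual LiveBench results or the site's own method.

```python
from datetime import date


def frontier_records(results):
    """Return the entries that set a new best score, in release order.

    `results` is an iterable of (release_date, model, score) tuples.
    A record is counted only when the score strictly exceeds the
    running maximum (ties do not count); this is an assumption.
    """
    records = []
    best = float("-inf")
    # Sort by release date; for same-day releases, consider the higher score first.
    for released, model, score in sorted(results, key=lambda r: (r[0], -r[2])):
        if score > best:
            best = score
            records.append((released, model, score))
    return records


# Hypothetical placeholder data, NOT real benchmark scores.
sample = [
    (date(2025, 7, 1), "model-a", 10.0),
    (date(2025, 8, 1), "model-b", 8.0),
    (date(2025, 9, 1), "model-c", 25.0),
    (date(2025, 9, 1), "model-d", 20.0),
    (date(2025, 11, 1), "model-e", 25.0),
    (date(2025, 12, 1), "model-f", 40.0),
]

records = frontier_records(sample)
# Frontier gain: latest record minus first record.
gain = records[-1][2] - records[0][2]
```

On real data, `len(records)` would correspond to the frontier-record count shown under the chart, and `gain` to the point increase quoted in the summary line.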
