Benchmark · Knowledge · Settled

Fiction.LiveBench

Fiction.LiveBench is a continuously updated benchmark that uses recently published fiction to test reading comprehension and reasoning. Drawing on newly published texts is intended to prevent data contamination.

Updated 2026-01-27
Models tested: 41
Top score: 97.2 · GPT-5
Median: 61.1 · min 25.0
Top-5 spread: σ 2.1 · Competitive

Best score over time · one chart, every benchmark

[Chart: Fiction.LiveBench · 37 models · frontier running max. Y-axis: score (0 to 100, higher is better). X-axis: release date (Dec 24 to Jan 26). Source: benchgecko.ai/benchmark/fiction-livebench]
Frontier on Fiction.LiveBench rose from 33.3 to 97.2 in 6 months · +63.9 points · latest leader o3 Pro from OpenAI.
Pink dots = frontier records · 4 total

Same category · related evaluations