Benchmark · Knowledge · Settled

PostTrainBench

Updated 2026-03-05
Models tested: 15
Top score: 23.2 (Claude Opus 4.6)
Median: 16.4 (min 7.3)
Top-5 spread: σ 1.7 (Settled)

Best score over time · one chart, every benchmark

[Chart: PostTrainBench · 14 models · frontier running max. Y axis: score, 0 to 100. X axis: release date, Sep 25 to Mar 26. Source: benchgecko.ai/benchmark/posttrainbench]
Frontier on PostTrainBench rose from 7.4 to 23.2 in 4 months · +15.7 points · latest leader Claude Opus 4.6 from Anthropic.
Pink dots = frontier records · 5 total
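
The "frontier running max" in the chart can be read as the best score seen so far when models are ordered by release date, with a frontier record being any model that strictly beats every earlier one. A minimal sketch of that reading follows. The function name `frontier_records` and the sample scores are hypothetical and for illustration only; they are not the benchmark's data, and the site may compute its frontier differently (for example in how it treats ties).

```python
from datetime import date

def frontier_records(results):
    """Return the (release_date, score) entries that set a new running max.

    `results` is an iterable of (release_date, score) pairs.
    A tie with the current best does not count as a new record (assumption).
    """
    records = []
    best = float("-inf")
    for released, score in sorted(results, key=lambda r: r[0]):
        if score > best:
            best = score
            records.append((released, score))
    return records

# Hypothetical scores for illustration only; not PostTrainBench results.
sample = [
    (date(2025, 9, 1), 10.0),
    (date(2025, 10, 1), 8.0),   # below the running max, not a record
    (date(2025, 11, 1), 14.0),
    (date(2025, 12, 1), 14.0),  # ties the running max, not a record
    (date(2026, 1, 1), 20.0),
]

records = frontier_records(sample)
gain = records[-1][1] - records[0][1]  # frontier gain from first to latest record
```

With this definition, the number of frontier records is `len(records)` and the headline gain is the difference between the latest and the first record.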

Same category · related evaluations