Benchmark · Knowledge · Settled

LAMBADA

LAMBADA measures a model's ability to predict the final word of a passage, a task that requires broad contextual understanding across long text spans.
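The task described above can be sketched in a few lines. This is a minimal illustration, not the scoring code used by this page: it assumes the reported score is exact-match accuracy on the final word, expressed as a percentage, and the names `split_passage`, `last_word_accuracy`, and `predict` are hypothetical.

```python
from typing import Callable, Sequence


def split_passage(passage: str) -> tuple[str, str]:
    """Split a passage into (context, target), where target is the final word."""
    context, _, target = passage.rstrip().rpartition(" ")
    return context, target


def last_word_accuracy(
    passages: Sequence[str], predict: Callable[[str], str]
) -> float:
    """Percentage of passages whose final word is predicted exactly.

    `predict` is a hypothetical stand-in for a model call: it takes the
    context (everything except the final word) and returns one word.
    Exact match is an assumption; a real harness may normalize differently.
    """
    if not passages:
        raise ValueError("need at least one passage")
    hits = 0
    for passage in passages:
        context, target = split_passage(passage)
        if predict(context).strip().lower() == target.lower():
            hits += 1
    return 100.0 * hits / len(passages)


# Toy usage: a "model" that always answers "mat".
toy_passages = [
    "the cat sat on the mat",
    "she opened the door and saw her brother",
]
toy_score = last_word_accuracy(toy_passages, lambda context: "mat")
```

With the toy predictor, one of the two passages is matched, so `toy_score` is 50.0. The model sees only the context and is scored on a single held-out word.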

Updated 2026-05-15
Models tested: 7
Top score: 79.8 (Falcon-180B)
Median: 73.3 (min 70.0)
Top-5 spread: σ 2.9 (Competitive)

Best score over time · one chart, every benchmark

[Chart: LAMBADA frontier, running max of best score. Y-axis: score (0 to 100, higher is better). X-axis: release date (May 24 to May 26). Source: benchgecko.ai/benchmark/lambada]
Only 0 models have been tested on LAMBADA · not enough history to compute a frontier yet.
Pink dots = frontier records · 0 total

7 models tested · sorted by score

Same category · related evaluations