Benchmark · Knowledge · Settled

TriviaQA

TriviaQA · a reading comprehension benchmark of trivia questions that requires models to find and reason over evidence in the provided documents.

Updated 2024-12-26
Models tested: 20
Top score: 87.5 (Claude 2)
Median: 78.4 (min 45.2)
Top-5 spread: σ 1.8 (Settled)

Best score over time · one chart, every benchmark

[Chart: TriviaQA · 3 models · frontier running max. Y-axis: score (0 to 100, higher is better). X-axis: release date (Jul 24 to Dec 24). Source: benchgecko.ai/benchmark/triviaqa]
Only 3 models have been tested on TriviaQA · not enough history to compute a frontier yet.
Pink dots = frontier records · 0 total

Same category · related evaluations