Benchmark · Knowledge · Settled

OpenBookQA

OpenBookQA · science questions that require combining a given core fact with broad common knowledge, mimicking an open-book exam setting.

Updated 2024-07-16
Models tested: 19
Top score: 84.0 (phi-3-mini 3.8B)
Median: 52.3 (min 14.4)
Top-5 spread: σ 1.3 (Settled)

Best score over time · one chart, every benchmark

[Chart: OpenBookQA, 1 model, frontier running max. Y axis: score (0 to 100). X axis: release date. Source: benchgecko.ai/benchmark/openbookqa]
Only 1 model has been tested on OpenBookQA · not enough history to compute a frontier yet.
Pink dots = frontier records · 0 total

Same category · related evaluations