Benchmark · Knowledge · Settled

DeepResearch Bench

DeepResearch Bench evaluates AI systems on complex, multi-step research tasks that require gathering information, synthesizing it, and producing comprehensive analyses.

Updated: 2025-09-29
Models tested: 13
Top score: 55.1 (GPT-5)
Median: 47.8 (min 29.2)
Top-5 spread: σ 2.3 (Competitive)

Best score over time · one chart, every benchmark

[Chart: DeepResearch Bench · 13 models · frontier running max. Y-axis: score (0 to 100, higher is better). X-axis: release date (Jan 25 to Sep 25). Source: benchgecko.ai/benchmark/deepresearch-bench]
The frontier on DeepResearch Bench rose from 35.1 to 55.1 in 7 months (+20.0 points). The latest leader is GPT-5 from OpenAI.
Pink dots = frontier records · 6 total
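
The "frontier running max" in the chart can be read as the best score seen so far, ordered by model release date; a frontier record is any model that beats every score released before it. The sketch below shows one way to compute that under this reading. The function name `frontier_records`, the model names, and the scores in `entries` are hypothetical placeholders, not the benchmark's actual data; only the 35.1 and 55.1 endpoints come from this page.

```python
from datetime import date


def frontier_records(entries):
    """Return the entries that set a new best score, in release order.

    Each entry is a (release_date, model_name, score) tuple.
    """
    records = []
    best = float("-inf")
    for released, name, score in sorted(entries, key=lambda e: e[0]):
        if score > best:  # strictly better than everything released earlier
            best = score
            records.append((released, name, score))
    return records


# Hypothetical placeholder data, NOT the benchmark's actual results.
entries = [
    (date(2025, 1, 15), "model-a", 30.0),
    (date(2025, 3, 1), "model-b", 28.0),
    (date(2025, 4, 10), "model-c", 41.5),
    (date(2025, 6, 20), "model-d", 39.0),
    (date(2025, 8, 5), "model-e", 50.0),
]

records = frontier_records(entries)

# Frontier gain: last record minus first record, rounded to one decimal.
gain = round(records[-1][2] - records[0][2], 1)

# The same arithmetic applied to the endpoints reported on this page.
reported_gain = round(55.1 - 35.1, 1)
```

With the placeholder entries, only model-a, model-c, and model-e count as frontier records, because model-b and model-d score below the running maximum at their release dates.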

13 models tested · sorted by score
