DeepResearch Bench
DeepResearch Bench β evaluates AI systems on complex, multi-step research tasks that require gathering information, synthesizing it, and producing comprehensive analyses.
Models Tested: 12
Top Score: 52.6
Average Score: 46.7
Rankings
| # | Model | Score |
|---|---|---|
| 1 | | 52.6 |
| 2 | | 51.0 |
| 3 | | 51.0 |
| 4 | | 49.7 |
| 5 | | 49.0 |
| 6 | | 47.9 |
| 7 | | 47.8 |
| 8 | | 46.6 |
| 9 | | 43.6 |
| 10 | | 43.6 |
| 11 | | 42.8 |
| 12 | | 35.1 |
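
The summary figures can be cross-checked against the table. The following is a minimal Python sketch, assuming the average is an unweighted mean rounded to one decimal place (the page does not state how it is computed). The helper name `summarize` is hypothetical, and the scores are transcribed from the table in rank order.

```python
from statistics import mean

# Scores transcribed from the Rankings table, in rank order.
scores = [52.6, 51.0, 51.0, 49.7, 49.0, 47.9, 47.8, 46.6, 43.6, 43.6, 42.8, 35.1]


def summarize(values):
    """Return (count, top score, mean rounded to one decimal).

    Assumption: the page's average is a plain unweighted mean.
    """
    return len(values), max(values), round(mean(values), 1)


models_tested, top_score, average_score = summarize(scores)
print(models_tested, top_score, average_score)  # prints: 12 52.6 46.7
```

Under that assumption, the computed values agree with the reported count, top score, and average.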