FrontierMath-2025-02-28-Private
FrontierMath (Feb 2025): original research-level math problems created by mathematicians, testing capabilities at the boundary of current AI mathematical reasoning.
Models Tested: 60
Top Score: 50.0
Average Score: 14.1