ARC AI2

AI2 Reasoning Challenge: tests grade-school-level science knowledge with multiple-choice questions that require reasoning beyond simple retrieval.

Models Tested: 48
Top Score: 93.7
Average Score: 48.8