WinoGrande

WinoGrande is a large-scale commonsense reasoning benchmark where models must resolve ambiguous pronouns in carefully constructed sentence pairs.
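To make the task format concrete, here is a minimal sketch of how a paired item might be represented and scored. The `Item` fields, the two example sentences, and the `always_first` baseline are illustrative assumptions, not taken from the benchmark data or from this page, and the sketch assumes scores are reported as percent accuracy, which the page does not state.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Item:
    """Hypothetical binary-choice item; field names are illustrative."""
    sentence: str  # "_" marks the ambiguous reference to resolve
    option1: str
    option2: str
    answer: int    # 1 or 2, index of the correct option


def accuracy(items: List[Item], predict: Callable[[Item], int]) -> float:
    """Return percent of items where predict() picks the correct option."""
    correct = sum(1 for item in items if predict(item) == item.answer)
    return 100.0 * correct / len(items)


# An illustrative sentence pair: one changed word flips the correct answer.
ITEMS = [
    Item("The trophy did not fit in the suitcase because _ was too large.",
         "the trophy", "the suitcase", 1),
    Item("The trophy did not fit in the suitcase because _ was too small.",
         "the trophy", "the suitcase", 2),
]


def always_first(item: Item) -> int:
    """Trivial baseline that ignores the sentence entirely."""
    return 1


score = accuracy(ITEMS, always_first)
print(f"always_first accuracy: {score:.1f}")
```

Because the two sentences in a pair have opposite answers, a baseline that always picks the same option gets exactly half of each pair right, so a model has to use the sentence content to score above that.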

Models Tested: 47
Top Score: 78.4
Average Score: 45.2