ANLI
ANLI (Adversarial NLI) is an adversarially constructed natural language inference dataset in which each round targets weaknesses found in previous model generations.
Models Tested: 8
Top Score: 37.1
Average Score: 28.3
Rankings
| # | Model | Score |
|---|---|---|
| 1 | | 37.1 |
| 2 | | 36.0 |
| 3 | | 33.7 |
| 4 | | 32.8 |
| 5 | | 29.2 |
| 6 | | 23.1 |
| 7 | | 20.6 |
| 8 | | 13.8 |
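
The summary figures at the top of the page (models tested, top score, average score) can be reproduced from the ranked scores. The following is a minimal sketch, assuming the average is a plain arithmetic mean rounded to one decimal place; the variable names are illustrative and not taken from the leaderboard itself.

```python
# Scores copied from the rankings table, in rank order.
scores = [37.1, 36.0, 33.7, 32.8, 29.2, 23.1, 20.6, 13.8]

# Assumption: the page reports a simple mean, rounded to one decimal.
models_tested = len(scores)
top_score = max(scores)
average_score = round(sum(scores) / models_tested, 1)

print(f"Models Tested: {models_tested}")
print(f"Top Score: {top_score}")
print(f"Average Score: {average_score}")
```

Under that assumption, the computed values agree with the reported 8 models, top score of 37.1, and average of 28.3.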