Falcon-180B
Open Source, by TII · Published 2024-01-01
Average score: 44.4
Input price: N/A
Output price: N/A
Context window: N/A
Type: text
Evaluated on 17 benchmarks, with an average score of 44.4%. Top scores: HellaSwag (85.3%), TriviaQA (79.9%), LAMBADA (79.8%).
Benchmark Results
| Benchmark | Category | Score (%) |
|---|---|---|
| HellaSwag | knowledge | 85.3 |
| TriviaQA | knowledge | 79.9 |
| LAMBADA | knowledge | 79.8 |
| Winogrande | knowledge | 74.2 |
| PIQA | knowledge | 69.8 |
| MMLU | knowledge | 60.8 |
| ARC AI2 | knowledge | 57.1 |
| GSM8K | math | 54.4 |
| OpenBookQA | knowledge | 52.3 |
| CMMLU | knowledge | 41.5 |
| IFEval | language | 32.6 |
| BBH (HuggingFace) | general | 21.9 |
| BBH | reasoning | 16.1 |
| MMLU-PRO | knowledge | 15.4 |
| MUSR | reasoning | 7.5 |
| GPQA | knowledge | 2.8 |
| MATH Level 5 | math | 2.8 |
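
The headline figure can be checked against the table. The following is a minimal sketch, assuming the average is an unweighted arithmetic mean over all 17 listed scores (the page does not state how it is computed). The names `scores`, `mean_score`, and `top_n` are illustrative, not part of any published tooling.

```python
# Scores copied from the benchmark table above (percent).
scores = {
    "HellaSwag": 85.3,
    "TriviaQA": 79.9,
    "LAMBADA": 79.8,
    "Winogrande": 74.2,
    "PIQA": 69.8,
    "MMLU": 60.8,
    "ARC AI2": 57.1,
    "GSM8K": 54.4,
    "OpenBookQA": 52.3,
    "CMMLU": 41.5,
    "IFEval": 32.6,
    "BBH (HuggingFace)": 21.9,
    "BBH": 16.1,
    "MMLU-PRO": 15.4,
    "MUSR": 7.5,
    "GPQA": 2.8,
    "MATH Level 5": 2.8,
}


def mean_score(table):
    """Unweighted arithmetic mean (assumed aggregation method)."""
    return sum(table.values()) / len(table)


def top_n(table, n=3):
    """Return the n highest-scoring (benchmark, score) pairs."""
    return sorted(table.items(), key=lambda kv: kv[1], reverse=True)[:n]


average = mean_score(scores)
print(f"{len(scores)} benchmarks, average {average:.1f}")  # 17 benchmarks, average 44.4
for name, value in top_n(scores):
    print(f"{name}: {value}")
```

Under that assumption the mean rounds to 44.4, matching the reported average, and the three highest entries are HellaSwag, TriviaQA, and LAMBADA.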