Falcon-180B
Open Source by TII · Released 2024-01-01
Average score: 44.4
Input price: N/A
Output price: N/A
Context window: N/A
Type: text
Tested on 17 benchmarks, with an average score of 44.4%. Top scores: HellaSwag (85.3%), TriviaQA (79.9%), LAMBADA (79.8%).
Benchmark Scores
| Benchmark | Category | Score (%) |
|---|---|---|
| HellaSwag | knowledge | 85.3 |
| TriviaQA | knowledge | 79.9 |
| LAMBADA | knowledge | 79.8 |
| Winogrande | knowledge | 74.2 |
| PIQA | knowledge | 69.8 |
| MMLU | knowledge | 60.8 |
| ARC AI2 | knowledge | 57.1 |
| GSM8K | math | 54.4 |
| OpenBookQA | knowledge | 52.3 |
| CMMLU | knowledge | 41.5 |
| IFEval | language | 32.6 |
| BBH (HuggingFace) | general | 21.9 |
| BBH | reasoning | 16.1 |
| MMLU-PRO | knowledge | 15.4 |
| MUSR | reasoning | 7.5 |
| GPQA | knowledge | 2.8 |
| MATH Level 5 | math | 2.8 |
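
The summary figures quoted above (17 benchmarks, 44.4% average, top three scores) can be checked against the table with a short script. This is a minimal sketch that assumes the page's average is a plain unweighted mean of the listed scores; the page does not state how it is computed, and the `summarize` helper is a hypothetical name introduced here for illustration.

```python
# Scores copied from the benchmark table (benchmark name -> score in percent).
scores = {
    "HellaSwag": 85.3,
    "TriviaQA": 79.9,
    "LAMBADA": 79.8,
    "Winogrande": 74.2,
    "PIQA": 69.8,
    "MMLU": 60.8,
    "ARC AI2": 57.1,
    "GSM8K": 54.4,
    "OpenBookQA": 52.3,
    "CMMLU": 41.5,
    "IFEval": 32.6,
    "BBH (HuggingFace)": 21.9,
    "BBH": 16.1,
    "MMLU-PRO": 15.4,
    "MUSR": 7.5,
    "GPQA": 2.8,
    "MATH Level 5": 2.8,
}


def summarize(scores, top_n=3):
    """Return (count, unweighted mean rounded to 1 decimal, top_n entries).

    Assumption: every benchmark carries equal weight in the average.
    """
    mean = round(sum(scores.values()) / len(scores), 1)
    top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    return len(scores), mean, top


count, mean, top = summarize(scores)
print(f"{count} benchmarks, average {mean}%")
for name, score in top:
    print(f"  {name}: {score}%")
```

Under the equal-weight assumption, the script reproduces the 17-benchmark count and the 44.4% average stated in the summary. The average mixes benchmarks with very different scales and difficulty (for example HellaSwag at 85.3 versus GPQA at 2.8), so it is best read as a rough aggregate rather than a calibrated capability measure.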