LLaMA-13B
Open source, from Meta · Released 2024-01-01
Average score: 34.9
Input price: N/A
Output price: N/A
Context window: N/A
Type: text
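Tested on 20 benchmarks. The 34.9% average is consistent with the mean of the 19 percentage-scored benchmarks; the Chatbot Arena Elo — Overall result (970.9) is an Elo rating, not a percentage. Top scores: Chatbot Arena Elo — Overall (970.9 Elo), TriviaQA (77.9%), LAMBADA (75.2%).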
Benchmark scores
| Benchmark | Category | Score (% unless noted) |
|---|---|---|
| Chatbot Arena Elo — Overall | arena | 970.9 (Elo rating) |
| TriviaQA | knowledge | 77.9 |
| LAMBADA | knowledge | 75.2 |
| HellaSwag | knowledge | 72.3 |
| PIQA | knowledge | 60.2 |
| Winogrande | knowledge | 46.0 |
| OpenBookQA | knowledge | 41.9 |
| CMMLU | knowledge | 39.8 |
| C-Eval | knowledge | 38.8 |
| ARC AI2 | knowledge | 36.9 |
| MMLU | knowledge | 30.3 |
| IFEval | language | 25.3 |
| BBH (HuggingFace) | general | 25.3 |
| ScienceQA | knowledge | 24.4 |
| MMLU-PRO | knowledge | 23.1 |
| GSM8K | math | 20.6 |
| BBH | reasoning | 17.2 |
| GPQA | knowledge | 3.5 |
| MATH Level 5 | math | 3.1 |
| MUSR | reasoning | 2.0 |
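
The reported 34.9 average can be checked against the table with a few lines of Python. This is a minimal sketch, not the site's actual aggregation method: it assumes the average is a plain unweighted mean and that the Chatbot Arena Elo entry is left out because it is a rating rather than a percentage. The names `scores`, `ELO_KEY`, and `mean` are illustrative only.

```python
# Scores copied from the table above (benchmark name -> score).
scores = {
    "Chatbot Arena Elo — Overall": 970.9,
    "TriviaQA": 77.9,
    "LAMBADA": 75.2,
    "HellaSwag": 72.3,
    "PIQA": 60.2,
    "Winogrande": 46.0,
    "OpenBookQA": 41.9,
    "CMMLU": 39.8,
    "C-Eval": 38.8,
    "ARC AI2": 36.9,
    "MMLU": 30.3,
    "IFEval": 25.3,
    "BBH (HuggingFace)": 25.3,
    "ScienceQA": 24.4,
    "MMLU-PRO": 23.1,
    "GSM8K": 20.6,
    "BBH": 17.2,
    "GPQA": 3.5,
    "MATH Level 5": 3.1,
    "MUSR": 2.0,
}

# Assumption: the Elo entry is a rating, not a percentage,
# so it is excluded from the percentage average.
ELO_KEY = "Chatbot Arena Elo — Overall"


def mean(values):
    """Unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Keep only the percentage-scored benchmarks.
pct_scores = {name: s for name, s in scores.items() if name != ELO_KEY}

avg_without_elo = round(mean(pct_scores.values()), 1)
avg_with_elo = round(mean(scores.values()), 1)

print(f"benchmarks in table:        {len(scores)}")
print(f"percentage-scored:          {len(pct_scores)}")
print(f"mean excluding Elo entry:   {avg_without_elo}")
print(f"mean including Elo entry:   {avg_with_elo}")
```

Under these assumptions the mean of the 19 percentage-scored benchmarks comes out to 34.9, matching the reported figure, while including the Elo entry gives roughly 81.7. That suggests the page's average does not include the arena rating.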