Grok 3 Beta
By xAI · Released 2025-04-09
Average score: 69.5
Input price: $3.00/1M tokens
Output price: $15.00/1M tokens
Context window: 131K tokens (~66 books)
Type: text
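As a rough illustration of how the listed prices translate into a per-request cost, the sketch below assumes flat per-token billing at $3.00 per 1M input tokens and $15.00 per 1M output tokens, with no discounts or extra fees. The helper name `estimate_cost` and the token counts are hypothetical and only serve the example.

```python
# Prices taken from the listing, in USD per 1M tokens.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes flat per-token billing with no discounts or extra fees.
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 100,000-token prompt with a 2,000-token reply.
cost = estimate_cost(100_000, 2_000)
print(f"${cost:.2f}")  # $0.33
```

Because output tokens cost five times as much as input tokens at these rates, long replies can dominate the bill even when the prompt is large.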
Tested on 6 benchmarks, with an average score of 69.5%. Top scores: HELM — IFEval (88.4%), HELM — WildBench (84.9%), HELM — MMLU-Pro (78.8%).
Benchmark scores
| Benchmark | Category | Score (%) |
|---|---|---|
| HELM — IFEval | language | 88.4 |
| HELM — WildBench | reasoning | 84.9 |
| HELM — MMLU-Pro | knowledge | 78.8 |
| HELM — GPQA | knowledge | 65.0 |
| Aider polyglot | coding | 53.3 |
| HELM — Omni-MATH | math | 46.4 |
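The reported 69.5 average appears to be the unweighted mean of the six scores in the table. The short check below assumes equal weighting across benchmarks, which the page does not state explicitly. The dictionary keys are shortened benchmark names chosen for this example.

```python
# Scores copied from the benchmark table (percent).
scores = {
    "IFEval": 88.4,
    "WildBench": 84.9,
    "MMLU-Pro": 78.8,
    "GPQA": 65.0,
    "Aider polyglot": 53.3,
    "Omni-MATH": 46.4,
}

# Unweighted mean across all benchmarks (assumed weighting).
average = sum(scores.values()) / len(scores)
print(round(average, 1))  # 69.5
```

The gap between the best and worst scores is wide (88.4 vs 46.4), so the single average hides a large difference between instruction following and competition-style math.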