GPT-5 Nano
By OpenAI · Released 2025-08-07
Average score: 45.3%
Input price: $0.05/1M tokens
Output price: $0.40/1M tokens
Context window: 400K tokens (~200 books)
Type: multimodal
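As a rough illustration of how the listed per-token prices translate into the cost of a single request, the sketch below multiplies token counts by the two rates. The function name `estimate_cost_usd` and the example token counts are hypothetical, and the simple linear pricing model is an assumption; actual billing may differ.

```python
# Prices as listed on this page, in USD per 1 million tokens.
INPUT_PRICE_PER_M = 0.05
OUTPUT_PRICE_PER_M = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost, assuming simple linear per-token pricing."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
cost = estimate_cost_usd(10_000, 2_000)
print(f"${cost:.4f}")
```

Because output tokens are priced at eight times the input rate, the output side dominates the estimate for generation-heavy workloads.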
Evaluated on 26 benchmarks with an average score of 45.3%. Top scores: MATH level 5 (95.2%), HELM — IFEval (93.2%), OTIS Mock AIME 2024-2025 (81.1%).
Benchmark scores
| Benchmark | Category | Score (%) |
|---|---|---|
| MATH level 5 | math | 95.2 |
| HELM — IFEval | language | 93.2 |
| OTIS Mock AIME 2024-2025 | math | 81.1 |
| HELM — WildBench | reasoning | 80.6 |
| HELM — MMLU-Pro | knowledge | 77.8 |
| HELM — GPQA | knowledge | 67.9 |
| LiveBench — Coding | coding | 67.4 |
| LiveBench — Mathematics | math | 64.7 |
| GPQA diamond | knowledge | 59.3 |
| HELM — Omni-MATH | math | 54.7 |
| LiveBench — If | language | 52.0 |
| LiveBench — Overall | knowledge | 48.6 |
| LiveBench — Language | language | 47.7 |
| Fiction.LiveBench | knowledge | 44.4 |
| LiveBench — Data Analysis | reasoning | 44.3 |
| WeirdML | coding | 38.1 |
| LiveBench — Reasoning | reasoning | 35.5 |
| SWE-Bench Verified (Bash Only) | coding | 34.8 |
| LiveBench — Agentic Coding | coding | 28.3 |
| ARC-AGI | reasoning | 20.7 |
| SimpleQA Verified | knowledge | 12.2 |
| Terminal Bench | coding | 11.5 |
| FrontierMath-2025-02-28-Private | math | 8.3 |
| VPCT | knowledge | 5.8 |
| ARC-AGI-2 | reasoning | 2.6 |
| FrontierMath-Tier-4-2025-07-01-Private | math | 2.1 |
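The page does not say how the headline average is computed. Assuming it is an unweighted mean over all 26 rows, the sketch below reproduces it from the table and also groups scores by category. The variable names are illustrative, and the per-category means are a derived view that the page itself does not report.

```python
from collections import defaultdict
from statistics import mean

# (category, score) pairs copied from the table above, in row order.
rows = [
    ("math", 95.2), ("language", 93.2), ("math", 81.1), ("reasoning", 80.6),
    ("knowledge", 77.8), ("knowledge", 67.9), ("coding", 67.4), ("math", 64.7),
    ("knowledge", 59.3), ("math", 54.7), ("language", 52.0), ("knowledge", 48.6),
    ("language", 47.7), ("knowledge", 44.4), ("reasoning", 44.3), ("coding", 38.1),
    ("reasoning", 35.5), ("coding", 34.8), ("coding", 28.3), ("reasoning", 20.7),
    ("knowledge", 12.2), ("coding", 11.5), ("math", 8.3), ("knowledge", 5.8),
    ("reasoning", 2.6), ("math", 2.1),
]

# Unweighted mean over every benchmark (assumed to be the page's method).
overall = mean(score for _, score in rows)
print(f"{len(rows)} benchmarks, overall mean: {overall:.1f}")

# Group scores by category and report each category's mean.
by_category = defaultdict(list)
for category, score in rows:
    by_category[category].append(score)

for category, scores in sorted(by_category.items()):
    print(f"{category}: n={len(scores)}, mean={mean(scores):.1f}")
```

The overall mean rounds to 45.3, matching the figure quoted at the top of the page, which supports the unweighted-mean assumption.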
Similar models
Meta: 45.2
Anthropic: 45.4
DeepSeek: 45.1
Mistral AI: 45.8