DeepSeek V3.2
Open source · Developer: DeepSeek · Release date: 2025-12-01
Average score: 53.0
Input price: $0.25 per 1M tokens
Output price: $0.38 per 1M tokens
Context window: 131K tokens (~66 books)
Type: text
Tested on 29 benchmarks, with a 53.0% average. Top scores: Chatbot Arena Elo — Overall (1424.4), Chatbot Arena Elo — Coding (1326.9), and OpenCompass — AIME2025 (93.0%). The two Chatbot Arena figures are Elo ratings, not percentages.
Benchmark scores
| Benchmark | Category | Score |
|---|---|---|
| Chatbot Arena Elo — Overall | arena | 1424.4 |
| Chatbot Arena Elo — Coding | arena | 1326.9 |
| OpenCompass — AIME2025 | math | 93.0 |
| OpenCompass — IFEval | language | 89.7 |
| OTIS Mock AIME 2024-2025 | math | 87.8 |
| OpenCompass — MMLU-Pro | knowledge | 85.8 |
| OpenCompass — GPQA-Diamond | knowledge | 84.6 |
| GPQA diamond | knowledge | 77.9 |
| LiveBench — Coding | coding | 75.7 |
| OpenCompass — LiveCodeBenchV6 | coding | 75.4 |
| Aider polyglot | coding | 74.2 |
| LiveBench — Language | language | 64.2 |
| LiveBench — Mathematics | math | 64.0 |
| ARC-AGI | reasoning | 57.0 |
| Artificial Analysis — Agentic Index | speed | 52.9 |
| LiveBench — Overall | knowledge | 51.8 |
| LiveBench — Agentic Coding | coding | 46.7 |
| LiveBench — Data Analysis | reasoning | 45.0 |
| LiveBench — Reasoning | reasoning | 44.3 |
| Artificial Analysis — Quality Index | speed | 41.7 |
| Terminal Bench | coding | 39.6 |
| Artificial Analysis — Coding Index | speed | 36.7 |
| SimpleQA Verified | knowledge | 27.5 |
| OpenCompass — HLE | knowledge | 23.2 |
| LiveBench — If | language | 23.1 |
| FrontierMath-2025-02-28-Private | math | 22.1 |
| Chess Puzzles | knowledge | 14.0 |
| ARC-AGI-2 | reasoning | 4.0 |
| FrontierMath-Tier-4-2025-07-01-Private | math | 2.1 |
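
The page does not say how the 53.0 average is derived. As a rough cross-check, the sketch below takes a plain unweighted mean of the scores in the table above after dropping the two Chatbot Arena rows, which are Elo ratings rather than 0-100 scores. The exclusion rule, the equal weighting, and the helper names (`percent_scale`, `category_means`) are assumptions made for illustration, not the site's documented method.

```python
from statistics import mean

# (category, score) pairs copied from the table above, in table order.
ROWS = [
    ("arena", 1424.4), ("arena", 1326.9), ("math", 93.0),
    ("language", 89.7), ("math", 87.8), ("knowledge", 85.8),
    ("knowledge", 84.6), ("knowledge", 77.9), ("coding", 75.7),
    ("coding", 75.4), ("coding", 74.2), ("language", 64.2),
    ("math", 64.0), ("reasoning", 57.0), ("speed", 52.9),
    ("knowledge", 51.8), ("coding", 46.7), ("reasoning", 45.0),
    ("reasoning", 44.3), ("speed", 41.7), ("coding", 39.6),
    ("speed", 36.7), ("knowledge", 27.5), ("knowledge", 23.2),
    ("language", 23.1), ("math", 22.1), ("knowledge", 14.0),
    ("reasoning", 4.0), ("math", 2.1),
]


def percent_scale(rows):
    """Assumption: 'arena' rows are Elo ratings, so drop them before averaging."""
    return [(cat, score) for cat, score in rows if cat != "arena"]


def category_means(rows):
    """Unweighted mean score per category, rounded to one decimal place."""
    by_cat = {}
    for cat, score in rows:
        by_cat.setdefault(cat, []).append(score)
    return {cat: round(mean(scores), 1) for cat, scores in by_cat.items()}


pct_rows = percent_scale(ROWS)
overall = round(mean(score for _, score in pct_rows), 1)

print(f"{len(ROWS)} rows, {len(pct_rows)} on a 0-100 scale, mean {overall}")
for cat, avg in sorted(category_means(pct_rows).items()):
    print(f"{cat:10s} {avg}")
```

Under these assumptions the plain mean of the 27 remaining rows is 52.0, slightly below the headline 53.0, so the site presumably normalizes or weights the scores in some way this sketch does not capture.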