DeepSeek-R1 (May 2025)
by DeepSeek · Released 2024-01-01
Average score: 48.5%
Input price: N/A
Output price: N/A
Context window: N/A
Type: text
Tested on 11 benchmarks, with an average score of 48.5%. Top scores: MATH level 5 (96.6%), Fiction.LiveBench (75.0%), Aider polyglot (71.4%).
Benchmark scores
| Benchmark | Category | Score (%) |
|---|---|---|
| MATH level 5 | math | 96.6 |
| Fiction.LiveBench | knowledge | 75.0 |
| Aider polyglot | coding | 71.4 |
| GPQA diamond | knowledge | 68.4 |
| OTIS Mock AIME 2024-2025 | math | 66.4 |
| WeirdML | coding | 41.6 |
| DeepResearch Bench | knowledge | 35.1 |
| SimpleBench | reasoning | 29.0 |
| SimpleQA Verified | knowledge | 27.4 |
| ARC-AGI | reasoning | 21.2 |
| ARC-AGI-2 | reasoning | 1.1 |
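The page does not say how the 48.5% average is derived. As a quick sanity check, the sketch below assumes it is a plain unweighted mean of the 11 scores listed in the table; the `scores` dictionary and variable names are illustrative, not part of the source page.

```python
from statistics import mean

# Scores copied from the benchmark table (percent).
scores = {
    "MATH level 5": 96.6,
    "Fiction.LiveBench": 75.0,
    "Aider polyglot": 71.4,
    "GPQA diamond": 68.4,
    "OTIS Mock AIME 2024-2025": 66.4,
    "WeirdML": 41.6,
    "DeepResearch Bench": 35.1,
    "SimpleBench": 29.0,
    "SimpleQA Verified": 27.4,
    "ARC-AGI": 21.2,
    "ARC-AGI-2": 1.1,
}

# Assumption: the reported average is an unweighted arithmetic mean,
# rounded to one decimal place.
average = round(mean(scores.values()), 1)

# The three highest-scoring benchmarks, best first.
top_three = sorted(scores.items(), key=lambda item: item[1], reverse=True)[:3]

print(f"{len(scores)} benchmarks, average {average}%")
for name, score in top_three:
    print(f"{name}: {score}%")
```

Under that assumption the computed mean rounds to the reported 48.5%, and the three highest entries match the top scores named in the summary.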
Similar models
Baichuan2-13B (provider unknown): 48.4
Alibaba Qwen: 48.0
xAI: 47.8
Meta: 49.3