Claude 3.7 Sonnet (thinking)

by Anthropic · Released 2025-02-24

42.1

Average score

$3.00/1M

Input price

$15.00/1M

Output price

200K tokens (~150,000 words)

Context window

multimodal

Type
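
As a rough illustration of how the per-million-token prices listed above translate into the cost of a single request, here is a minimal Python sketch. The function name `estimate_cost` and the token counts in the usage line are hypothetical, chosen only for illustration; the two prices are the ones shown on this page.

```python
# Prices taken from this page, in US dollars per 1 million tokens.
INPUT_PRICE_PER_MILLION = 3.00
OUTPUT_PRICE_PER_MILLION = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in US dollars for one request.

    Hypothetical helper: it applies the listed per-million rates linearly
    and ignores any discounts or surcharges a provider might apply.
    """
    total = (
        input_tokens * INPUT_PRICE_PER_MILLION
        + output_tokens * OUTPUT_PRICE_PER_MILLION
    )
    return total / 1_000_000


# Illustrative request: a full 200K-token prompt and a 4,000-token reply.
example_cost = estimate_cost(200_000, 4_000)
print(f"${example_cost:.2f}")  # prints "$0.66"
```

Because the output rate is five times the input rate, long responses weigh on the bill far more than long prompts of the same length.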

Tested on 20 benchmarks, with an average score of 42.1%. Top scores: MATH level 5 (91.2%), Fiction.LiveBench (83.3%), Lech Mazur Writing (81.1%).

Benchmark scores

Benchmark	Category	Score (%)
MATH level 5	math	91.2
Fiction.LiveBench	knowledge	83.3
Lech Mazur Writing	knowledge	81.1
GPQA diamond	knowledge	73.0
GeoBench	knowledge	68.0
Aider polyglot	coding	64.9
OTIS Mock AIME 2024-2025	math	57.7
CadEval	coding	54.0
SWE-Bench Verified (Bash Only)	coding	52.8
DeepResearch Bench	knowledge	43.6
OSWorld	agentic	35.8
SimpleBench	reasoning	35.7
The Agent Company	agentic	30.9
ARC-AGI	reasoning	28.6
Cybench	coding	20.0
VPCT	knowledge	8.5
FrontierMath-2025-02-28-Private	math	4.1
GSO-Bench	coding	3.8
HLE	knowledge	3.4
ARC-AGI-2	reasoning	0.9
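
The headline figures on this page can be reproduced from the table above. The sketch below assumes the average is a plain unweighted mean of the 20 scores, which matches the reported 42.1 but is not confirmed by the page. The variable names are illustrative, and the per-category averages are an extra breakdown the page does not report.

```python
from collections import defaultdict
from statistics import mean

# (benchmark, category, score in %) copied from the table above.
RESULTS = [
    ("MATH level 5", "math", 91.2),
    ("Fiction.LiveBench", "knowledge", 83.3),
    ("Lech Mazur Writing", "knowledge", 81.1),
    ("GPQA diamond", "knowledge", 73.0),
    ("GeoBench", "knowledge", 68.0),
    ("Aider polyglot", "coding", 64.9),
    ("OTIS Mock AIME 2024-2025", "math", 57.7),
    ("CadEval", "coding", 54.0),
    ("SWE-Bench Verified (Bash Only)", "coding", 52.8),
    ("DeepResearch Bench", "knowledge", 43.6),
    ("OSWorld", "agentic", 35.8),
    ("SimpleBench", "reasoning", 35.7),
    ("The Agent Company", "agentic", 30.9),
    ("ARC-AGI", "reasoning", 28.6),
    ("Cybench", "coding", 20.0),
    ("VPCT", "knowledge", 8.5),
    ("FrontierMath-2025-02-28-Private", "math", 4.1),
    ("GSO-Bench", "coding", 3.8),
    ("HLE", "knowledge", 3.4),
    ("ARC-AGI-2", "reasoning", 0.9),
]

# Unweighted mean over all benchmarks, rounded to one decimal place.
overall_average = round(mean([score for _, _, score in RESULTS]), 1)

# Names of the three highest-scoring benchmarks.
ranked = sorted(RESULTS, key=lambda row: row[2], reverse=True)
top_three = [name for name, _, _ in ranked[:3]]

# Group scores by category and average each group.
scores_by_category = defaultdict(list)
for _, category, score in RESULTS:
    scores_by_category[category].append(score)
category_averages = {
    category: round(mean(scores), 1)
    for category, scores in scores_by_category.items()
}

print(overall_average)  # prints 42.1
print(top_three)
print(category_averages)
```

An unweighted mean treats every benchmark equally, so a handful of near-zero scores on very hard benchmarks (ARC-AGI-2, HLE, GSO-Bench) pulls the average well below the model's typical score on the others.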