Claude 3.7 Sonnet (thinking)

by Anthropic · Published 2025-02-24

42.1

Average score

$3.00/1M

Input price

$15.00/1M

Output price

200K tokens (~150,000 words)

Context window

multimodal

Type
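
As a rough illustration of how the per-token prices above translate into a per-request cost, the sketch below multiplies token counts by the listed rates. The token counts in the usage line are hypothetical, and the function name `estimate_cost` is introduced here for illustration only. The page does not say how thinking tokens are billed, so the sketch assumes they are simply included in whatever output token count you pass in.

```python
# Prices taken from the page: USD per 1M tokens.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return an estimated request cost in USD (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical request: 50,000 input tokens and 4,000 output tokens.
cost = estimate_cost(50_000, 4_000)
print(f"Estimated cost: ${cost:.2f}")  # prints "Estimated cost: $0.21"
```

Because output tokens cost five times as much as input tokens at these rates, long responses tend to dominate the bill even when the prompt is much larger.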

Tested on 20 benchmarks, with an average score of 42.1%. Top scores: MATH level 5 (91.2%), Fiction.LiveBench (83.3%), Lech Mazur Writing (81.1%).

Benchmark scores

Benchmark	Category	Score (%)
MATH level 5	math	91.2
Fiction.LiveBench	knowledge	83.3
Lech Mazur Writing	knowledge	81.1
GPQA diamond	knowledge	73.0
GeoBench	knowledge	68.0
Aider polyglot	coding	64.9
OTIS Mock AIME 2024-2025	math	57.7
CadEval	coding	54.0
SWE-Bench Verified (Bash Only)	coding	52.8
DeepResearch Bench	knowledge	43.6
OSWorld	agentic	35.8
SimpleBench	reasoning	35.7
The Agent Company	agentic	30.9
ARC-AGI	reasoning	28.6
Cybench	coding	20.0
VPCT	knowledge	8.5
FrontierMath-2025-02-28-Private	math	4.1
GSO-Bench	coding	3.8
HLE	knowledge	3.4
ARC-AGI-2	reasoning	0.9
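
The headline average can be checked directly from the table. The sketch below assumes the figure is a plain unweighted mean of the 20 scores (the page does not state the method, but the numbers agree), and also groups the scores by the category column. The names `SCORES` and `category_means` are introduced here for illustration.

```python
from collections import defaultdict

# (benchmark, category, score in %) copied from the table above.
SCORES = [
    ("MATH level 5", "math", 91.2),
    ("Fiction.LiveBench", "knowledge", 83.3),
    ("Lech Mazur Writing", "knowledge", 81.1),
    ("GPQA diamond", "knowledge", 73.0),
    ("GeoBench", "knowledge", 68.0),
    ("Aider polyglot", "coding", 64.9),
    ("OTIS Mock AIME 2024-2025", "math", 57.7),
    ("CadEval", "coding", 54.0),
    ("SWE-Bench Verified (Bash Only)", "coding", 52.8),
    ("DeepResearch Bench", "knowledge", 43.6),
    ("OSWorld", "agentic", 35.8),
    ("SimpleBench", "reasoning", 35.7),
    ("The Agent Company", "agentic", 30.9),
    ("ARC-AGI", "reasoning", 28.6),
    ("Cybench", "coding", 20.0),
    ("VPCT", "knowledge", 8.5),
    ("FrontierMath-2025-02-28-Private", "math", 4.1),
    ("GSO-Bench", "coding", 3.8),
    ("HLE", "knowledge", 3.4),
    ("ARC-AGI-2", "reasoning", 0.9),
]

# Unweighted mean across all benchmarks.
overall = sum(score for _, _, score in SCORES) / len(SCORES)
print(f"{len(SCORES)} benchmarks, overall mean {overall:.1f}")  # overall mean 42.1

# Group scores by category and average each group.
by_category = defaultdict(list)
for _, category, score in SCORES:
    by_category[category].append(score)

category_means = {
    category: sum(values) / len(values)
    for category, values in by_category.items()
}
for category, value in sorted(category_means.items()):
    print(f"{category:<10} n={len(by_category[category])}  mean={value:.1f}")
```

An unweighted mean treats a near-saturated benchmark such as MATH level 5 and a near-floor one such as ARC-AGI-2 as equally informative, so the per-category breakdown is often more useful than the single headline number.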