
Claude Sonnet 4

By Anthropic · Released 2025-05-22

Average score: 44.6
Input price: $3.00 / 1M tokens
Output price: $15.00 / 1M tokens
Context window: 1.0M tokens (~500 books)
Type: multimodal
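
The listed prices can be turned into a rough per-request cost. The sketch below is illustrative only: the function name `estimate_cost` and the example token counts are hypothetical, and it assumes flat per-token pricing, ignoring any tiering, caching, or batch discounts the provider may apply.

```python
# Prices from the listing, in US dollars per 1M tokens.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for a single request.

    Assumes flat pricing: every input token costs the same, and every
    output token costs the same (an assumption, not confirmed by the page).
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 4,000 output tokens.
cost = estimate_cost(200_000, 4_000)
print(f"${cost:.2f}")
```

Under these assumptions, output tokens cost five times as much as input tokens, so long generations dominate the bill faster than long prompts do.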

Tested on 27 benchmarks, with an average score of 44.6%. Top scores: MASK (95.3%), OpenCompass — IFEval (88.3%), MATH level 5 (84.4%).

Benchmark                                 Category    Score (%)
MASK                                      safety      95.3
OpenCompass — IFEval                      language    88.3
MATH level 5                              math        84.4
OpenCompass — MMLU-Pro                    knowledge   83.0
OpenCompass — GPQA-Diamond                knowledge   74.6
GPQA diamond                              knowledge   72.3
OTIS Mock AIME 2024-2025                  math        71.1
OpenCompass — AIME2025                    math        68.7
SWE-Bench Verified (Bash Only)            coding      64.9
Aider polyglot                            coding      61.3
DeepResearch Bench                        knowledge   47.8
OpenCompass — LiveCodeBenchV6             coding      47.5
Fiction.LiveBench                         knowledge   46.9
WeirdML                                   coding      46.1
OSWorld                                   agentic     43.9
ARC-AGI                                   reasoning   40.0
GeoBench                                  knowledge   37.0
Cybench                                   coding      35.0
SimpleBench                               reasoning   34.6
The Agent Company                         agentic     33.1
OpenCompass — HLE                         knowledge   8.7
ARC-AGI-2                                 reasoning   5.9
GSO-Bench                                 coding      4.9
FrontierMath-2025-02-28-Private           math        4.1
HLE                                       knowledge   3.1
VPCT                                      knowledge   1.0
FrontierMath-Tier-4-2025-07-01-Private    math        0.1
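
The headline average can be checked against the per-benchmark scores. The sketch below assumes the average is a plain unweighted mean over all 27 rows (the page does not state how it is computed); the variable names are illustrative.

```python
from statistics import mean

# (benchmark, score in %) pairs copied from the table.
SCORES = [
    ("MASK", 95.3),
    ("OpenCompass — IFEval", 88.3),
    ("MATH level 5", 84.4),
    ("OpenCompass — MMLU-Pro", 83.0),
    ("OpenCompass — GPQA-Diamond", 74.6),
    ("GPQA diamond", 72.3),
    ("OTIS Mock AIME 2024-2025", 71.1),
    ("OpenCompass — AIME2025", 68.7),
    ("SWE-Bench Verified (Bash Only)", 64.9),
    ("Aider polyglot", 61.3),
    ("DeepResearch Bench", 47.8),
    ("OpenCompass — LiveCodeBenchV6", 47.5),
    ("Fiction.LiveBench", 46.9),
    ("WeirdML", 46.1),
    ("OSWorld", 43.9),
    ("ARC-AGI", 40.0),
    ("GeoBench", 37.0),
    ("Cybench", 35.0),
    ("SimpleBench", 34.6),
    ("The Agent Company", 33.1),
    ("OpenCompass — HLE", 8.7),
    ("ARC-AGI-2", 5.9),
    ("GSO-Bench", 4.9),
    ("FrontierMath-2025-02-28-Private", 4.1),
    ("HLE", 3.1),
    ("VPCT", 1.0),
    ("FrontierMath-Tier-4-2025-07-01-Private", 0.1),
]

# Unweighted mean over every benchmark (assumed aggregation method).
average = mean(score for _, score in SCORES)

# Three highest-scoring benchmarks, best first.
top_three = sorted(SCORES, key=lambda row: row[1], reverse=True)[:3]

print(f"{len(SCORES)} benchmarks, average {average:.1f}%")
for name, score in top_three:
    print(f"{name}: {score}%")
```

An unweighted mean reproduces the reported 44.6% when rounded to one decimal place, which is consistent with the assumption that each benchmark counts equally regardless of category.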