
Claude Sonnet 4

Developer: Anthropic · Release date: 2025-05-22

Average score: 44.6
Input price: $3.00 / 1M tokens
Output price: $15.00 / 1M tokens
Context window: 1.0M tokens (~500 books)
Type: multimodal
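
As a rough illustration of how the listed per-token prices translate into a request cost, here is a minimal Python sketch. The function name `estimate_cost_usd` and the token counts in the example are hypothetical, and the sketch assumes flat per-token pricing with no caching, batching, or tiered rates, none of which this page describes.

```python
# Prices as listed on this page, in USD per 1M tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request under flat per-token pricing (an assumption)."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 10,000 output tokens.
cost = estimate_cost_usd(200_000, 10_000)
print(f"${cost:.2f}")
```

Output tokens cost five times as much as input tokens at these rates, so long generations tend to dominate the bill even when the prompt is large.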

Tested on 27 benchmarks, with an average score of 44.6%. Top scores: MASK (95.3%), OpenCompass — IFEval (88.3%), MATH level 5 (84.4%).

| Benchmark | Category | Score (%) |
|---|---|---|
| MASK | safety | 95.3 |
| OpenCompass — IFEval | language | 88.3 |
| MATH level 5 | math | 84.4 |
| OpenCompass — MMLU-Pro | knowledge | 83.0 |
| OpenCompass — GPQA-Diamond | knowledge | 74.6 |
| GPQA diamond | knowledge | 72.3 |
| OTIS Mock AIME 2024-2025 | math | 71.1 |
| OpenCompass — AIME2025 | math | 68.7 |
| SWE-Bench Verified (Bash Only) | coding | 64.9 |
| Aider polyglot | coding | 61.3 |
| DeepResearch Bench | knowledge | 47.8 |
| OpenCompass — LiveCodeBenchV6 | coding | 47.5 |
| Fiction.LiveBench | knowledge | 46.9 |
| WeirdML | coding | 46.1 |
| OSWorld | agentic | 43.9 |
| ARC-AGI | reasoning | 40.0 |
| GeoBench | knowledge | 37.0 |
| Cybench | coding | 35.0 |
| SimpleBench | reasoning | 34.6 |
| The Agent Company | agentic | 33.1 |
| OpenCompass — HLE | knowledge | 8.7 |
| ARC-AGI-2 | reasoning | 5.9 |
| GSO-Bench | coding | 4.9 |
| FrontierMath-2025-02-28-Private | math | 4.1 |
| HLE | knowledge | 3.1 |
| VPCT | knowledge | 1.0 |
| FrontierMath-Tier-4-2025-07-01-Private | math | 0.1 |
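
The page does not say how the 44.6 average is computed. A minimal Python sketch, assuming it is a plain unweighted mean of the 27 scores listed above, reproduces the figure:

```python
from statistics import mean

# Scores (%) copied from the benchmark list above, in the same order.
scores = [
    95.3, 88.3, 84.4, 83.0, 74.6, 72.3, 71.1, 68.7, 64.9,
    61.3, 47.8, 47.5, 46.9, 46.1, 43.9, 40.0, 37.0, 35.0,
    34.6, 33.1, 8.7, 5.9, 4.9, 4.1, 3.1, 1.0, 0.1,
]

# Assumption: every benchmark carries equal weight in the average.
average = round(mean(scores), 1)
print(len(scores), average)  # 27 44.6
```

Because the mean is unweighted, the near-zero results on benchmarks such as FrontierMath and HLE pull the headline number well below the scores on the more saturated benchmarks at the top of the list.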