
Claude Sonnet 4.5

By Anthropic · Released 2025-09-29

Average score: 42.1
Input price: $3.00/1M tokens
Output price: $15.00/1M tokens
Context window: 1.0M tokens (~500 books)
Type: multimodal
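
To make the listed prices concrete, here is a minimal sketch of a per-request cost estimate. It assumes a flat rate of $3.00 per 1M input tokens and $15.00 per 1M output tokens, as shown on this page; the function name `estimate_cost` and the token counts are hypothetical, and any provider-side tiering or discounts are not modeled.

```python
# Prices as listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, assuming flat per-token pricing.

    Hypothetical helper for illustration only; real billing may differ.
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 200,000-token prompt with a 4,000-token reply.
cost = estimate_cost(200_000, 4_000)
print(f"${cost:.2f}")  # prints "$0.66"
```

At these rates, output tokens cost five times as much as input tokens, so long replies tend to dominate the bill for short prompts.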

Tested on 21 benchmarks, with an average score of 42.1%. Top scores: MATH level 5 (97.7%), OTIS Mock AIME 2024-2025 (77.8%), GPQA diamond (76.4%).

Benchmark                                 Category    Score (%)
MATH level 5                              math        97.7
OTIS Mock AIME 2024-2025                  math        77.8
GPQA diamond                              knowledge   76.4
SWE-Bench Verified                        coding      71.3
SWE-Bench Verified (Bash Only)            coding      70.6
ARC-AGI                                   reasoning   63.7
OSWorld                                   agentic     62.9
Cybench                                   coding      60.0
DeepResearch Bench                        knowledge   52.6
WeirdML                                   coding      47.7
Terminal Bench                            coding      46.5
SimpleBench                               reasoning   45.2
SimpleQA Verified                         knowledge   23.6
FrontierMath-2025-02-28-Private           math        15.2
GSO-Bench                                 coding      14.7
ARC-AGI-2                                 reasoning   13.6
Chess Puzzles                             knowledge   12.0
PostTrainBench                            knowledge   9.9
VPCT                                      knowledge   9.7
HLE                                       knowledge   9.4
FrontierMath-Tier-4-2025-07-01-Private    math        4.2
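
The 42.1 average can be checked against the table. The sketch below assumes the page uses a plain unweighted arithmetic mean over all 21 listed benchmarks; the page does not state its method, so that is an assumption, and the helper name `mean_score` is hypothetical.

```python
# Scores copied from the benchmark table above (percent).
scores = {
    "MATH level 5": 97.7,
    "OTIS Mock AIME 2024-2025": 77.8,
    "GPQA diamond": 76.4,
    "SWE-Bench Verified": 71.3,
    "SWE-Bench Verified (Bash Only)": 70.6,
    "ARC-AGI": 63.7,
    "OSWorld": 62.9,
    "Cybench": 60.0,
    "DeepResearch Bench": 52.6,
    "WeirdML": 47.7,
    "Terminal Bench": 46.5,
    "SimpleBench": 45.2,
    "SimpleQA Verified": 23.6,
    "FrontierMath-2025-02-28-Private": 15.2,
    "GSO-Bench": 14.7,
    "ARC-AGI-2": 13.6,
    "Chess Puzzles": 12.0,
    "PostTrainBench": 9.9,
    "VPCT": 9.7,
    "HLE": 9.4,
    "FrontierMath-Tier-4-2025-07-01-Private": 4.2,
}


def mean_score(values):
    """Unweighted arithmetic mean (assumed to match the page's averaging)."""
    values = list(values)
    return sum(values) / len(values)


average = mean_score(scores.values())
print(len(scores), round(average, 1))  # prints "21 42.1"
```

Under that assumption the result matches the reported figure. Because the mean is unweighted, the large spread between saturated benchmarks (MATH level 5) and very hard ones (FrontierMath Tier 4) has a strong effect on the headline number.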