
Claude 3 Sonnet

by Anthropic · Released Mar 2024

28.3
avg score
Rank #172
Better than 26% of all models
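The "better than" figure follows from the rank: a minimal sketch of the arithmetic, assuming the percentile is the share of the 233 ranked models sitting below rank #172.

```python
# Percentile from rank: models ranked below #172 out of 233 total.
TOTAL_MODELS = 233
RANK = 172

models_below = TOTAL_MODELS - RANK                    # 61 models score lower
better_than_pct = 100 * models_below // TOTAL_MODELS  # integer percent

print(better_than_pct)  # 26
```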
Context: N/A
Input $/1M: TBD
Output $/1M: TBD
Type: text
License: Proprietary
Benchmarks: 6 tested
Data updated today
About

Tested on 6 benchmarks with 28.3% average. Top scores: MMLU (67.9%), Winogrande (50.2%), GPQA diamond (20.8%).
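The 28.3% average is the plain mean of the six benchmark scores reported on this page; a minimal check:

```python
# Mean of the six benchmark scores listed on this page.
scores = {
    "WeirdML": 10.2,
    "MATH level 5": 18.2,
    "OTIS Mock AIME 2024-2025": 2.4,
    "MMLU": 67.9,
    "Winogrande": 50.2,
    "GPQA diamond": 20.8,
}

avg = sum(scores.values()) / len(scores)
print(round(avg, 1))  # 28.3
```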

Capabilities
Coding: 10.2 (#137 globally)
Math: 10.3 (#186 globally)
Knowledge: 46.3 (#119 globally)
Benchmark Scores
Tested on 6 benchmarks · Ranked across 3 categories
[Score distribution chart: all 233 models on a 0–100 scale, with this model's position marked]
WeirdML · 10.2

Unusual and adversarial machine learning challenges. Tests robustness of reasoning about edge cases in ML systems.

MATH level 5 · 18.2

Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.

OTIS Mock AIME 2024-2025 · 2.4

Mock AIME (American Invitational Mathematics Exam) problems from OTIS. Tests mathematical competition performance.

MMLU · 67.9

Massive Multitask Language Understanding. 57 subjects across STEM, the humanities, and social sciences. The most widely cited knowledge benchmark.

Winogrande · 50.2

Commonsense coreference resolution. Tests understanding of pronoun references in ambiguous sentences.

GPQA diamond · 20.8

Graduate-level science questions written by PhD experts. The diamond subset contains questions that experts answer correctly but skilled non-experts get wrong, testing deep understanding.
Rating bands: Excellent (85+) · Good (70–85) · Average (50–70) · Below (<50)
Links
Documentation
Community
BenchGecko API
claude-3-sonnet
Specifications
  • Type: text
  • Context: N/A
  • Released: Mar 2024
  • License: Proprietary
  • Status: benchmark-only
Available On
Anthropic · pricing TBD
Claude 3 Sonnet is a proprietary text AI model by Anthropic, released in March 2024. It has an average benchmark score of 28.3.