Better than 41% of all models
Context: N/A
Input $/1M: TBD
Output $/1M: TBD
Type: text
License: Proprietary
Benchmarks: 5 tested
About
Tested on 5 benchmarks with 37.2% average. Top scores: TriviaQA (87.5%), MMLU (71.3%), GPQA diamond (12.9%).
Capabilities
math: 7.1 (#194 globally)
knowledge: 57.2 (#64 globally)
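The 37.2% overall average and the two capability scores above are consistent with plain unweighted means of the per-benchmark results listed under Benchmark Scores below. That averaging rule is an inference from the numbers, not something the page states; a minimal Python sketch:

```python
from statistics import mean

# Per-benchmark scores as listed in the Benchmark Scores section of this page.
math_scores = {"MATH level 5": 11.7, "OTIS Mock AIME 2024-2025": 2.4}
knowledge_scores = {"TriviaQA": 87.5, "MMLU": 71.3, "GPQA diamond": 12.9}

all_scores = list(math_scores.values()) + list(knowledge_scores.values())

print(f"{mean(all_scores):.1f}")                 # 37.2 -- the About line's average
print(f"{mean(knowledge_scores.values()):.1f}")  # 57.2 -- the knowledge score
print(f"{mean(math_scores.values()):.2f}")       # 7.05 -- displayed as 7.1 on this page
```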
Benchmark Scores
Tested on 5 benchmarks · Ranked across 2 categories
Score Distribution (all 233 models): chart not reproduced here; 0-100 axis with this model's position marked.
math
MATH level 5: 11.7
Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.
OTIS Mock AIME 2024-2025: 2.4
Mock AIME (American Invitational Mathematics Exam) problems from OTIS. Tests mathematical competition performance.
knowledge
TriviaQA: 87.5
Trivia questions sourced from trivia enthusiasts and quiz websites. Tests breadth of general knowledge.
MMLU: 71.3
Massive Multitask Language Understanding. 57 subjects from STEM, humanities, and social sciences. The most widely cited knowledge benchmark.
GPQA diamond: 12.9
Graduate-level science questions written by PhD experts. The Diamond subset is the hardest tier: questions that experts answer correctly but skilled non-experts get wrong, testing deep understanding.
Score bands: Excellent (85+) · Good (70-85) · Average (50-70) · Below (<50)
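Taken literally, the legend defines four score bands. A tiny classifier under the assumption that each band owns its lower endpoint (the printed intervals meet at 50, 70, and 85, so endpoint ownership is a guess):

```python
def score_band(score: float) -> str:
    """Map a 0-100 benchmark score to one of the legend's four bands.

    Boundary ownership (>= vs >) at 50/70/85 is an assumption; the
    legend does not say which band the exact thresholds belong to.
    """
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Average"
    return "Below"

# Applied to this model's scores: TriviaQA 87.5 -> Excellent,
# MMLU 71.3 -> Good, GPQA diamond 12.9 -> Below.
print(score_band(87.5), score_band(71.3), score_band(12.9))
```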
BenchGecko API
Model ID: claude-2
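The page exposes a model ID for the BenchGecko API but does not document the API itself. The sketch below is therefore hypothetical end to end: the host, route shape, and response fields are invented for illustration, and only the claude-2 ID comes from this page.

```python
import json
import urllib.request

# Hypothetical base URL: BenchGecko's real API routes are not documented here.
BASE_URL = "https://api.benchgecko.example/v1"

def fetch_model(model_id: str) -> dict:
    """Fetch a model card by its BenchGecko model ID (e.g. 'claude-2')."""
    with urllib.request.urlopen(f"{BASE_URL}/models/{model_id}") as resp:
        return json.load(resp)

card = fetch_model("claude-2")
# Assumed response fields, mirroring this page's layout:
print(card["average"], card["capabilities"]["knowledge"]["rank"])
```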
Specifications
- Type: text
- Context: N/A
- Released: Jan 2024
- License: Proprietary
- Status: benchmark-only
Frequently Asked Questions
What is Claude 2?
Claude 2 is a proprietary, text-only AI model from Anthropic, released in January 2024. It has an average benchmark score of 41.8.