Better than 15% of all models
- Context: N/A
- Input $/1M: TBD
- Output $/1M: TBD
- Type: text
- License: Proprietary
- Benchmarks: 4 tested
About
Tested on 4 benchmarks with a 21.0% average. Top scores: MMLU (64.7%), GPQA diamond (10.6%), WeirdML (7.1%).
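For reference, the 21.0% average appears to be the unweighted mean of the four benchmark scores listed below, truncated to one decimal place. A minimal Python sketch (the variable names are illustrative, not part of any BenchGecko tooling):

    import math

    # The four benchmark scores reported on this page.
    scores = {
        "MMLU": 64.7,
        "GPQA diamond": 10.6,
        "WeirdML": 7.1,
        "OTIS Mock AIME 2024-2025": 1.9,
    }
    mean = sum(scores.values()) / len(scores)  # 21.075
    print(math.floor(mean * 10) / 10)          # 21.0, matching the page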
Capabilities
- coding: 7.1 (#140 globally)
- math: 1.9 (#206 globally)
- knowledge: 37.6 (#155 globally)
Benchmark Scores
Tested on 4 benchmarks · Ranked across 3 categories
[Chart: score distribution across all 233 models, with this model's position marked]
coding
- WeirdML: 7.1. Unusual and adversarial machine learning challenges; tests robustness of reasoning about edge cases in ML systems.

math
- OTIS Mock AIME 2024-2025: 1.9. Mock AIME (American Invitational Mathematics Exam) problems from OTIS; tests mathematical competition performance.

knowledge
- MMLU: 64.7. Massive Multitask Language Understanding: 57 subjects from STEM, the humanities, and the social sciences. The most widely cited knowledge benchmark.
- GPQA diamond: 10.6. Graduate-level science questions written by PhD experts. The diamond subset contains questions that experts answer correctly but skilled non-experts do not, testing deep understanding.
Score bands: Excellent (85+) · Good (70-85) · Average (50-70) · Below (<50)
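Applied in code, the bands above amount to simple thresholding; a short sketch (the function name is illustrative):

    def score_band(score: float) -> str:
        """Map a 0-100 benchmark score onto the legend's bands."""
        if score >= 85:
            return "Excellent"
        if score >= 70:
            return "Good"
        if score >= 50:
            return "Average"
        return "Below"

    print(score_band(64.7))  # "Average" (MMLU)
    print(score_band(21.0))  # "Below" (the overall average)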
Model ID: claude-2-1
Specifications
- Type: text
- Context: N/A
- Released: Jan 2024
- License: Proprietary
- Status: benchmark-only
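Although this page lists the model as benchmark-only with pricing TBD, Claude 2.1 was served through Anthropic's API under the id claude-2.1 (with a dot, unlike the claude-2-1 slug used here). A minimal sketch using the official anthropic Python SDK, assuming the model id is still available to your account (Anthropic has deprecated its older Claude 2.x models):

    import anthropic

    # Reads ANTHROPIC_API_KEY from the environment.
    client = anthropic.Anthropic()

    message = client.messages.create(
        model="claude-2.1",  # may be rejected if the model has been retired
        max_tokens=256,
        messages=[{"role": "user", "content": "Hello, Claude."}],
    )
    print(message.content[0].text)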
Frequently Asked Questions
Claude 2.1 is a proprietary text AI model by Anthropic, released in January 2024. Across the 4 benchmarks tested, it has an average score of 21.0%.