How much does Claude Opus 4 cost?

Claude Opus 4 costs $15.00 per million input tokens and $75.00 per million output tokens. For a typical conversation (~2,000 tokens), that's approximately $0.105 per message.

What benchmarks has Claude Opus 4 been tested on?

Claude Opus 4 has been evaluated on 19 benchmarks. Top scores: MATH level 5: 85.0, Aider polyglot: 72.0, SWE-Bench verified: 70.7.

Is Claude Opus 4 open source?

No, Claude Opus 4 is a proprietary model by Anthropic.

How does Claude Opus 4 compare to Grok 3 Mini?

Claude Opus 4 has an average score of 46.0 while Grok 3 Mini scores 46.0. Grok 3 Mini slightly outperforms Claude Opus 4 overall. Claude Opus 4 costs $15.00/1M input vs Grok 3 Mini at $0.30/1M input. See full comparison →

Home/Models/Claude Opus 4

Claude Opus 4

Name: Claude Opus 4
Price: 15 USD
Author: Anthropic

by Anthropic · Released May 2025

Multimodal

46.0

avg score

Rank #120

Compare

Better than 48% of all models

Context

200K tokens (~100 books)

Input $/1M

$15.00

Output $/1M

$75.00

Type

multimodal

License

Proprietary

Benchmarks

19 tested

Data updated today

About

Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on complex, long-running tasks and agent workflows. It sets new benchmarks in...

Tested on 19 benchmarks with 41.7% average. Top scores: MATH level 5 (85.0%), Aider polyglot (72.0%), SWE-Bench verified (70.7%).

Looking for similar performance at lower cost?
Qwen3 235B A22B Instruct 2507 scores 45.7 (99% as good) at $0.07/1M input · 100% cheaper

Capabilities

coding

49.8

#67 globally

reasoning

31.6

#81 globally

math

39.5

#106 globally

knowledge

40.1

#151 globally

Benchmark Scores

Compare All

Tested on 19 benchmarks · Ranked across 4 categories

Score Distribution (all 233 models)

0255075100

▲ You are here

codingCompare coding →

Aider polyglot

Multi-language code editing from Aider. Tests editing ability across Python, JavaScript, TypeScript, Java, C++, Go, Rust, and more.

72.0—

SWE-Bench verified

Real-world software engineering tasks from GitHub issues. Models must diagnose bugs and write patches that pass test suites. Human-verified subset of SWE-bench.

70.7—

SWE-Bench Verified (Bash Only)

SWE-bench Verified solved using only bash commands, no specialized frameworks. Tests raw terminal-based problem solving.

67.6—

reasoningCompare reasoning →

SimpleBench

Deceptively simple questions that humans find easy but AI models often get wrong. Tests common sense and reasoning gaps.

50.6—

ARC-AGI

Abstraction and Reasoning Corpus. Tests fluid intelligence through novel visual pattern recognition puzzles. Core measure of general intelligence.

35.7—

ARC-AGI-2

ARC-AGI 2, harder sequel to ARC. More complex abstract reasoning patterns that test generalization ability beyond training data.

8.6—

mathCompare math →

MATH level 5

Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.

85.0—

OTIS Mock AIME 2024-2025

Mock AIME (American Invitational Mathematics Exam) problems from OTIS. Tests mathematical competition performance.

64.4—

FrontierMath-2025-02-28-Private

Original research-level math problems created by professional mathematicians. Problems are unpublished and cannot be memorized.

4.5—

Quick compare:

vs Grok 3 Mini

vs Qwen3 235B A22B Instruct 2507

vs o1-preview

Excellent (85+) Good (70-85) Average (50-70) Below (<50)

Model Family · Anthropic Claude Opus

Claude 3 OpusJan 2024

33.7

N/AN/Actx8 benchmarks

Claude Opus 4May 2025

41.7+8.0

$15.00/M in200Kctx19 benchmarks

Claude Opus 4.1Aug 2025

41.3-0.4

$15.00/M in200Kctx14 benchmarks

Claude Opus 4.5Nov 2025

45.4+4.1

$5.00/M in(-10)200Kctx28 benchmarks

Claude Opus 4.6Feb 2026

57.5+12.1

$5.00/M in1.0Mctx(+800K)19 benchmarks

Claude Opus 4.6 (Fast)Apr 2026

43.4-14.1

$30.00/M in(+25)1.0Mctx12 benchmarks

Claude Opus 4.7Apr 2026

$5.00/M in(-25)1.0Mctx

See the full Claude Opus family →

Similar Models

Grok 3 Mini

xAI

46.0$0.30/1M

Qwen3 235B A22B Instruct 2507

Links

Info

Anthropic Pricing explorer Developers · API

Research

Technical Report

Documentation

API Docs Playground

Community

@Anthropic

BenchGecko API

claude-opus-4

Specifications

Typemultimodal
Context200K tokens (~100 books)
ReleasedMay 2025
LicenseProprietary
StatusActive
Cost / Message~$0.105

Available On

Anthropic$15.00

Frequently Asked Questions

Claude Opus 4 is a proprietary multimodal AI model by Anthropic, released in May 2025. It has an average benchmark score of 46.0. Context window: 200K tokens.

Benchmarks

MATH level 5 Aider polyglot SWE-Bench verified GPQA diamond SWE-Bench Verified (Bash Only)

Anthropic · Provider Anthropic · Economy All Models Compare Models Pricing Developers · API Context Window · Glossary

Claude Opus 4

Frequently Asked Questions

Related Models

Benchmarks

Related Pages