Anthropic's most capable model · Claude Mythos Preview. Tops SWE-bench Verified (93.9%), GPQA Diamond (94.5%), USAMO (97.6%), and HLE with tools (64.7%). Adaptive thinking at max effort, context up to 1M tokens.
Tested on 14 benchmarks with an 81.8% average score. Top scores: USAMO (97.6%), GPQA Diamond (94.5%), SWE-bench Verified (93.9%).
Real-world software engineering tasks from GitHub issues. Models must diagnose bugs and write patches that pass test suites. Human-verified subset of SWE-bench.
SWE-bench extended to non-Python languages. Tests coding ability across Java, JS, Go, Rust, and more.
Complex terminal-based engineering tasks. Models must use command-line tools, navigate filesystems, and debug systems through shell interaction.
Chart reasoning with tool use. Models can use code execution to analyze scientific figures.
Chart and figure reasoning from arXiv papers. Tests ability to interpret scientific visualizations.
Graph traversal benchmark at 256K context. Tests ability to follow breadth-first search paths in large graph structures.
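To illustrate the kind of traversal the benchmark checks (not the benchmark's actual harness, which embeds graphs in long text contexts), here is a minimal breadth-first search sketch over a hypothetical adjacency-dict graph:

```python
from collections import deque

def bfs_order(graph, start):
    """Return nodes in breadth-first order from `start`.

    `graph` is an adjacency dict mapping each node to a list of
    neighbors. Illustrative sketch only; the benchmark itself asks
    models to follow BFS paths through graphs described in text.
    """
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order

# Small example: A branches to B and C, both of which lead to D.
g = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(bfs_order(g, "A"))  # → ['A', 'B', 'C', 'D']
```

At 256K tokens the difficulty is not the algorithm itself but keeping track of which edges were already visited across a very long context.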
USA Mathematical Olympiad problems. Among the hardest math competitions, requiring elegant proofs and deep mathematical insight.
- Type: text
- Context: 1.0M tokens (~500 books)
- Released: Apr 2026
- License: Proprietary
- Status: Preview