Claude Sonnet 4.5 vs Qwen3 235B A22B Instruct 2507
Side by side · benchmarks, pricing, and signals you can act on.
Winner summary
Claude Sonnet 4.5 wins 3 of 3 shared benchmarks and leads in reasoning and coding.
Category leads
reasoning · Claude Sonnet 4.5
coding · Claude Sonnet 4.5
Hype vs Reality
Attention vs performance
Claude Sonnet 4.5
#132 by perf · no signal
Qwen3 235B A22B Instruct 2507
#99 by perf · no signal
Best value
Qwen3 235B A22B Instruct 2507
121.3x better value than Claude Sonnet 4.5
Claude Sonnet 4.5
4.7 pts/$
$9.00/M
Qwen3 235B A22B Instruct 2507
567.3 pts/$
$0.09/M
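The value figures above can be sanity-checked from the displayed numbers. This is a minimal sketch under two assumptions that the page does not state: that the multiplier is the ratio of the two pts/$ values, and that the $/M figure is the unweighted mean of the input and output prices listed in the pricing table. The function names are illustrative only.

```python
def blended_price(input_per_m: float, output_per_m: float) -> float:
    """Assumed blend: unweighted mean of input and output price per 1M tokens."""
    return (input_per_m + output_per_m) / 2


def value_ratio(pts_per_dollar_a: float, pts_per_dollar_b: float) -> float:
    """How many times more points per dollar model A delivers than model B."""
    return pts_per_dollar_a / pts_per_dollar_b


claude_blend = blended_price(3.00, 15.00)  # matches the displayed $9.00/M
qwen_blend = blended_price(0.07, 0.10)     # about 0.085, displayed as $0.09/M
ratio = value_ratio(567.3, 4.7)            # about 120.7 from the rounded figures

print(f"{claude_blend:.2f} {qwen_blend:.3f} {ratio:.1f}")
```

From the rounded figures the ratio comes out near 120.7x rather than the 121.3x shown, which would be consistent with the page computing it from unrounded pts/$ values.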
Vendor risk
Who is behind the model
Anthropic
$380.0B · Tier 1
Alibaba (Qwen)
$293.0B · Tier 1
Head to head
3 benchmarks · 2 models
Claude Sonnet 4.5 vs Qwen3 235B A22B Instruct 2507
ARC-AGI
Claude Sonnet 4.5 leads by +52.7
ARC-AGI · the original Abstraction and Reasoning Corpus, testing whether AI can solve novel visual pattern recognition tasks without memorization.
Claude Sonnet 4.5
63.7
Qwen3 235B A22B Instruct 2507
11.0
ARC-AGI-2
Claude Sonnet 4.5 leads by +12.4
ARC-AGI-2 · the second iteration of the Abstraction and Reasoning Corpus, testing novel pattern recognition and abstract reasoning without prior training data.
Claude Sonnet 4.5
13.6
Qwen3 235B A22B Instruct 2507
1.3
WeirdML
Claude Sonnet 4.5 leads by +9.0
WeirdML · tests models on unusual and adversarial machine learning tasks that require creative problem-solving beyond standard patterns.
Claude Sonnet 4.5
47.7
Qwen3 235B A22B Instruct 2507
38.7
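The "leads by" figures can be recomputed from the displayed scores. The sketch below assumes each lead is a plain difference in points (first model minus second) and that a "win" means a positive difference; the dictionary layout and function name are illustrative.

```python
# Displayed scores as (Claude Sonnet 4.5, Qwen3 235B A22B Instruct 2507).
scores = {
    "ARC-AGI": (63.7, 11.0),
    "ARC-AGI-2": (13.6, 1.3),
    "WeirdML": (47.7, 38.7),
}


def leads(table: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Return first-model score minus second-model score, rounded to 0.1."""
    return {name: round(a - b, 1) for name, (a, b) in table.items()}


result = leads(scores)
# Count benchmarks where the first model scores strictly higher.
wins = sum(1 for gap in result.values() if gap > 0)

for name, gap in result.items():
    print(f"{name}: {gap:+.1f}")
print(f"wins: {wins} of {len(scores)}")
```

The displayed scores give a 12.3-point gap on ARC-AGI-2, while the page shows +12.4; the difference is presumably due to the page subtracting unrounded scores.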
Full benchmark table
| Benchmark | Claude Sonnet 4.5 | Qwen3 235B A22B Instruct 2507 |
|---|---|---|
| ARC-AGI | 63.7 | 11.0 |
| ARC-AGI-2 | 13.6 | 1.3 |
| WeirdML | 47.7 | 38.7 |
Pricing · per 1M tokens · projected $/mo at 10M tokens
| Model | Input | Output | Context | Projected $/mo |
|---|---|---|---|---|
| Claude Sonnet 4.5 | $3.00 | $15.00 | 1.0M tokens (~500 books) | $60.00 |
| Qwen3 235B A22B Instruct 2507 | $0.07 | $0.10 | 262K tokens (~131 books) | $0.78 |
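The page does not say how the 10M tokens are divided between input and output. A 75% input / 25% output split reproduces both projected figures, so the sketch below uses that split as an inferred assumption, not a documented formula; the function and parameter names are hypothetical.

```python
def projected_monthly_cost(
    input_per_m: float,
    output_per_m: float,
    total_m_tokens: float = 10.0,
    input_share: float = 0.75,  # assumed split, inferred from the table
) -> float:
    """Monthly cost in dollars for a given token volume (in millions)."""
    input_m = total_m_tokens * input_share
    output_m = total_m_tokens - input_m
    return input_m * input_per_m + output_m * output_per_m


claude_cost = projected_monthly_cost(3.00, 15.00)  # 22.5 + 37.5 = 60.0
qwen_cost = projected_monthly_cost(0.07, 0.10)     # 0.525 + 0.25, about 0.775

print(claude_cost, qwen_cost)
```

Changing `input_share` shows how sensitive the projection is to workload shape: an output-heavy workload raises the Claude Sonnet 4.5 figure much faster, because its output price is five times its input price.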