Qwen2.5 72B Instruct vs Qwen2.5 72B Instruct Abliterated vs Llama 3.3 70B Instruct
Side by side · benchmarks, pricing, and signals you can act on.
Winner summary
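Qwen2.5 72B Instruct wins 3 of 8 benchmarks
Qwen2.5 72B Instruct leads the general and coding categories. Llama 3.3 70B Instruct also takes 3 of the 8 benchmarks, and Qwen2.5 72B Instruct Abliterated takes 2 of the 6 it was scored on.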
Category leads
general · Qwen2.5 72B Instruct
knowledge · Qwen2.5 72B Instruct Abliterated
language · Llama 3.3 70B Instruct
math · Qwen2.5 72B Instruct Abliterated
reasoning · Llama 3.3 70B Instruct
coding · Qwen2.5 72B Instruct
arena · Llama 3.3 70B Instruct
Hype vs Reality
Attention vs performance
Qwen2.5 72B Instruct
#82 by perf · no signal
Qwen2.5 72B Instruct Abliterated
#100 by perf · no signal
Llama 3.3 70B Instruct
#109 by perf · no signal
Best value
Llama 3.3 70B Instruct
1.6x better value than Qwen2.5 72B Instruct
Qwen2.5 72B Instruct
140.0 pts/$
$0.38/M
Qwen2.5 72B Instruct Abliterated
—
no price
Llama 3.3 70B Instruct
223.3 pts/$
$0.21/M
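The figures in this block can be cross-checked with a little arithmetic. The sketch below assumes that the per-model price ($0.38/M, $0.21/M) is the simple mean of the input and output prices from the pricing table, and that the "1.6x" figure is the ratio of the two pts/$ values. Both are inferences from the numbers on this page, not a documented formula, and the function name is hypothetical.

```python
def blended_price(input_per_m: float, output_per_m: float) -> float:
    """Assumed blend: the simple mean of input and output price per 1M tokens."""
    return (input_per_m + output_per_m) / 2


# Input and output prices as listed in the pricing table (USD per 1M tokens).
qwen_price = blended_price(0.36, 0.40)
llama_price = blended_price(0.10, 0.32)

# pts/$ values as listed in this block.
qwen_value = 140.0
llama_value = 223.3

# Assumed definition of "better value": ratio of the two pts/$ figures.
value_ratio = llama_value / qwen_value

print(f"Qwen2.5 72B Instruct blended price: ${qwen_price:.2f}/M")
print(f"Llama 3.3 70B Instruct blended price: ${llama_price:.2f}/M")
print(f"Value ratio: {value_ratio:.1f}x")
```

Under these assumptions the blended prices and the ratio agree with the values shown here.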
Vendor risk
Who is behind the model
Alibaba (Qwen)
$293.0B · Tier 1
HuiHui AI
private · undisclosed
Meta AI
$1.50T · Tier 1
Head to head
8 benchmarks · 3 models
Qwen2.5 72B Instruct · Qwen2.5 72B Instruct Abliterated · Llama 3.3 70B Instruct
BBH (HuggingFace)
Qwen2.5 72B Instruct leads by +1.4
Qwen2.5 72B Instruct
61.9
Qwen2.5 72B Instruct Abliterated
60.5
Llama 3.3 70B Instruct
56.6
GPQA
Qwen2.5 72B Instruct Abliterated leads by +2.7
Qwen2.5 72B Instruct
16.7
Qwen2.5 72B Instruct Abliterated
19.4
Llama 3.3 70B Instruct
10.5
IFEval
Llama 3.3 70B Instruct leads by +3.6
Qwen2.5 72B Instruct
86.4
Qwen2.5 72B Instruct Abliterated
85.9
Llama 3.3 70B Instruct
90.0
MATH Level 5
Qwen2.5 72B Instruct Abliterated leads by +0.3
Qwen2.5 72B Instruct
59.8
Qwen2.5 72B Instruct Abliterated
60.1
Llama 3.3 70B Instruct
48.3
MMLU-PRO
Qwen2.5 72B Instruct leads by +1.0
Qwen2.5 72B Instruct
51.4
Qwen2.5 72B Instruct Abliterated
50.4
Llama 3.3 70B Instruct
48.1
MUSR
Llama 3.3 70B Instruct leads by +3.2
Qwen2.5 72B Instruct
11.7
Qwen2.5 72B Instruct Abliterated
12.3
Llama 3.3 70B Instruct
15.6
Aider · Code Editing
Qwen2.5 72B Instruct leads by +6.0
Qwen2.5 72B Instruct
65.4
Llama 3.3 70B Instruct
59.4
Chatbot Arena Elo · Overall
Llama 3.3 70B Instruct leads by +15.7
Qwen2.5 72B Instruct
1302.3
Llama 3.3 70B Instruct
1318.0
Full benchmark table
| Benchmark | Qwen2.5 72B Instruct | Qwen2.5 72B Instruct Abliterated | Llama 3.3 70B Instruct |
|---|---|---|---|
| BBH (HuggingFace) | 61.9 | 60.5 | 56.6 |
| GPQA | 16.7 | 19.4 | 10.5 |
| IFEval | 86.4 | 85.9 | 90.0 |
| MATH Level 5 | 59.8 | 60.1 | 48.3 |
| MMLU-PRO | 51.4 | 50.4 | 48.1 |
| MUSR | 11.7 | 12.3 | 15.6 |
| Aider · Code Editing | 65.4 | — | 59.4 |
| Chatbot Arena Elo · Overall | 1302.3 | — | 1318.0 |
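The per-benchmark leads and the win counts can be recomputed from the table. The sketch below assumes a "lead" means the top score minus the runner-up among the models that have a score, and that higher is better on every benchmark. The helper name and data layout are illustrative only.

```python
from collections import Counter

MODELS = (
    "Qwen2.5 72B Instruct",
    "Qwen2.5 72B Instruct Abliterated",
    "Llama 3.3 70B Instruct",
)

# Scores copied from the table; None marks a missing score.
SCORES = {
    "BBH (HuggingFace)": (61.9, 60.5, 56.6),
    "GPQA": (16.7, 19.4, 10.5),
    "IFEval": (86.4, 85.9, 90.0),
    "MATH Level 5": (59.8, 60.1, 48.3),
    "MMLU-PRO": (51.4, 50.4, 48.1),
    "MUSR": (11.7, 12.3, 15.6),
    "Aider · Code Editing": (65.4, None, 59.4),
    "Chatbot Arena Elo · Overall": (1302.3, None, 1318.0),
}


def leader_and_margin(row):
    """Return (leading model, margin over the runner-up), skipping missing scores."""
    ranked = sorted(
        ((score, name) for score, name in zip(row, MODELS) if score is not None),
        reverse=True,
    )
    (best, name), (second, _) = ranked[0], ranked[1]
    return name, round(best - second, 1)


wins = Counter(leader_and_margin(row)[0] for row in SCORES.values())

for benchmark, row in SCORES.items():
    name, margin = leader_and_margin(row)
    print(f"{benchmark}: {name} leads by +{margin}")
print(dict(wins))
```

On the rounded scores shown in the table, this gives 3 wins each for Qwen2.5 72B Instruct and Llama 3.3 70B Instruct and 2 for the Abliterated variant. The MUSR margin comes out as 3.3 rather than the 3.2 listed above, which may reflect rounding of the underlying scores.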
Pricing · per 1M tokens · projected $/mo at 10M tokens
| Model | Input | Output | Context | Projected $/mo |
|---|---|---|---|---|
| Qwen2.5 72B Instruct | $0.36 | $0.40 | 33K tokens (~16 books) | $3.70 |
| Qwen2.5 72B Instruct Abliterated | — | — | — | — |
| Llama 3.3 70B Instruct | $0.10 | $0.32 | 131K tokens (~66 books) | $1.55 |
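The page does not say how the 10M tokens are split between input and output. A 75% input / 25% output split reproduces both projected figures, so the sketch below uses it as an assumption. The function and parameter names are hypothetical.

```python
def projected_monthly_cost(
    input_per_m: float,
    output_per_m: float,
    total_m_tokens: float = 10.0,
    input_share: float = 0.75,  # assumed split, inferred from the table figures
) -> float:
    """Projected monthly cost in USD for a given volume (in millions of tokens)."""
    input_m = total_m_tokens * input_share
    output_m = total_m_tokens - input_m
    return input_m * input_per_m + output_m * output_per_m


# Per-1M-token prices as listed in the pricing table.
qwen_cost = projected_monthly_cost(0.36, 0.40)
llama_cost = projected_monthly_cost(0.10, 0.32)

print(f"Qwen2.5 72B Instruct: ${qwen_cost:.2f}/mo")
print(f"Llama 3.3 70B Instruct: ${llama_cost:.2f}/mo")
```

A workload with a different input/output mix would shift these totals, so `input_share` is worth adjusting to match actual usage before comparing costs.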