Qwen2.5 32B Instruct vs Llama 3.1 70B Instruct
Side by side · benchmarks, pricing, and signals you can act on.
Winner summary
Qwen2.5 32B Instruct wins 3 of 6 shared benchmarks and leads in the general and math categories.
Category leads
- general · Qwen2.5 32B Instruct
- knowledge · Llama 3.1 70B Instruct
- language · Llama 3.1 70B Instruct
- math · Qwen2.5 32B Instruct
- reasoning · Llama 3.1 70B Instruct
Hype vs Reality
Attention vs performance
- Qwen2.5 32B Instruct: #125 by performance · no attention signal
- Llama 3.1 70B Instruct: #152 by performance · no attention signal
Best value: Llama 3.1 70B Instruct
- Qwen2.5 32B Instruct: no price listed
- Llama 3.1 70B Instruct: 94.5 pts/$ at $0.40/M tokens
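The page does not state how its pts/$ figure is computed. One plausible definition is mean benchmark score divided by the blended price per 1M tokens; a sketch under that assumption:

```python
# Hypothetical value metric in the spirit of the page's "pts/$".
# The site's exact formula is not published; this assumes:
# mean of the listed benchmark scores / blended $ per 1M tokens.
def points_per_dollar(scores, price_per_m_tokens):
    """Mean benchmark score per dollar of blended 1M-token price."""
    return sum(scores) / len(scores) / price_per_m_tokens

# Llama 3.1 70B Instruct scores from the head-to-head section, at $0.40/M:
llama_scores = [55.9, 14.2, 86.7, 38.1, 47.9, 17.7]
value = points_per_dollar(llama_scores, 0.40)
```

Under this assumption the scores above give roughly 108.5 pts/$, which does not reproduce the page's 94.5; the site likely averages a different score set or uses a different blended rate, so treat the formula as illustrative only.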
Vendor risk
Who is behind each model:
- Alibaba (Qwen) · $293.0B · Tier 1
- Meta AI · $1.50T · Tier 1
Head to head
6 benchmarks · 2 models
- BBH (HuggingFace): Qwen2.5 32B Instruct leads by +0.6 (56.5 vs 55.9)
- GPQA: Llama 3.1 70B Instruct leads by +2.5 (14.2 vs 11.7)
- IFEval: Llama 3.1 70B Instruct leads by +3.2 (86.7 vs 83.5)
- MATH Level 5: Qwen2.5 32B Instruct leads by +24.5 (62.5 vs 38.1)
- MMLU-PRO: Qwen2.5 32B Instruct leads by +4.0 (51.9 vs 47.9)
- MUSR: Llama 3.1 70B Instruct leads by +4.2 (17.7 vs 13.5)
Full benchmark table
| Benchmark | Qwen2.5 32B Instruct | Llama 3.1 70B Instruct |
|---|---|---|
| BBH (HuggingFace) | 56.5 | 55.9 |
| GPQA | 11.7 | 14.2 |
| IFEval | 83.5 | 86.7 |
| MATH Level 5 | 62.5 | 38.1 |
| MMLU-PRO | 51.9 | 47.9 |
| MUSR | 13.5 | 17.7 |
Pricing · per 1M tokens · projected $/mo at 10M tokens
| Model | Input | Output | Context | Projected $/mo |
|---|---|---|---|---|
| Qwen2.5 32B Instruct | — | — | — | — |
| Llama 3.1 70B Instruct | $0.40 | $0.40 | 131K tokens (~66 books) | $4.00 |
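The monthly projection appears to apply a single blended rate to the full 10M-token volume. A minimal sketch of that arithmetic, assuming input and output tokens are priced equally (as they are here, at $0.40/M each):

```python
# Projected monthly cost: blended price per 1M tokens times monthly
# token volume in millions. Assumes one flat rate covers all tokens,
# matching the table's single $0.40/M figure for input and output.
def projected_monthly_cost(price_per_m_tokens: float, monthly_tokens_m: float) -> float:
    return price_per_m_tokens * monthly_tokens_m

# Llama 3.1 70B Instruct at $0.40/M over 10M tokens -> $4.00/mo
cost = projected_monthly_cost(0.40, 10)
```

If input and output were priced differently, the projection would also need an assumed input/output split, which the page does not provide.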