Llama 3.1 70B Instruct vs Llama 3.3 70B Instruct
Side by side · benchmarks, pricing, and signals you can act on.
Winner summary
Llama 3.3 70B Instruct wins 6 of 8 shared benchmarks and leads in coding, arena, general, language, and math.
Category leads
coding · Llama 3.3 70B Instruct
arena · Llama 3.3 70B Instruct
general · Llama 3.3 70B Instruct
knowledge · Llama 3.1 70B Instruct
language · Llama 3.3 70B Instruct
math · Llama 3.3 70B Instruct
reasoning · Llama 3.1 70B Instruct
Hype vs Reality
Attention vs performance
Llama 3.1 70B Instruct
#152 by perf·no signal
Llama 3.3 70B Instruct
#107 by perf·no signal
Best value
Llama 3.3 70B Instruct
2.4x better value than Llama 3.1 70B Instruct
Llama 3.1 70B Instruct
94.5 pts/$
$0.40/M
Llama 3.3 70B Instruct
223.3 pts/$
$0.21/M
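The "2.4x" figure follows from the two pts/$ numbers above. Below is a minimal Python sketch of that arithmetic. The page does not say how pts/$ is derived, so the `implied_score` helper (pts/$ multiplied by the blended price per 1M tokens) is an assumption for illustration, not the site's documented method.

```python
# Value figures copied from the "Best value" cards above.
value = {
    "Llama 3.1 70B Instruct": {"pts_per_dollar": 94.5, "price_per_m": 0.40},
    "Llama 3.3 70B Instruct": {"pts_per_dollar": 223.3, "price_per_m": 0.21},
}


def value_ratio(a: str, b: str) -> float:
    """Return how many times better value model `a` is than model `b`."""
    return value[a]["pts_per_dollar"] / value[b]["pts_per_dollar"]


def implied_score(model: str) -> float:
    """ASSUMPTION: pts/$ = score / blended price, so score = pts/$ * price."""
    v = value[model]
    return v["pts_per_dollar"] * v["price_per_m"]


ratio = value_ratio("Llama 3.3 70B Instruct", "Llama 3.1 70B Instruct")
print(f"{ratio:.1f}x better value")
for name in value:
    print(f"{name}: implied score {implied_score(name):.1f}")
```

Under that assumption the unrounded ratio is about 2.36, which the page displays as 2.4x.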
Vendor risk
Who is behind the model
Llama 3.1 70B Instruct
Meta AI
$1.50T · Tier 1
Llama 3.3 70B Instruct
Meta AI
$1.50T · Tier 1
Head to head
8 benchmarks · 2 models
Llama 3.1 70B Instruct · Llama 3.3 70B Instruct
Aider · Code Editing
Llama 3.3 70B Instruct leads by +0.8
Llama 3.1 70B Instruct
58.6
Llama 3.3 70B Instruct
59.4
Chatbot Arena Elo · Overall
Llama 3.3 70B Instruct leads by +25.2
Llama 3.1 70B Instruct
1292.8
Llama 3.3 70B Instruct
1318.0
BBH (HuggingFace)
Llama 3.3 70B Instruct leads by +0.6
Llama 3.1 70B Instruct
55.9
Llama 3.3 70B Instruct
56.6
GPQA
Llama 3.1 70B Instruct leads by +3.7
Llama 3.1 70B Instruct
14.2
Llama 3.3 70B Instruct
10.5
IFEval
Llama 3.3 70B Instruct leads by +3.3
Llama 3.1 70B Instruct
86.7
Llama 3.3 70B Instruct
90.0
MATH Level 5
Llama 3.3 70B Instruct leads by +10.3
Llama 3.1 70B Instruct
38.1
Llama 3.3 70B Instruct
48.3
MMLU-PRO
Llama 3.3 70B Instruct leads by +0.3
Llama 3.1 70B Instruct
47.9
Llama 3.3 70B Instruct
48.1
MUSR
Llama 3.1 70B Instruct leads by +2.1
Llama 3.1 70B Instruct
17.7
Llama 3.3 70B Instruct
15.6
Full benchmark table
| Benchmark | Llama 3.1 70B Instruct | Llama 3.3 70B Instruct |
|---|---|---|
| Aider · Code Editing | 58.6 | 59.4 |
| Chatbot Arena Elo · Overall | 1292.8 | 1318.0 |
| BBH (HuggingFace) | 55.9 | 56.6 |
| GPQA | 14.2 | 10.5 |
| IFEval | 86.7 | 90.0 |
| MATH Level 5 | 38.1 | 48.3 |
| MMLU-PRO | 47.9 | 48.1 |
| MUSR | 17.7 | 15.6 |
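The 6-of-8 tally can be checked directly from the table. The following is a rough sketch that assumes higher is better on every benchmark listed; the `rows` list and `tally` function are illustrative names, not part of the site.

```python
# (benchmark, Llama 3.1 70B Instruct, Llama 3.3 70B Instruct), from the table above.
rows = [
    ("Aider · Code Editing", 58.6, 59.4),
    ("Chatbot Arena Elo · Overall", 1292.8, 1318.0),
    ("BBH (HuggingFace)", 55.9, 56.6),
    ("GPQA", 14.2, 10.5),
    ("IFEval", 86.7, 90.0),
    ("MATH Level 5", 38.1, 48.3),
    ("MMLU-PRO", 47.9, 48.1),
    ("MUSR", 17.7, 15.6),
]


def tally(rows):
    """Count wins per model and compute the (3.3 minus 3.1) score difference."""
    wins = {"Llama 3.1 70B Instruct": 0, "Llama 3.3 70B Instruct": 0}
    deltas = {}
    for name, score_31, score_33 in rows:
        deltas[name] = round(score_33 - score_31, 1)
        if score_33 > score_31:
            wins["Llama 3.3 70B Instruct"] += 1
        elif score_31 > score_33:
            wins["Llama 3.1 70B Instruct"] += 1
    return wins, deltas


wins, deltas = tally(rows)
print(wins)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f}")
```

Differences computed from these rounded scores are 0.1 off the leads shown in the head-to-head cards for BBH, MATH Level 5, and MMLU-PRO, presumably because the page computes its leads from unrounded scores.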
Pricing · per 1M tokens · projected $/mo at 10M tokens
| Model | Input | Output | Context | Projected $/mo |
|---|---|---|---|---|
| Llama 3.1 70B Instruct | $0.40 | $0.40 | 131K tokens (~66 books) | $4.00 |
| Llama 3.3 70B Instruct | $0.10 | $0.32 | 131K tokens (~66 books) | $1.55 |
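The page does not state how the 10M tokens are split between input and output. A 75% input / 25% output split reproduces both projected figures, so the sketch below uses it as an assumption. `projected_monthly_cost` and `input_share` are hypothetical names, and the row-to-model mapping is inferred from the blended prices shown in the "Best value" section ($0.40/M and $0.21/M).

```python
def projected_monthly_cost(
    input_price: float,
    output_price: float,
    tokens_m: float = 10.0,
    input_share: float = 0.75,  # ASSUMPTION: 3:1 input-to-output token mix
) -> float:
    """Projected monthly spend in dollars; prices are per 1M tokens."""
    input_tokens_m = tokens_m * input_share
    output_tokens_m = tokens_m - input_tokens_m
    return input_tokens_m * input_price + output_tokens_m * output_price


# Prices copied from the pricing table above.
prices = {
    "Llama 3.1 70B Instruct": (0.40, 0.40),
    "Llama 3.3 70B Instruct": (0.10, 0.32),
}

for model, (inp, out) in prices.items():
    print(f"{model}: ${projected_monthly_cost(inp, out):.2f}/mo")
```

Changing `input_share` shows how sensitive the Llama 3.3 figure is to workload mix. The Llama 3.1 figure is unaffected because its input and output prices are equal.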