Gemma 2 2b It vs Qwen2.5 3B Instruct
Side by side · benchmarks, pricing, and signals you can act on.
Winner summary
Qwen2.5 3B Instruct wins on 5/6 benchmarks
Qwen2.5 3B Instruct wins 5 of 6 shared benchmarks and leads in general, language, math, and reasoning. Gemma 2 2b It leads only in knowledge.
Category leads
general · Qwen2.5 3B Instruct
knowledge · Gemma 2 2b It
language · Qwen2.5 3B Instruct
math · Qwen2.5 3B Instruct
reasoning · Qwen2.5 3B Instruct
Hype vs Reality
Attention vs performance
Gemma 2 2b It
#161 by perf · no signal
Qwen2.5 3B Instruct
#199 by perf · no signal
Vendor risk
Who is behind the model
Google DeepMind
$4.00T · Tier 1
Alibaba (Qwen)
$293.0B · Tier 1
Head to head
6 benchmarks · 2 models
Gemma 2 2b It · Qwen2.5 3B Instruct
BBH (HuggingFace)
Qwen2.5 3B Instruct leads by +7.8
Gemma 2 2b It
18.0
Qwen2.5 3B Instruct
25.8
GPQA
Gemma 2 2b It leads by +0.2
Gemma 2 2b It
3.2
Qwen2.5 3B Instruct
3.0
IFEval
Qwen2.5 3B Instruct leads by +8.1
Gemma 2 2b It
56.7
Qwen2.5 3B Instruct
64.8
MATH Level 5
Qwen2.5 3B Instruct leads by +36.7
Gemma 2 2b It
0.1
Qwen2.5 3B Instruct
36.8
MMLU-PRO
Qwen2.5 3B Instruct leads by +7.8
Gemma 2 2b It
17.2
Qwen2.5 3B Instruct
25.1
MUSR
Qwen2.5 3B Instruct leads by +0.5
Gemma 2 2b It
7.1
Qwen2.5 3B Instruct
7.6
Full benchmark table
| Benchmark | Gemma 2 2b It | Qwen2.5 3B Instruct |
|---|---|---|
| BBH (HuggingFace) | 18.0 | 25.8 |
| GPQA | 3.2 | 3.0 |
| IFEval | 56.7 | 64.8 |
| MATH Level 5 | 0.1 | 36.8 |
| MMLU-PRO | 17.2 | 25.1 |
| MUSR | 7.1 | 7.6 |
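
The per-benchmark leads and the 5-of-6 tally can be recomputed from the table. The following is a minimal sketch, not the site's own method: the scores are copied from the table above, while the names `SCORES`, `MODELS`, `head_to_head`, and `win_counts` are hypothetical helpers introduced only for illustration.

```python
# Scores copied from the benchmark table: (Gemma 2 2b It, Qwen2.5 3B Instruct).
SCORES = {
    "BBH (HuggingFace)": (18.0, 25.8),
    "GPQA": (3.2, 3.0),
    "IFEval": (56.7, 64.8),
    "MATH Level 5": (0.1, 36.8),
    "MMLU-PRO": (17.2, 25.1),
    "MUSR": (7.1, 7.6),
}
MODELS = ("Gemma 2 2b It", "Qwen2.5 3B Instruct")


def head_to_head(scores):
    """Return {benchmark: (leader, margin)}, margin rounded to one decimal.

    Assumption: higher is better on every benchmark. An exact tie would be
    credited to the second model; the table contains no ties.
    """
    results = {}
    for name, (first, second) in scores.items():
        leader = MODELS[0] if first > second else MODELS[1]
        results[name] = (leader, round(abs(first - second), 1))
    return results


def win_counts(results):
    """Count how many benchmarks each model leads."""
    counts = {model: 0 for model in MODELS}
    for leader, _margin in results.values():
        counts[leader] += 1
    return counts


results = head_to_head(SCORES)
for name, (leader, margin) in results.items():
    print(f"{name}: {leader} leads by +{margin}")
print(win_counts(results))
```

Margins computed from the rounded table values can differ by 0.1 from the figures shown on the cards: MMLU-PRO comes out as 7.9 here against the +7.8 displayed, presumably because the page subtracts unrounded scores.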
Pricing · per 1M tokens · projected $/mo at 10M tokens
| Model | Input | Output | Context | Projected $/mo |
|---|---|---|---|---|
| — | — | — | — | — |
| — | — | — | — | — |
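
The page lists no prices for either model and does not say how the projected monthly figure is derived. As a rough sketch of one plausible calculation, the function below assumes per-1M-token input and output prices and an assumed split between input and output tokens. The function name, the `input_share` parameter, and the example prices are all hypothetical and are not taken from the page.

```python
def projected_monthly_cost(input_price_per_m, output_price_per_m,
                           monthly_tokens=10_000_000, input_share=0.5):
    """Estimate monthly spend in dollars.

    Assumptions (not stated on the page): prices are in $ per 1M tokens,
    and `input_share` is the fraction of monthly tokens that are input.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    millions = monthly_tokens / 1_000_000
    input_cost = millions * input_share * input_price_per_m
    output_cost = millions * (1.0 - input_share) * output_price_per_m
    return input_cost + output_cost


# Hypothetical prices, for illustration only: $0.10 input, $0.20 output per 1M.
example = projected_monthly_cost(0.10, 0.20)
print(f"${example:.2f} per month at 10M tokens")
```

With a different traffic mix, only `input_share` needs to change; an output-heavy workload shifts the estimate toward the output price.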