Gemma 2 9B vs Phi 3 Mini 4k Instruct
Side by side · benchmarks, pricing, and signals you can act on.
Winner summary
Gemma 2 9B wins 5 of 7 shared benchmarks. It leads five of the six categories: arena, general, knowledge, language, and math.
Category leads
arena · Gemma 2 9B
general · Gemma 2 9B
knowledge · Gemma 2 9B
language · Gemma 2 9B
math · Gemma 2 9B
reasoning · Phi 3 Mini 4k Instruct
Hype vs Reality
Attention vs performance
Gemma 2 9B
#165 by perf·no signal
Phi 3 Mini 4k Instruct
#198 by perf·no signal
Vendor risk
Who is behind the model
Google DeepMind
$4.00T·Tier 1
Microsoft
$3.00T·Big Tech
Head to head
7 benchmarks · 2 models
Chatbot Arena Elo · Overall: Gemma 2 9B 1265.0 vs Phi 3 Mini 4k Instruct 1127.2 (Gemma 2 9B leads by +137.8)
BBH (HuggingFace): Gemma 2 9B 42.1 vs Phi 3 Mini 4k Instruct 36.6 (Gemma 2 9B leads by +5.6)
GPQA: Gemma 2 9B 14.8 vs Phi 3 Mini 4k Instruct 11.0 (Gemma 2 9B leads by +3.8)
IFEval: Gemma 2 9B 74.4 vs Phi 3 Mini 4k Instruct 54.8 (Gemma 2 9B leads by +19.6)
MATH Level 5: Gemma 2 9B 19.5 vs Phi 3 Mini 4k Instruct 16.4 (Gemma 2 9B leads by +3.1)
MMLU-PRO: Gemma 2 9B 31.9 vs Phi 3 Mini 4k Instruct 33.6 (Phi 3 Mini 4k Instruct leads by +1.6)
MUSR: Gemma 2 9B 9.7 vs Phi 3 Mini 4k Instruct 13.1 (Phi 3 Mini 4k Instruct leads by +3.4)
Full benchmark table
| Benchmark | Gemma 2 9B | Phi 3 Mini 4k Instruct |
|---|---|---|
| Chatbot Arena Elo · Overall | 1265.0 | 1127.2 |
| BBH (HuggingFace) | 42.1 | 36.6 |
| GPQA | 14.8 | 11.0 |
| IFEval | 74.4 | 54.8 |
| MATH Level 5 | 19.5 | 16.4 |
| MMLU-PRO | 31.9 | 33.6 |
| MUSR | 9.7 | 13.1 |
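The 5-of-7 tally can be re-derived from the table with a few lines of Python. This is a minimal sketch, not the page's own scoring logic: the `tally` helper and the `SCORES` layout are hypothetical names, and it assumes a higher score is better on every benchmark and that there are no ties (true for the seven rows here).

```python
# Scores copied from the full benchmark table: (Gemma 2 9B, Phi 3 Mini 4k Instruct).
MODELS = ("Gemma 2 9B", "Phi 3 Mini 4k Instruct")
SCORES = {
    "Chatbot Arena Elo · Overall": (1265.0, 1127.2),
    "BBH (HuggingFace)": (42.1, 36.6),
    "GPQA": (14.8, 11.0),
    "IFEval": (74.4, 54.8),
    "MATH Level 5": (19.5, 16.4),
    "MMLU-PRO": (31.9, 33.6),
    "MUSR": (9.7, 13.1),
}


def tally(scores):
    """Count wins per model and record the lead on each benchmark.

    Assumes higher is better and no ties (hypothetical helper for illustration).
    """
    wins = {model: 0 for model in MODELS}
    leads = {}
    for benchmark, (first, second) in scores.items():
        winner = MODELS[0] if first > second else MODELS[1]
        wins[winner] += 1
        # Round to one decimal to match the precision shown in the table.
        leads[benchmark] = (winner, round(abs(first - second), 1))
    return wins, leads


wins, leads = tally(SCORES)
for benchmark, (winner, margin) in leads.items():
    print(f"{benchmark}: {winner} leads by +{margin}")
print(wins)
```

Because the table shows scores rounded to one decimal, a margin recomputed this way can differ by 0.1 from the lead shown in the head-to-head section (for example on BBH and MMLU-PRO), presumably because the page computes leads from unrounded scores.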
Pricing · per 1M tokens · projected $/mo at 10M tokens
| Model | Input | Output | Context | Projected $/mo |
|---|---|---|---|---|
| Gemma 2 9B | $0.03 | $0.09 | 8K tokens (~6,000 words) | $0.45 |
| Phi 3 Mini 4k Instruct | — | — | — | — |
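The page does not say how the 10M monthly tokens are split between input and output. As a rough sketch, a 75/25 input-to-output mix happens to reproduce the $0.45 figure from the listed per-1M prices. That split is an assumption, and `projected_monthly_cost` is a hypothetical helper, not the site's formula.

```python
def projected_monthly_cost(
    input_price_per_m: float,
    output_price_per_m: float,
    monthly_tokens: int = 10_000_000,
    input_share: float = 0.75,  # ASSUMPTION: 75% input / 25% output, not stated on the page
) -> float:
    """Blend per-1M-token prices by an assumed input/output mix."""
    millions = monthly_tokens / 1_000_000
    blended_price = (
        input_share * input_price_per_m
        + (1 - input_share) * output_price_per_m
    )
    # Round to cents, matching the precision of the pricing table.
    return round(millions * blended_price, 2)


# Prices from the priced row of the table: $0.03 input, $0.09 output per 1M tokens.
print(projected_monthly_cost(0.03, 0.09))
```

Changing `input_share` shows how sensitive the projection is to the workload: an all-input month would cost $0.30 and an all-output month $0.90 at these prices.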