Gemini 2.0 Flash vs gpt-oss-20b
Side by side · benchmarks, pricing, and signals you can act on.
Winner summary
gpt-oss-20b wins 3 of 6 shared benchmarks (GPQA, MMLU-Pro, Omni-MATH); Gemini 2.0 Flash wins the other 3 (Chatbot Arena Elo, IFEval, WildBench). gpt-oss-20b leads in knowledge and math.
Category leads
| Category | Leader |
|---|---|
| arena | Gemini 2.0 Flash |
| knowledge | gpt-oss-20b |
| language | Gemini 2.0 Flash |
| math | gpt-oss-20b |
| reasoning | Gemini 2.0 Flash |
Hype vs Reality
Attention vs performance
Gemini 2.0 Flash
#101 by perf·no signal
gpt-oss-20b
#24 by perf·no signal
Best value
gpt-oss-20b
4.1x better value than Gemini 2.0 Flash
Gemini 2.0 Flash
192.0 pts/$
$0.25/M
gpt-oss-20b
792.9 pts/$
$0.09/M
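The 4.1x figure is the ratio of the two pts/$ scores above. The sketch below reproduces it and also checks one plausible reading of the $/M figures: a simple average of the input and output prices from the pricing table, with the $0.10/$0.40 row taken as Gemini 2.0 Flash and the $0.03/$0.14 row as gpt-oss-20b. The page does not state how it blends prices, so `blended_price` is an assumption, and both function names are illustrative.

```python
# Value scores (pts/$) as listed on the page.
value = {"Gemini 2.0 Flash": 192.0, "gpt-oss-20b": 792.9}

# (input, output) prices in $ per 1M tokens, from the pricing table.
# Assumption: row order in that table matches the model order used elsewhere.
prices = {"Gemini 2.0 Flash": (0.10, 0.40), "gpt-oss-20b": (0.03, 0.14)}


def value_ratio(scores, better, worse):
    """How many times more points per dollar `better` delivers than `worse`."""
    return scores[better] / scores[worse]


def blended_price(input_price, output_price):
    """Assumption: the page's $/M figure is the plain mean of input and output."""
    return (input_price + output_price) / 2


ratio = value_ratio(value, "gpt-oss-20b", "Gemini 2.0 Flash")
print(f"{ratio:.1f}x")  # 4.1x

for model, (inp, out) in prices.items():
    print(model, f"{blended_price(inp, out):.3f}")
```

Under that assumption the blended prices come out to 0.250 and 0.085, which agree with the $0.25/M and $0.09/M shown after rounding to cents.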
Vendor risk
Who is behind the model
Google DeepMind
$4.00T·Tier 1
OpenAI
$840.0B·Tier 1
Head to head
6 benchmarks · 2 models
Gemini 2.0 Flash · gpt-oss-20b
Chatbot Arena Elo · Overall
Gemini 2.0 Flash leads by +42.3
Gemini 2.0 Flash
1360.0
gpt-oss-20b
1317.7
HELM · GPQA
gpt-oss-20b leads by +3.8
Gemini 2.0 Flash
55.6
gpt-oss-20b
59.4
HELM · IFEval
Gemini 2.0 Flash leads by +10.9
Gemini 2.0 Flash
84.1
gpt-oss-20b
73.2
HELM · MMLU-Pro
gpt-oss-20b leads by +0.3
Gemini 2.0 Flash
73.7
gpt-oss-20b
74.0
HELM · Omni-MATH
gpt-oss-20b leads by +10.6
Gemini 2.0 Flash
45.9
gpt-oss-20b
56.5
HELM · WildBench
Gemini 2.0 Flash leads by +6.3
Gemini 2.0 Flash
80.0
gpt-oss-20b
73.7
Full benchmark table
| Benchmark | Gemini 2.0 Flash | gpt-oss-20b |
|---|---|---|
| Chatbot Arena Elo · Overall | 1360.0 | 1317.7 |
| HELM · GPQA | 55.6 | 59.4 |
| HELM · IFEval | 84.1 | 73.2 |
| HELM · MMLU-Pro | 73.7 | 74.0 |
| HELM · Omni-MATH | 45.9 | 56.5 |
| HELM · WildBench | 80.0 | 73.7 |
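The per-benchmark leaders and margins in the head-to-head section follow directly from this table. A minimal sketch of that arithmetic, using the scores as listed (the function name `head_to_head` is illustrative, not something the page defines):

```python
from collections import Counter

MODELS = ("Gemini 2.0 Flash", "gpt-oss-20b")

# Scores copied from the table, in the same column order as MODELS.
scores = {
    "Chatbot Arena Elo · Overall": (1360.0, 1317.7),
    "HELM · GPQA": (55.6, 59.4),
    "HELM · IFEval": (84.1, 73.2),
    "HELM · MMLU-Pro": (73.7, 74.0),
    "HELM · Omni-MATH": (45.9, 56.5),
    "HELM · WildBench": (80.0, 73.7),
}


def head_to_head(table):
    """Return {benchmark: (leader, margin)}; leader is None on an exact tie."""
    results = {}
    for bench, (first, second) in table.items():
        if first == second:
            leader = None
        else:
            leader = MODELS[0] if first > second else MODELS[1]
        # Margins are rounded to one decimal, matching the page's precision.
        results[bench] = (leader, round(abs(first - second), 1))
    return results


results = head_to_head(scores)
wins = Counter(leader for leader, _ in results.values() if leader)

for bench, (leader, margin) in results.items():
    print(f"{bench}: {leader} leads by +{margin}")
print(dict(wins))
```

Each model takes three of the six benchmarks. Margins are on each benchmark's own scale, so the Elo gap of 42.3 is not comparable to the HELM score gaps.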
Pricing · per 1M tokens · projected $/mo at 10M tokens
| Model | Input | Output | Context | Projected $/mo |
|---|---|---|---|---|
| Gemini 2.0 Flash | $0.10 | $0.40 | 1.0M tokens (~500 books) | $1.75 |
| gpt-oss-20b | $0.03 | $0.14 | 131K tokens (~66 books) | $0.57 |
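The page does not say how the projected monthly figure is derived from the per-token prices. The listed values are consistent with 10M tokens per month split 75% input and 25% output; that split is an inference from the numbers, not something the page states. A sketch under that assumption, with the $0.10/$0.40 row taken as Gemini 2.0 Flash and the $0.03/$0.14 row as gpt-oss-20b (`projected_monthly_cost` and `input_share` are hypothetical names):

```python
def projected_monthly_cost(input_price, output_price,
                           total_tokens_m=10.0, input_share=0.75):
    """Monthly cost in dollars.

    Prices are in $ per 1M tokens and total_tokens_m is in millions of tokens.
    Assumption: input_share=0.75 is inferred from the page's figures.
    """
    input_tokens_m = total_tokens_m * input_share
    output_tokens_m = total_tokens_m * (1 - input_share)
    return input_tokens_m * input_price + output_tokens_m * output_price


gemini = projected_monthly_cost(0.10, 0.40)    # 7.5 * 0.10 + 2.5 * 0.40
gpt_oss = projected_monthly_cost(0.03, 0.14)   # 7.5 * 0.03 + 2.5 * 0.14

print(f"Gemini 2.0 Flash: ${gemini:.3f}")
print(f"gpt-oss-20b: ${gpt_oss:.3f}")
```

This gives about $1.75 and $0.575, matching the table's $1.75 and, after truncation to cents, its $0.57. Changing `input_share` shows how sensitive the projection is to workload mix, since output tokens cost roughly four to five times as much as input tokens for both models.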