Gemma 4 31B vs GPT-5.1-Codex-Max

Side by side · benchmarks, pricing, and signals you can act on.

Winner summary

GPT-5.1-Codex-Max wins 6 of 8 shared benchmarks. Leads in coding · math · knowledge.

Category leads
coding · GPT-5.1-Codex-Max
reasoning · Gemma 4 31B
language · Gemma 4 31B
math · GPT-5.1-Codex-Max
knowledge · GPT-5.1-Codex-Max
Hype vs Reality
Gemma 4 31B: #31 by perf · no signal · QUIET
GPT-5.1-Codex-Max: #10 by perf · no signal · QUIET
Best value
Gemma 4 31B offers 18.9x better value than GPT-5.1-Codex-Max.
Gemma 4 31B: 241.6 pts/$ at $0.26/M
GPT-5.1-Codex-Max: 12.8 pts/$ at $5.63/M
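
The page does not say how "pts/$" is computed. The sketch below is one reading that happens to reproduce the published figures: it assumes the score is the LiveBench Overall result and the price is the simple mean of the input and output prices per 1M tokens listed in the pricing table. Both the formula and the function names are assumptions made for illustration, not the site's documented method.

```python
# Assumption (inferred from the page's numbers, not documented by it):
#   pts/$ = LiveBench Overall score / blended price
#   blended price = mean of input and output price per 1M tokens

def blended_price(input_per_m: float, output_per_m: float) -> float:
    """Simple average of input and output price, in $ per 1M tokens."""
    return (input_per_m + output_per_m) / 2


def points_per_dollar(overall: float, input_per_m: float, output_per_m: float) -> float:
    """Benchmark points obtained per blended dollar."""
    return overall / blended_price(input_per_m, output_per_m)


# Overall scores and prices as listed on this page.
gemma = points_per_dollar(61.6, 0.13, 0.38)
codex = points_per_dollar(72.0, 1.25, 10.00)
ratio = gemma / codex

print(f"Gemma 4 31B:       {gemma:.1f} pts/$")
print(f"GPT-5.1-Codex-Max: {codex:.1f} pts/$")
print(f"Value ratio:       {ratio:.1f}x")
```

Under these assumptions the blended prices come out at about $0.26/M and $5.63/M, which matches the figures shown. A different input/output weighting would shift both the pts/$ values and the ratio.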
Vendor risk
Google DeepMind: $4.00T · Tier 1 · Low risk
OpenAI: $840.0B · Tier 1 · Medium risk
Head to head
LiveBench · Agentic Coding: Gemma 4 31B 40.0, GPT-5.1-Codex-Max 56.7. GPT-5.1-Codex-Max leads by +16.7
LiveBench · Coding: Gemma 4 31B 60.3, GPT-5.1-Codex-Max 81.4. GPT-5.1-Codex-Max leads by +21.0
LiveBench · Data Analysis: Gemma 4 31B 58.8, GPT-5.1-Codex-Max 54.9. Gemma 4 31B leads by +3.9
LiveBench · Instruction Following (IF): Gemma 4 31B 67.6, GPT-5.1-Codex-Max 67.1. Gemma 4 31B leads by +0.5
LiveBench · Language: Gemma 4 31B 71.3, GPT-5.1-Codex-Max 75.4. GPT-5.1-Codex-Max leads by +4.0
LiveBench · Mathematics: Gemma 4 31B 73.9, GPT-5.1-Codex-Max 83.7. GPT-5.1-Codex-Max leads by +9.7
LiveBench · Overall: Gemma 4 31B 61.6, GPT-5.1-Codex-Max 72.0. GPT-5.1-Codex-Max leads by +10.3
LiveBench · Reasoning: Gemma 4 31B 59.4, GPT-5.1-Codex-Max 84.6. GPT-5.1-Codex-Max leads by +25.1
Full benchmark table
Benchmark                                  Gemma 4 31B   GPT-5.1-Codex-Max
LiveBench · Agentic Coding                 40.0          56.7
LiveBench · Coding                         60.3          81.4
LiveBench · Data Analysis                  58.8          54.9
LiveBench · Instruction Following (IF)     67.6          67.1
LiveBench · Language                       71.3          75.4
LiveBench · Mathematics                    73.9          83.7
LiveBench · Overall                        61.6          72.0
LiveBench · Reasoning                      59.4          84.6
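
The "6 of 8" tally in the winner summary can be checked directly against the scores in this table. The following is a minimal sketch that counts a win for whichever model has the strictly higher score on each row, treating LiveBench · Overall as one of the eight shared benchmarks. The dictionary layout and the `tally_wins` helper are illustrative names, not part of the site.

```python
# Scores copied from the benchmark table: (Gemma 4 31B, GPT-5.1-Codex-Max).
SCORES = {
    "Agentic Coding": (40.0, 56.7),
    "Coding": (60.3, 81.4),
    "Data Analysis": (58.8, 54.9),
    "Instruction Following": (67.6, 67.1),
    "Language": (71.3, 75.4),
    "Mathematics": (73.9, 83.7),
    "Overall": (61.6, 72.0),
    "Reasoning": (59.4, 84.6),
}


def tally_wins(scores: dict[str, tuple[float, float]]) -> dict[str, int]:
    """Count benchmarks won by each model; exact ties count for neither."""
    wins = {"Gemma 4 31B": 0, "GPT-5.1-Codex-Max": 0}
    for gemma, codex in scores.values():
        if gemma > codex:
            wins["Gemma 4 31B"] += 1
        elif codex > gemma:
            wins["GPT-5.1-Codex-Max"] += 1
    return wins


wins = tally_wins(SCORES)
print(wins)
```

With these scores, Gemma 4 31B takes Data Analysis and Instruction Following, and GPT-5.1-Codex-Max takes the remaining six rows.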
Pricing · per 1M tokens · projected $/mo at 10M tokens
Model               Input    Output   Context                    Projected $/mo
Gemma 4 31B         $0.13    $0.38    262K tokens (~131 books)   $1.93
GPT-5.1-Codex-Max   $1.25    $10.00   400K tokens (~200 books)   $34.38
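
The page does not state how the 10M monthly tokens are divided between input and output. The sketch below assumes a 75% input / 25% output split, which is inferred only because it reproduces both projected figures. The split, the `projected_monthly_cost` helper, and its parameters are assumptions for illustration. `Decimal` with half-up rounding is used so the cents round the way a price display typically would.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumption inferred from the published projections, not stated on the page.
INPUT_SHARE = Decimal("0.75")


def projected_monthly_cost(
    input_per_m: str,
    output_per_m: str,
    tokens_m: Decimal = Decimal("10"),
    input_share: Decimal = INPUT_SHARE,
) -> Decimal:
    """Monthly cost in dollars for `tokens_m` million tokens.

    Prices are given as strings in $ per 1M tokens so Decimal parses them exactly.
    """
    input_cost = Decimal(input_per_m) * tokens_m * input_share
    output_cost = Decimal(output_per_m) * tokens_m * (1 - input_share)
    total = input_cost + output_cost
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


gemma_cost = projected_monthly_cost("0.13", "0.38")
codex_cost = projected_monthly_cost("1.25", "10.00")
print(f"Gemma 4 31B:       ${gemma_cost}")
print(f"GPT-5.1-Codex-Max: ${codex_cost}")
```

Workloads with a heavier output share would cost more than these projections, most noticeably for GPT-5.1-Codex-Max, whose output price is eight times its input price.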