GLM 5.1 vs Gemma 4 31B vs GPT-5.1-Codex-Max
Side by side · benchmarks, pricing, and signals you can act on.
Winner summary
GPT-5.1-Codex-Max wins 5 of 8 shared benchmarks and leads the coding and knowledge categories.
Category leads
coding · GPT-5.1-Codex-Max
reasoning · GLM 5.1
language · GLM 5.1
math · GLM 5.1
knowledge · GPT-5.1-Codex-Max
Hype vs Reality
Attention vs performance
GLM 5.1: #16 by perf · no signal
Gemma 4 31B: #33 by perf · no signal
GPT-5.1-Codex-Max: #12 by perf · no signal
Best value
Gemma 4 31B
7.8x better value than GLM 5.1
GLM 5.1: 30.9 pts/$ · $2.27/M
Gemma 4 31B: 241.6 pts/$ · $0.26/M
GPT-5.1-Codex-Max: 12.8 pts/$ · $5.63/M
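The page does not state how pts/$ is calculated. The sketch below assumes it is the LiveBench Overall score divided by a blended price, where the blended price is the plain average of the input and output prices per 1M tokens. Under that assumption the listed figures are reproduced to one decimal place. The function names `blended_price` and `points_per_dollar` are illustrative, not taken from the site.

```python
# Scores and prices copied from the benchmark and pricing tables on this page.
MODELS = {
    "GLM 5.1": {"overall": 70.2, "input": 1.05, "output": 3.50},
    "Gemma 4 31B": {"overall": 61.6, "input": 0.13, "output": 0.38},
    "GPT-5.1-Codex-Max": {"overall": 72.0, "input": 1.25, "output": 10.00},
}


def blended_price(input_price: float, output_price: float) -> float:
    """Assumption: the $/M figure is the mean of input and output prices."""
    return (input_price + output_price) / 2


def points_per_dollar(overall: float, input_price: float, output_price: float) -> float:
    """Assumption: value = LiveBench Overall score / blended $ per 1M tokens."""
    return overall / blended_price(input_price, output_price)


value = {
    name: points_per_dollar(m["overall"], m["input"], m["output"])
    for name, m in MODELS.items()
}

for name, pts in value.items():
    print(f"{name}: {pts:.1f} pts/$")

# Ratio behind the "better value than GLM 5.1" line.
ratio = value["Gemma 4 31B"] / value["GLM 5.1"]
print(f"Gemma 4 31B vs GLM 5.1: {ratio:.1f}x")
```

Because the metric divides by price, a cheap model with a mid-range score can rank far ahead of a stronger but more expensive one, which is what happens with Gemma 4 31B here.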
Vendor risk
Who is behind the model
z-ai: private · undisclosed
Google DeepMind: $4.00T · Tier 1
OpenAI: $840.0B · Tier 1
Head to head
8 benchmarks · 3 models
GLM 5.1 · Gemma 4 31B · GPT-5.1-Codex-Max
LiveBench · Agentic Coding
GPT-5.1-Codex-Max leads by +1.7
GLM 5.1
55.0
Gemma 4 31B
40.0
GPT-5.1-Codex-Max
56.7
LiveBench · Coding
GPT-5.1-Codex-Max leads by +6.0
GLM 5.1
75.4
Gemma 4 31B
60.3
GPT-5.1-Codex-Max
81.4
LiveBench · Data Analysis
GLM 5.1 leads by +4.5
GLM 5.1
63.2
Gemma 4 31B
58.8
GPT-5.1-Codex-Max
54.9
LiveBench · IF
GLM 5.1 leads by +0.9
GLM 5.1
68.5
Gemma 4 31B
67.6
GPT-5.1-Codex-Max
67.1
LiveBench · Language
GPT-5.1-Codex-Max leads by +3.6
GLM 5.1
71.8
Gemma 4 31B
71.3
GPT-5.1-Codex-Max
75.4
LiveBench · Mathematics
GLM 5.1 leads by +1.2
GLM 5.1
84.9
Gemma 4 31B
73.9
GPT-5.1-Codex-Max
83.7
LiveBench · Overall
GPT-5.1-Codex-Max leads by +1.8
GLM 5.1
70.2
Gemma 4 31B
61.6
GPT-5.1-Codex-Max
72.0
LiveBench · Reasoning
GPT-5.1-Codex-Max leads by +12.0
GLM 5.1
72.5
Gemma 4 31B
59.4
GPT-5.1-Codex-Max
84.6
Full benchmark table
| Benchmark | GLM 5.1 | Gemma 4 31B | GPT-5.1-Codex-Max |
|---|---|---|---|
| LiveBench · Agentic Coding | 55.0 | 40.0 | 56.7 |
| LiveBench · Coding | 75.4 | 60.3 | 81.4 |
| LiveBench · Data Analysis | 63.2 | 58.8 | 54.9 |
| LiveBench · IF | 68.5 | 67.6 | 67.1 |
| LiveBench · Language | 71.8 | 71.3 | 75.4 |
| LiveBench · Mathematics | 84.9 | 73.9 | 83.7 |
| LiveBench · Overall | 70.2 | 61.6 | 72.0 |
| LiveBench · Reasoning | 72.5 | 59.4 | 84.6 |
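The win count in the summary can be checked directly against the table. The sketch below is a minimal tally, assuming a "win" means the highest score on a benchmark row. The names `SCORES` and `leader` are illustrative. It does not recompute the "leads by" margins, since the displayed scores are rounded and a recomputed margin may differ by 0.1 from the one shown on the page.

```python
from collections import Counter

MODELS = ("GLM 5.1", "Gemma 4 31B", "GPT-5.1-Codex-Max")

# Rows copied from the full benchmark table, in the same column order as MODELS.
SCORES = {
    "Agentic Coding": (55.0, 40.0, 56.7),
    "Coding": (75.4, 60.3, 81.4),
    "Data Analysis": (63.2, 58.8, 54.9),
    "IF": (68.5, 67.6, 67.1),
    "Language": (71.8, 71.3, 75.4),
    "Mathematics": (84.9, 73.9, 83.7),
    "Overall": (70.2, 61.6, 72.0),
    "Reasoning": (72.5, 59.4, 84.6),
}


def leader(row: tuple[float, ...]) -> str:
    """Return the model with the highest score on one benchmark row."""
    best_index = max(range(len(MODELS)), key=lambda i: row[i])
    return MODELS[best_index]


leaders = {benchmark: leader(row) for benchmark, row in SCORES.items()}
wins = Counter(leaders.values())

for model in MODELS:
    print(f"{model}: {wins[model]} of {len(SCORES)}")
```

On these numbers GPT-5.1-Codex-Max takes 5 rows and GLM 5.1 takes 3, while Gemma 4 31B leads none.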
Pricing · per 1M tokens · projected $/mo at 10M tokens
| Model | Input | Output | Context | Projected $/mo |
|---|---|---|---|---|
| GLM 5.1 | $1.05 | $3.50 | 203K tokens (~101 books) | $16.63 |
| Gemma 4 31B | $0.13 | $0.38 | 262K tokens (~131 books) | $1.93 |
| GPT-5.1-Codex-Max | $1.25 | $10.00 | 400K tokens (~200 books) | $34.38 |
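The page does not say how the 10M tokens are divided between input and output. The sketch below assumes a 75% input / 25% output split, which is an inference: it is the split that reproduces the projected figures from the listed prices. The name `projected_monthly_cost` and the `input_share` parameter are hypothetical. `Decimal` is used so that rounding to cents is exact.

```python
from decimal import Decimal, ROUND_HALF_UP

# (input $/1M, output $/1M), in the row order of the pricing table.
PRICES = {
    "GLM 5.1": (Decimal("1.05"), Decimal("3.50")),
    "Gemma 4 31B": (Decimal("0.13"), Decimal("0.38")),
    "GPT-5.1-Codex-Max": (Decimal("1.25"), Decimal("10.00")),
}


def projected_monthly_cost(
    input_price: Decimal,
    output_price: Decimal,
    tokens_millions: Decimal = Decimal("10"),
    input_share: Decimal = Decimal("0.75"),  # assumed split, not stated on the page
) -> Decimal:
    """Monthly cost in dollars for a given token volume and input/output split."""
    input_tokens = tokens_millions * input_share
    output_tokens = tokens_millions * (Decimal("1") - input_share)
    cost = input_tokens * input_price + output_tokens * output_price
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


projected = {
    name: projected_monthly_cost(inp, out) for name, (inp, out) in PRICES.items()
}

for name, cost in projected.items():
    print(f"{name}: ${cost}/mo")
```

An output-heavy workload changes the picture: raising the output share increases the GPT-5.1-Codex-Max projection fastest, because its output price is eight times its input price.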