
DeepSeek V3.2 Speciale vs GLM 5 vs Step 3.5 Flash

Side by side · benchmarks, pricing, and signals you can act on.

Winner summary

DeepSeek V3.2 Speciale wins 5 of 10 shared benchmarks. Leads in math · knowledge.

Category leads
math · DeepSeek V3.2 Speciale
knowledge · DeepSeek V3.2 Speciale
language · GLM 5
coding · GLM 5
speed · Step 3.5 Flash
arena · GLM 5
Hype vs Reality
DeepSeek V3.2 Speciale · #6 by perf · #5 by attention · DESERVED
GLM 5 · #55 by perf · #27 by attention · UNDERRATED
Step 3.5 Flash · #9 by perf · #11 by attention · DESERVED
Best value
Step 3.5 Flash · 3.9x better value than DeepSeek V3.2 Speciale
DeepSeek V3.2 Speciale
97.8 pts/$
$0.80/M
GLM 5
45.7 pts/$
$1.26/M
Step 3.5 Flash
384.5 pts/$
$0.20/M
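The blended $/M figures above are consistent with a simple mean of input and output price, and the 3.9x claim follows from the pts/$ ratio. A minimal sanity check (the page does not state which composite score feeds pts/$, so those values are taken as given):

```python
# Re-derive the "Best value" card from the per-model pricing and pts/$ figures.
models = {
    "DeepSeek V3.2 Speciale": {"input": 0.40, "output": 1.20, "pts_per_dollar": 97.8},
    "GLM 5": {"input": 0.60, "output": 1.92, "pts_per_dollar": 45.7},
    "Step 3.5 Flash": {"input": 0.10, "output": 0.30, "pts_per_dollar": 384.5},
}

for name, m in models.items():
    # Blended $/M as the simple mean of input and output price;
    # this reproduces the $0.80 / $1.26 / $0.20 shown on the card.
    blended = (m["input"] + m["output"]) / 2
    print(f"{name}: ${blended:.2f}/M blended")

# Value advantage of the pts/$ leader over DeepSeek V3.2 Speciale.
ratio = models["Step 3.5 Flash"]["pts_per_dollar"] / \
        models["DeepSeek V3.2 Speciale"]["pts_per_dollar"]
print(f"Step 3.5 Flash value advantage: {ratio:.1f}x")  # 3.9x
```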
Vendor risk
Two vendors (DeepSeek, StepFun) flagged as higher risk; z-ai funding undisclosed
DeepSeek · $3.4B · Tier 1 · Higher risk
z-ai · private · undisclosed · Unknown
StepFun · $5.0B · Tier 1 · Higher risk
Head to head
DeepSeek V3.2 Speciale · GLM 5 · Step 3.5 Flash
OpenCompass · AIME2025
DeepSeek V3.2 Speciale leads by +0.2
DeepSeek V3.2 Speciale
96.0
GLM 5
95.8
Step 3.5 Flash
95.7
OpenCompass · GPQA-Diamond
DeepSeek V3.2 Speciale leads by +1.4
DeepSeek V3.2 Speciale
86.7
GLM 5
85.3
Step 3.5 Flash
83.7
OpenCompass · HLE
DeepSeek V3.2 Speciale leads by +0.5
DeepSeek V3.2 Speciale
28.6
GLM 5
28.1
Step 3.5 Flash
21.6
OpenCompass · IFEval
GLM 5 and Step 3.5 Flash tie, leading by +1.5
DeepSeek V3.2 Speciale
91.7
GLM 5
93.2
Step 3.5 Flash
93.2
OpenCompass · LiveCodeBenchV6
GLM 5 leads by +2.3
DeepSeek V3.2 Speciale
80.9
GLM 5
86.2
Step 3.5 Flash
83.9
OpenCompass · MMLU-Pro
DeepSeek V3.2 Speciale leads by +0.3
DeepSeek V3.2 Speciale
85.5
GLM 5
85.2
Step 3.5 Flash
83.5
Artificial Analysis · Agentic Index
Step 3.5 Flash leads by +52.0
Artificial Analysis Agentic Index · a composite score measuring how well a model performs in agentic workflows · multi-step tool use, planning, error recovery, and autonomous task completion. Aggregates results from multiple agentic benchmarks including SWE-bench, tool-use tests, and planning evaluations. The canonical single-number metric for "how good is this model as an agent?"
DeepSeek V3.2 Speciale
0.0
Step 3.5 Flash
52.0
Artificial Analysis · Coding Index
DeepSeek V3.2 Speciale leads by +6.3
Artificial Analysis Coding Index · a composite score that aggregates performance across multiple coding benchmarks into a single index. Tracks code generation quality, debugging ability, multi-language competence, and real-world software engineering tasks. Used by Artificial Analysis to rank model coding capability in a normalized, comparable format. Useful for developers choosing between models for coding-heavy workloads.
DeepSeek V3.2 Speciale
37.9
Step 3.5 Flash
31.6
Artificial Analysis · Quality Index
Step 3.5 Flash leads by +8.4
DeepSeek V3.2 Speciale
29.4
Step 3.5 Flash
37.8
Chatbot Arena Elo · Overall
GLM 5 leads by +64.2
GLM 5
1455.6
Step 3.5 Flash
1391.4
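The 64.2-point Elo gap between GLM 5 and Step 3.5 Flash implies an expected head-to-head preference rate. A sketch using the standard logistic Elo formula (Chatbot Arena's exact statistical model may differ):

```python
# Expected probability that model A is preferred over model B,
# given their Elo ratings: P(A beats B) = 1 / (1 + 10^((elo_b - elo_a) / 400)).
def expected_win_rate(elo_a: float, elo_b: float) -> float:
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400.0))

p = expected_win_rate(1455.6, 1391.4)  # GLM 5 vs Step 3.5 Flash
print(f"{p:.0%}")  # ~59% of pairwise matchups expected to favor GLM 5
```

In other words, a 64-point lead is meaningful but far from a blowout: roughly 3 in 5 matchups would go GLM 5's way.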
Full benchmark table
Benchmark · DeepSeek V3.2 Speciale · GLM 5 · Step 3.5 Flash
OpenCompass · AIME2025 · 96.0 · 95.8 · 95.7
OpenCompass · GPQA-Diamond · 86.7 · 85.3 · 83.7
OpenCompass · HLE · 28.6 · 28.1 · 21.6
OpenCompass · IFEval · 91.7 · 93.2 · 93.2
OpenCompass · LiveCodeBenchV6 · 80.9 · 86.2 · 83.9
OpenCompass · MMLU-Pro · 85.5 · 85.2 · 83.5
Artificial Analysis · Agentic Index · 0.0 · n/a · 52.0
Artificial Analysis · Coding Index · 37.9 · n/a · 31.6
Artificial Analysis · Quality Index · 29.4 · n/a · 37.8
Chatbot Arena Elo · Overall · n/a · 1455.6 · 1391.4
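The "wins 5 of 10 shared benchmarks" summary can be re-derived from this table. A minimal tally (scores copied from the table; missing entries omitted; ties credit every co-leader, which affects only IFEval):

```python
# Per-benchmark leader tally across the 10 shared benchmarks.
scores = {
    "AIME2025":        {"DeepSeek V3.2 Speciale": 96.0, "GLM 5": 95.8, "Step 3.5 Flash": 95.7},
    "GPQA-Diamond":    {"DeepSeek V3.2 Speciale": 86.7, "GLM 5": 85.3, "Step 3.5 Flash": 83.7},
    "HLE":             {"DeepSeek V3.2 Speciale": 28.6, "GLM 5": 28.1, "Step 3.5 Flash": 21.6},
    "IFEval":          {"DeepSeek V3.2 Speciale": 91.7, "GLM 5": 93.2, "Step 3.5 Flash": 93.2},
    "LiveCodeBenchV6": {"DeepSeek V3.2 Speciale": 80.9, "GLM 5": 86.2, "Step 3.5 Flash": 83.9},
    "MMLU-Pro":        {"DeepSeek V3.2 Speciale": 85.5, "GLM 5": 85.2, "Step 3.5 Flash": 83.5},
    "Agentic Index":   {"DeepSeek V3.2 Speciale": 0.0, "Step 3.5 Flash": 52.0},
    "Coding Index":    {"DeepSeek V3.2 Speciale": 37.9, "Step 3.5 Flash": 31.6},
    "Quality Index":   {"DeepSeek V3.2 Speciale": 29.4, "Step 3.5 Flash": 37.8},
    "Arena Elo":       {"GLM 5": 1455.6, "Step 3.5 Flash": 1391.4},
}

wins: dict[str, int] = {}
for bench, row in scores.items():
    best = max(row.values())
    for model, score in row.items():
        if score == best:  # ties credit every co-leader
            wins[model] = wins.get(model, 0) + 1

print(wins)  # DeepSeek V3.2 Speciale takes 5 of the 10 benchmarks
```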
Pricing · per 1M tokens · projected $/mo at 10M tokens
Model · Input · Output · Context · Projected $/mo
DeepSeek V3.2 Speciale · $0.40 · $1.20 · 164K tokens (~82 books) · $6.00
GLM 5 · $0.60 · $1.92 · 203K tokens (~101 books) · $9.30
Step 3.5 Flash · $0.10 · $0.30 · 262K tokens (~131 books) · $1.50
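The page does not state how the 10M monthly tokens are split between input and output, but a 75% input / 25% output mix reproduces all three projected figures exactly. A sketch under that assumption:

```python
# Reproduce the "Projected $/mo at 10M tokens" column.
# Assumption (not stated on the page): 75% of tokens are input, 25% output.
PRICES = {  # $ per 1M tokens: (input, output)
    "DeepSeek V3.2 Speciale": (0.40, 1.20),
    "GLM 5": (0.60, 1.92),
    "Step 3.5 Flash": (0.10, 0.30),
}

def monthly_cost(inp: float, out: float, total_m: float = 10.0,
                 input_share: float = 0.75) -> float:
    """Monthly cost in $ for total_m million tokens at the given split."""
    return total_m * input_share * inp + total_m * (1 - input_share) * out

for model, (inp, out) in PRICES.items():
    print(f"{model}: ${monthly_cost(inp, out):.2f}/mo")
# DeepSeek V3.2 Speciale: $6.00/mo
# GLM 5: $9.30/mo
# Step 3.5 Flash: $1.50/mo
```

Varying `input_share` is an easy way to re-project costs for output-heavy workloads, where Step 3.5 Flash's advantage narrows slightly but persists.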