Step 3.5 Flash vs Kimi K2 Thinking
Side by side · benchmarks, pricing, and signals you can act on.
Winner summary
Step 3.5 Flash wins 5 of 6 shared benchmarks, leading in math, knowledge, language, and coding. Kimi K2 Thinking leads only on MMLU-Pro.
Category leads
math · Step 3.5 Flash
knowledge · Step 3.5 Flash
language · Step 3.5 Flash
coding · Step 3.5 Flash
Hype vs Reality
Attention vs performance
Step 3.5 Flash
#7 by performance · #11 by attention
Kimi K2 Thinking
#77 by performance · no attention signal
Best value
Step 3.5 Flash
11.2x better value than Kimi K2 Thinking
Step 3.5 Flash
384.5 pts/$
$0.20/M
Kimi K2 Thinking
34.4 pts/$
$1.55/M
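The value figures above can be sanity-checked with a little arithmetic. The sketch below is not the site's scoring code. It assumes that each $/M figure is the simple mean of the input and output prices listed in the pricing table (an inference that happens to reproduce $0.20 and $1.55), and that "better value" is the ratio of the two pts/$ figures. The function names are hypothetical.

```python
def blended_price(input_per_m: float, output_per_m: float) -> float:
    """Assumed blend: simple mean of input and output price per 1M tokens."""
    return (input_per_m + output_per_m) / 2


def value_ratio(pts_per_dollar_a: float, pts_per_dollar_b: float) -> float:
    """How many times more points per dollar model A delivers than model B."""
    return pts_per_dollar_a / pts_per_dollar_b


# Input/output prices taken from the pricing table; matching them to models
# is inferred from the $0.20/M and $1.55/M figures shown above.
step_blended = blended_price(0.10, 0.30)
kimi_blended = blended_price(0.60, 2.50)

# pts/$ figures as shown on the page.
ratio = value_ratio(384.5, 34.4)

print(f"Step 3.5 Flash blended: ${step_blended:.2f}/M")
print(f"Kimi K2 Thinking blended: ${kimi_blended:.2f}/M")
print(f"Value ratio: {ratio:.1f}x")
```

Under these assumptions the script prints $0.20/M, $1.55/M, and 11.2x, which matches the figures shown on the page.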
Vendor risk
Mixed exposure
One or more vendors flagged
StepFun
$5.0B · Tier 1
moonshotai
private · undisclosed
Head to head
6 benchmarks · 2 models
Step 3.5 Flash · Kimi K2 Thinking
OpenCompass · AIME2025
Step 3.5 Flash leads by +1.6
Step 3.5 Flash
95.7
Kimi K2 Thinking
94.1
OpenCompass · GPQA-Diamond
Step 3.5 Flash leads by +1.0
Step 3.5 Flash
83.7
Kimi K2 Thinking
82.7
OpenCompass · HLE
Step 3.5 Flash leads by +0.3
Step 3.5 Flash
21.6
Kimi K2 Thinking
21.3
OpenCompass · IFEval
Step 3.5 Flash leads by +0.8
Step 3.5 Flash
93.2
Kimi K2 Thinking
92.4
OpenCompass · LiveCodeBenchV6
Step 3.5 Flash leads by +6.8
Step 3.5 Flash
83.9
Kimi K2 Thinking
77.1
OpenCompass · MMLU-Pro
Kimi K2 Thinking leads by +0.8
Step 3.5 Flash
83.5
Kimi K2 Thinking
84.3
Full benchmark table
| Benchmark | Step 3.5 Flash | Kimi K2 Thinking |
|---|---|---|
| OpenCompass · AIME2025 | 95.7 | 94.1 |
| OpenCompass · GPQA-Diamond | 83.7 | 82.7 |
| OpenCompass · HLE | 21.6 | 21.3 |
| OpenCompass · IFEval | 93.2 | 92.4 |
| OpenCompass · LiveCodeBenchV6 | 83.9 | 77.1 |
| OpenCompass · MMLU-Pro | 83.5 | 84.3 |
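The win count and the per-benchmark margins can be recomputed directly from the table. The following is a minimal sketch, with the scores copied from the table and a hypothetical helper name `head_to_head`. Margins are rounded to one decimal to avoid floating-point noise.

```python
# Scores copied from the table above: (Step 3.5 Flash, Kimi K2 Thinking).
scores = {
    "AIME2025": (95.7, 94.1),
    "GPQA-Diamond": (83.7, 82.7),
    "HLE": (21.6, 21.3),
    "IFEval": (93.2, 92.4),
    "LiveCodeBenchV6": (83.9, 77.1),
    "MMLU-Pro": (83.5, 84.3),
}


def head_to_head(table):
    """Return {benchmark: (leader, margin)} for a two-model score table."""
    results = {}
    for bench, (step, kimi) in table.items():
        margin = round(step - kimi, 1)
        if margin > 0:
            leader = "Step 3.5 Flash"
        elif margin < 0:
            leader = "Kimi K2 Thinking"
        else:
            leader = "tie"
        results[bench] = (leader, abs(margin))
    return results


results = head_to_head(scores)
step_wins = sum(1 for leader, _ in results.values() if leader == "Step 3.5 Flash")

for bench, (leader, margin) in results.items():
    print(f"{bench}: {leader} leads by +{margin}")
print(f"Step 3.5 Flash wins {step_wins} of {len(scores)}")
```

Most margins are under two points. The one exception is LiveCodeBenchV6, where the gap is 6.8 points, so the 5-of-6 tally overstates how far apart the two models are on the other benchmarks.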
Pricing · per 1M tokens · projected $/mo at 10M tokens
| Model | Input | Output | Context | Projected $/mo |
|---|---|---|---|---|
| Step 3.5 Flash | $0.10 | $0.30 | 262K tokens (~131 books) | $1.50 |
| Kimi K2 Thinking | $0.60 | $2.50 | 262K tokens (~131 books) | $10.75 |
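The page does not say how the 10M monthly tokens are divided between input and output. A split of 75% input and 25% output reproduces both projected figures, so the sketch below assumes that split. The constant and function names are hypothetical, and the assignment of price rows to models is inferred from the blended prices shown earlier on the page.

```python
MONTHLY_TOKENS_M = 10.0  # 10M tokens per month, as stated in the heading
INPUT_SHARE = 0.75       # ASSUMPTION: not stated on the page; inferred from the totals


def projected_monthly_cost(
    input_per_m: float,
    output_per_m: float,
    total_tokens_m: float = MONTHLY_TOKENS_M,
    input_share: float = INPUT_SHARE,
) -> float:
    """Monthly cost in dollars, given prices per 1M tokens and an input/output split."""
    input_tokens_m = total_tokens_m * input_share
    output_tokens_m = total_tokens_m - input_tokens_m
    return input_tokens_m * input_per_m + output_tokens_m * output_per_m


step_cost = projected_monthly_cost(0.10, 0.30)  # Step 3.5 Flash prices
kimi_cost = projected_monthly_cost(0.60, 2.50)  # Kimi K2 Thinking prices

print(f"Step 3.5 Flash: ${step_cost:.2f}/mo")
print(f"Kimi K2 Thinking: ${kimi_cost:.2f}/mo")
```

With the assumed split, the results are $1.50 and $10.75, matching the table. A workload with a higher share of output tokens would widen the gap, since the output prices differ by a larger factor than the input prices.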