Qwen3.5 397B A17B vs Step 3.5 Flash
Side by side · benchmarks, pricing, and signals you can act on.
Winner summary
Qwen3.5 397B A17B wins 7 of 10 shared benchmarks and leads in speed, arena, and knowledge.
Category leads
speed · Qwen3.5 397B A17B
arena · Qwen3.5 397B A17B
math · Step 3.5 Flash
knowledge · Qwen3.5 397B A17B
language · Step 3.5 Flash
coding · Step 3.5 Flash
Hype vs Reality
Attention vs performance
Qwen3.5 397B A17B
#3 by performance · no attention signal
Step 3.5 Flash
#7 by performance · #11 by attention
Best value
Step 3.5 Flash
6.7x better value than Qwen3.5 397B A17B
Qwen3.5 397B A17B
57.4 pts/$
$1.36/M
Step 3.5 Flash
384.5 pts/$
$0.20/M
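The 6.7x figure follows from the two points-per-dollar values above. A minimal sketch of that arithmetic, assuming the ratio is a plain quotient of the two pts/$ figures (the page does not state how pts/$ itself is derived, and the names below are illustrative):

```python
# Points-per-dollar figures as shown on the page.
PTS_PER_DOLLAR = {
    "Qwen3.5 397B A17B": 57.4,
    "Step 3.5 Flash": 384.5,
}


def value_ratio(better: str, worse: str) -> float:
    """Return how many times more points per dollar `better` delivers."""
    return PTS_PER_DOLLAR[better] / PTS_PER_DOLLAR[worse]


ratio = value_ratio("Step 3.5 Flash", "Qwen3.5 397B A17B")
print(f"{ratio:.1f}x")  # prints "6.7x"
```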
Vendor risk
Mixed exposure
One or more vendors flagged
Alibaba (Qwen)
$293.0B·Tier 1
StepFun
$5.0B·Tier 1
Head to head
10 benchmarks · 2 models
Qwen3.5 397B A17B · Step 3.5 Flash
Artificial Analysis · Agentic Index
Qwen3.5 397B A17B leads by +3.8
Qwen3.5 397B A17B
55.8
Step 3.5 Flash
52.0
Artificial Analysis · Coding Index
Qwen3.5 397B A17B leads by +9.6
Qwen3.5 397B A17B
41.3
Step 3.5 Flash
31.6
Artificial Analysis · Quality Index
Qwen3.5 397B A17B leads by +7.3
Qwen3.5 397B A17B
45.0
Step 3.5 Flash
37.8
Chatbot Arena Elo · Overall
Qwen3.5 397B A17B leads by +56.3
Qwen3.5 397B A17B
1447.7
Step 3.5 Flash
1391.4
OpenCompass · AIME2025
Step 3.5 Flash leads by +3.4
Qwen3.5 397B A17B
92.3
Step 3.5 Flash
95.7
OpenCompass · GPQA-Diamond
Qwen3.5 397B A17B leads by +4.7
Qwen3.5 397B A17B
88.4
Step 3.5 Flash
83.7
OpenCompass · HLE
Qwen3.5 397B A17B leads by +5.9
Qwen3.5 397B A17B
27.5
Step 3.5 Flash
21.6
OpenCompass · IFEval
Step 3.5 Flash leads by +1.7
Qwen3.5 397B A17B
91.5
Step 3.5 Flash
93.2
OpenCompass · LiveCodeBenchV6
Step 3.5 Flash leads by +0.9
Qwen3.5 397B A17B
83.0
Step 3.5 Flash
83.9
OpenCompass · MMLU-Pro
Qwen3.5 397B A17B leads by +4.1
Qwen3.5 397B A17B
87.6
Step 3.5 Flash
83.5
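To put the 56.3-point Chatbot Arena gap in context, an Elo difference can be read as an expected head-to-head preference rate. The sketch below uses the standard Elo logistic curve with a 400-point scale. That is an assumption here: the page does not say how its arena ratings are fitted, so treat the result as a rough reading and not as the leaderboard's own statistic.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings as shown in the Chatbot Arena Elo row.
qwen_elo, step_elo = 1447.7, 1391.4
p_qwen = elo_expected_score(qwen_elo, step_elo)
print(f"{p_qwen:.2f}")  # prints "0.58"
```

Under that assumption, Qwen3.5 397B A17B would be preferred in roughly 58% of direct comparisons: a real but modest edge.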
Full benchmark table
| Benchmark | Qwen3.5 397B A17B | Step 3.5 Flash |
|---|---|---|
| Artificial Analysis · Agentic Index | 55.8 | 52.0 |
| Artificial Analysis · Coding Index | 41.3 | 31.6 |
| Artificial Analysis · Quality Index | 45.0 | 37.8 |
| Chatbot Arena Elo · Overall | 1447.7 | 1391.4 |
| OpenCompass · AIME2025 | 92.3 | 95.7 |
| OpenCompass · GPQA-Diamond | 88.4 | 83.7 |
| OpenCompass · HLE | 27.5 | 21.6 |
| OpenCompass · IFEval | 91.5 | 93.2 |
| OpenCompass · LiveCodeBenchV6 | 83.0 | 83.9 |
| OpenCompass · MMLU-Pro | 87.6 | 83.5 |
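The 7-of-10 tally and the per-benchmark margins can be reproduced from the table. This is a minimal sketch assuming that a higher score is better on all ten benchmarks and that there are no ties (none occur here). The function and variable names are illustrative. Margins recomputed from the rounded scores can differ by 0.1 from the "leads by" figures shown above (Coding Index and Quality Index), presumably because the page subtracts unrounded scores; that explanation is a guess.

```python
MODELS = ("Qwen3.5 397B A17B", "Step 3.5 Flash")

# (Qwen3.5 397B A17B, Step 3.5 Flash) scores, copied from the table.
SCORES = {
    "Artificial Analysis · Agentic Index": (55.8, 52.0),
    "Artificial Analysis · Coding Index": (41.3, 31.6),
    "Artificial Analysis · Quality Index": (45.0, 37.8),
    "Chatbot Arena Elo · Overall": (1447.7, 1391.4),
    "OpenCompass · AIME2025": (92.3, 95.7),
    "OpenCompass · GPQA-Diamond": (88.4, 83.7),
    "OpenCompass · HLE": (27.5, 21.6),
    "OpenCompass · IFEval": (91.5, 93.2),
    "OpenCompass · LiveCodeBenchV6": (83.0, 83.9),
    "OpenCompass · MMLU-Pro": (87.6, 83.5),
}


def tally(scores):
    """Count wins per model and record (leader, margin) per benchmark."""
    wins = {model: 0 for model in MODELS}
    margins = {}
    for benchmark, (first, second) in scores.items():
        # Assumes higher is better and no ties.
        leader = MODELS[0] if first > second else MODELS[1]
        wins[leader] += 1
        margins[benchmark] = (leader, round(abs(first - second), 1))
    return wins, margins


wins, margins = tally(SCORES)
for model, count in wins.items():
    print(f"{model}: {count} of {len(SCORES)}")
```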
Pricing · per 1M tokens · projected $/mo at 10M tokens
| Model | Input | Output | Context | Projected $/mo |
|---|---|---|---|---|
| Qwen3.5 397B A17B | $0.39 | $2.34 | 262K tokens (~131 books) | $8.78 |
| Step 3.5 Flash | $0.10 | $0.30 | 262K tokens (~131 books) | $1.50 |
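The page does not say how the projected monthly figure is computed. The numbers are consistent with 10M tokens split 75% input and 25% output, and the per-model $/M prices quoted under "Best value" are consistent with a simple average of the input and output prices. Both are inferences from the figures, not documented behavior. A sketch under those assumptions:

```python
# Assumption inferred from the table, not stated on the page:
# 75% of monthly tokens are input, 25% are output.
INPUT_SHARE = 0.75
MONTHLY_TOKENS_M = 10.0  # monthly volume, in millions of tokens

# (input $/1M tokens, output $/1M tokens), matched to models by row order.
PRICES = {
    "Qwen3.5 397B A17B": (0.39, 2.34),
    "Step 3.5 Flash": (0.10, 0.30),
}


def projected_monthly_cost(input_price: float, output_price: float,
                           tokens_m: float = MONTHLY_TOKENS_M,
                           input_share: float = INPUT_SHARE) -> float:
    """Monthly cost in dollars for a given token volume and input/output mix."""
    input_tokens_m = tokens_m * input_share
    output_tokens_m = tokens_m - input_tokens_m
    return input_price * input_tokens_m + output_price * output_tokens_m


def blended_price(input_price: float, output_price: float) -> float:
    """Unweighted mean of input and output price (assumed blending rule)."""
    return (input_price + output_price) / 2


for model, (inp, out) in PRICES.items():
    cost = projected_monthly_cost(inp, out)
    print(f"{model}: ${cost:.3f}/mo, blended ${blended_price(inp, out):.3f}/M")
```

With these assumptions the Qwen3.5 397B A17B projection comes to 8.775 (shown on the page as $8.78) and the Step 3.5 Flash projection to 1.50. A workload with a heavier output share would widen the gap, because the output price differs by a larger factor than the input price.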