o1 vs DeepSeek V3.1
Side by side · benchmarks, pricing, and signals you can act on.
Winner summary
o1 wins on 3/4 benchmarks
o1 wins 3 of 4 shared benchmarks. Leads in knowledge · reasoning · coding.
Category leads
knowledge · o1, reasoning · o1, coding · o1
Hype vs Reality
Attention vs performance
o1
#59 by perf · no signal
DeepSeek V3.1
#88 by perf · no signal
Best value
DeepSeek V3.1
75.5x better value than o1
o1
1.5 pts/$
$37.50/M
DeepSeek V3.1
113.6 pts/$
$0.45/M
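The pts/$ figures above appear consistent with a simple formula: each model's mean score across the four shared benchmarks, divided by a blended price taken as the plain average of its input and output price per 1M tokens. That formula is inferred from the numbers on this page, not a documented method, and the function names below are hypothetical. A minimal sketch under that assumption:

```python
from statistics import mean

# Scores and per-1M-token prices copied from this page.
SCORES = {
    "o1": [83.3, 70.2, 28.1, 43.8],
    "DeepSeek V3.1": [52.8, 85.2, 28.0, 38.4],
}
PRICES = {  # (input $/M, output $/M)
    "o1": (15.00, 60.00),
    "DeepSeek V3.1": (0.15, 0.75),
}


def blended_price(model):
    """Assumed blend: unweighted average of input and output price."""
    input_price, output_price = PRICES[model]
    return (input_price + output_price) / 2


def points_per_dollar(model):
    """Assumed value metric: mean benchmark score per blended dollar."""
    return mean(SCORES[model]) / blended_price(model)


for model in SCORES:
    print(f"{model}: ${blended_price(model):.2f}/M, "
          f"{points_per_dollar(model):.1f} pts/$")
```

Under this assumption the ratio between the two value figures comes out at roughly 75.6, close to the 75.5x shown; the small gap may come from rounding or from unrounded inputs the page does not display.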
Vendor risk
Mixed exposure
One or more vendors flagged
OpenAI
$840.0B · Tier 1
DeepSeek
$3.4B · Tier 1
Head to head
4 benchmarks · 2 models
o1 · DeepSeek V3.1
Fiction.LiveBench
o1 leads by +30.5
Fiction.LiveBench · a continuously updated benchmark using recently published fiction to test reading comprehension and reasoning, preventing data contamination.
o1
83.3
DeepSeek V3.1
52.8
Lech Mazur Writing
DeepSeek V3.1 leads by +15.0
Lech Mazur Writing · evaluates creative writing ability, assessing prose quality, narrative coherence, and stylistic sophistication.
o1
70.2
DeepSeek V3.1
85.2
SimpleBench
o1 leads by +0.1
SimpleBench · tests fundamental reasoning capabilities with straightforward problems designed to expose gaps in basic logical and spatial thinking.
o1
28.1
DeepSeek V3.1
28.0
WeirdML
o1 leads by +5.5
WeirdML · tests models on unusual and adversarial machine learning tasks that require creative problem-solving beyond standard patterns.
o1
43.8
DeepSeek V3.1
38.4
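The "3 of 4" summary can be reproduced from the head-to-head scores by counting, per benchmark, which model has the higher score. The sketch below assumes higher is better on every benchmark and that there are no ties; the `tally_wins` helper is hypothetical.

```python
# Head-to-head scores copied from this page.
RESULTS = {
    "Fiction.LiveBench": {"o1": 83.3, "DeepSeek V3.1": 52.8},
    "Lech Mazur Writing": {"o1": 70.2, "DeepSeek V3.1": 85.2},
    "SimpleBench": {"o1": 28.1, "DeepSeek V3.1": 28.0},
    "WeirdML": {"o1": 43.8, "DeepSeek V3.1": 38.4},
}


def tally_wins(results):
    """Count how many benchmarks each model leads (higher score wins)."""
    wins = {}
    for scores in results.values():
        leader = max(scores, key=scores.get)
        wins[leader] = wins.get(leader, 0) + 1
    return wins


print(tally_wins(RESULTS))
```

Note that the SimpleBench lead is only 0.1 points, so the win count alone overstates how decisive that result is.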
Full benchmark table
| Benchmark | o1 | DeepSeek V3.1 |
|---|---|---|
| Fiction.LiveBench | 83.3 | 52.8 |
| Lech Mazur Writing | 70.2 | 85.2 |
| SimpleBench | 28.1 | 28.0 |
| WeirdML | 43.8 | 38.4 |
Pricing · per 1M tokens · projected $/mo at 10M tokens
| Model | Input | Output | Context | Projected $/mo |
|---|---|---|---|---|
| o1 | $15.00 | $60.00 | 200K tokens (~100 books) | $262.50 |
| DeepSeek V3.1 | $0.15 | $0.75 | 33K tokens (~16 books) | $3.00 |
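The projected monthly figures match what you get if the 10M tokens are split 75% input and 25% output. That split is not stated on the page; it is inferred from the numbers, and both the `input_share` parameter and the function name are hypothetical. A rough sketch under that assumption:

```python
def projected_monthly_cost(input_price, output_price,
                           tokens_millions=10, input_share=0.75):
    """Monthly cost in dollars for a given volume and input/output split.

    Prices are in dollars per 1M tokens. The 75/25 split is an assumption
    inferred from the table, not a documented setting.
    """
    input_cost = tokens_millions * input_share * input_price
    output_cost = tokens_millions * (1 - input_share) * output_price
    return input_cost + output_cost


# Prices copied from the pricing table ($15.00/$60.00 and $0.15/$0.75).
print(f"o1: ${projected_monthly_cost(15.00, 60.00):.2f}")
print(f"DeepSeek V3.1: ${projected_monthly_cost(0.15, 0.75):.2f}")
```

Changing `input_share` shows how sensitive the projection is to workload shape: output-heavy usage raises both bills, since output tokens cost several times more than input tokens for both models.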