
Kimi K2 0711 vs DeepSeek V3

Side by side · benchmarks, pricing, and signals you can act on.

Winner summary

Kimi K2 0711 wins 10 of 10 shared benchmarks, leading in coding, knowledge, language, math, and reasoning.

Category leads
coding · Kimi K2 0711
knowledge · Kimi K2 0711
language · Kimi K2 0711
math · Kimi K2 0711
reasoning · Kimi K2 0711
Hype vs Reality
Kimi K2 0711 · #63 by perf · no signal · QUIET
DeepSeek V3 · #45 by perf · no signal · QUIET
Best value
DeepSeek V3 offers ~2.5x better value than Kimi K2 0711.
Kimi K2 0711 · 39.2 pts/$ · $1.43/M
DeepSeek V3 · 97.5 pts/$ · $0.60/M
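The pts/$ metric is not defined on the page. A plausible reading, sketched below, divides a performance score by a blended $/M price, where the blend is a simple mean of input and output prices — an assumption, though it does reproduce the listed $1.43/M and $0.60/M from the pricing table. The function names here are illustrative, not the site's.

```python
# Sketch of the value metric, ASSUMING pts/$ = perf score / blended price,
# with blended price = simple mean of input and output $/M.

def blended_price(input_per_m: float, output_per_m: float) -> float:
    """Average of input and output price per 1M tokens (assumed blend)."""
    return (input_per_m + output_per_m) / 2

def value_pts_per_dollar(perf_score: float,
                         input_per_m: float,
                         output_per_m: float) -> float:
    """Performance points per blended dollar per 1M tokens."""
    return perf_score / blended_price(input_per_m, output_per_m)

print(blended_price(0.57, 2.30))  # 1.435, listed as $1.43/M (Kimi K2 0711)
print(blended_price(0.32, 0.89))  # 0.605, listed as $0.60/M (DeepSeek V3)
print(97.5 / 39.2)                # ratio of the listed pts/$ figures, ~2.49
```

The ~2.49 ratio of the listed pts/$ figures is where the "2.5x better value" claim comes from.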
Vendor risk
One or more vendors flagged
moonshotai · private · undisclosed · risk: Unknown
DeepSeek · $3.4B · Tier 1 · risk: Higher
Head to head
Aider Polyglot
Kimi K2 0711 leads by +10.7
Aider Polyglot · measures how well AI models can edit code across multiple programming languages using the Aider coding assistant framework.
Kimi K2 0711 59.1 · DeepSeek V3 48.4
Fiction.LiveBench
Kimi K2 0711 leads by +11.1
Fiction.LiveBench · a continuously updated benchmark using recently published fiction to test reading comprehension and reasoning, preventing data contamination.
Kimi K2 0711 61.1 · DeepSeek V3 50.0
HELM · GPQA
Kimi K2 0711 leads by +11.4
Kimi K2 0711 65.2 · DeepSeek V3 53.8
HELM · IFEval
Kimi K2 0711 leads by +1.8
Kimi K2 0711 85.0 · DeepSeek V3 83.2
HELM · MMLU-Pro
Kimi K2 0711 leads by +9.6
Kimi K2 0711 81.9 · DeepSeek V3 72.3
HELM · Omni-MATH
Kimi K2 0711 leads by +25.1
Kimi K2 0711 65.4 · DeepSeek V3 40.3
HELM · WildBench
Kimi K2 0711 leads by +3.1
Kimi K2 0711 86.2 · DeepSeek V3 83.1
Lech Mazur Writing
Kimi K2 0711 leads by +9.9
Lech Mazur Writing · evaluates creative writing ability, assessing prose quality, narrative coherence, and stylistic sophistication.
Kimi K2 0711 86.9 · DeepSeek V3 77.0
SimpleBench
Kimi K2 0711 leads by +8.9
SimpleBench · tests fundamental reasoning capabilities with straightforward problems designed to expose gaps in basic logical and spatial thinking.
Kimi K2 0711 11.6 · DeepSeek V3 2.7
WeirdML
Kimi K2 0711 leads by +3.3
WeirdML · tests models on unusual and adversarial machine learning tasks that require creative problem-solving beyond standard patterns.
Kimi K2 0711 39.4 · DeepSeek V3 36.1
Full benchmark table
Benchmark | Kimi K2 0711 | DeepSeek V3
Aider Polyglot | 59.1 | 48.4
Fiction.LiveBench | 61.1 | 50.0
HELM · GPQA | 65.2 | 53.8
HELM · IFEval | 85.0 | 83.2
HELM · MMLU-Pro | 81.9 | 72.3
HELM · Omni-MATH | 65.4 | 40.3
HELM · WildBench | 86.2 | 83.1
Lech Mazur Writing | 86.9 | 77.0
SimpleBench | 11.6 | 2.7
WeirdML | 39.4 | 36.1
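The winner summary's 10-of-10 count and the per-benchmark leads can be checked directly against the scores in the table. A small sketch:

```python
# Recompute the head-to-head margins and win count from the benchmark table.
# Scores are (Kimi K2 0711, DeepSeek V3) as listed above.
scores = {
    "Aider Polyglot":     (59.1, 48.4),
    "Fiction.LiveBench":  (61.1, 50.0),
    "HELM · GPQA":        (65.2, 53.8),
    "HELM · IFEval":      (85.0, 83.2),
    "HELM · MMLU-Pro":    (81.9, 72.3),
    "HELM · Omni-MATH":   (65.4, 40.3),
    "HELM · WildBench":   (86.2, 83.1),
    "Lech Mazur Writing": (86.9, 77.0),
    "SimpleBench":        (11.6, 2.7),
    "WeirdML":            (39.4, 36.1),
}

wins = sum(1 for kimi, deepseek in scores.values() if kimi > deepseek)
margins = {name: round(kimi - ds, 1) for name, (kimi, ds) in scores.items()}

print(f"Kimi K2 0711 wins {wins} of {len(scores)} shared benchmarks")
print(f"Largest lead: HELM · Omni-MATH, +{margins['HELM · Omni-MATH']}")
```

This reproduces the "leads by" figures on each card (e.g. +10.7 on Aider Polyglot, +25.1 on Omni-MATH).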
Pricing · per 1M tokens · projected $/mo at 10M tokens
Model | Input | Output | Context | Projected $/mo
Kimi K2 0711 | $0.57 | $2.30 | 131K tokens (~66 books) | $10.03
DeepSeek V3 | $0.32 | $0.89 | 164K tokens (~82 books) | $4.63
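The projected $/mo column follows mechanically from the per-token prices once an input/output split is fixed. The page does not state the split, but the listed figures are consistent with 10M total tokens per month at 75% input / 25% output; that split is an inference, not documented. A minimal sketch under that assumption:

```python
# Projected monthly cost from $/M prices, ASSUMING 10M tokens/month
# split 75% input / 25% output (inferred, not stated on the page).

def projected_monthly_cost(input_per_m: float, output_per_m: float,
                           total_tokens_m: float = 10.0,
                           input_share: float = 0.75) -> float:
    """Dollars per month for total_tokens_m million tokens."""
    blended = input_share * input_per_m + (1 - input_share) * output_per_m
    return total_tokens_m * blended

print(projected_monthly_cost(0.57, 2.30))  # ~10.03 (Kimi K2 0711)
print(projected_monthly_cost(0.32, 0.89))  # ~4.63  (DeepSeek V3)
```

Note this blend differs from the 50/50 average that appears to underlie the $/M figures in the value section; the page seems to use two different blends.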