Kimi K2 0711 vs Qwen3 235B A22B

Side by side · benchmarks, pricing, and signals you can act on.

Winner summary

Qwen3 235B A22B wins on 3/5 benchmarks

Qwen3 235B A22B wins 3 of the 5 shared benchmarks and leads in coding, knowledge, and reasoning. Kimi K2 0711 leads on the remaining two, Lech Mazur Writing and WeirdML.

Category leads

coding · Qwen3 235B A22B

knowledge · Qwen3 235B A22B

reasoning · Qwen3 235B A22B

Hype vs Reality

Attention vs performance

Kimi K2 0711

#63 by performance · no attention signal

QUIET

Qwen3 235B A22B

#60 by performance · no attention signal

QUIET

Best value

Qwen3 235B A22B

1.3x better value than Kimi K2 0711

Kimi K2 0711

39.2 pts/$

$1.43/M

Qwen3 235B A22B

49.6 pts/$

$1.14/M
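The 1.3x figure follows directly from the two points-per-dollar values listed above. A minimal sketch of that arithmetic, with illustrative variable names; how the underlying points score is computed is not stated here, so the values are taken as given:

```python
# Points-per-dollar values as listed in the Best value panel.
points_per_dollar = {
    "Kimi K2 0711": 39.2,
    "Qwen3 235B A22B": 49.6,
}

# Pick the best and worst value models, then take the ratio between them.
best = max(points_per_dollar, key=points_per_dollar.get)
worst = min(points_per_dollar, key=points_per_dollar.get)
value_ratio = points_per_dollar[best] / points_per_dollar[worst]

print(f"{best}: {value_ratio:.1f}x better value than {worst}")
# prints "Qwen3 235B A22B: 1.3x better value than Kimi K2 0711"
```

The unrounded ratio is about 1.27, so the advantage is slightly smaller than the rounded 1.3x suggests.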

Vendor risk

Who is behind the model

moonshotai

private · undisclosed

Unknown risk

Alibaba (Qwen)

$293.0B · Tier 1

Low risk

Head to head

5 benchmarks · 2 models

Kimi K2 0711 · Qwen3 235B A22B

Aider Polyglot

Qwen3 235B A22B leads by +0.5

Aider Polyglot · measures how well AI models can edit code across multiple programming languages using the Aider coding assistant framework.

Kimi K2 0711

59.1

Qwen3 235B A22B

59.6

Fiction.LiveBench

Qwen3 235B A22B leads by +6.6

Fiction.LiveBench · a continuously updated benchmark using recently published fiction to test reading comprehension and reasoning, preventing data contamination.

Kimi K2 0711

61.1

Qwen3 235B A22B

67.7

Lech Mazur Writing

Kimi K2 0711 leads by +3.9

Lech Mazur Writing · evaluates creative writing ability, assessing prose quality, narrative coherence, and stylistic sophistication.

Kimi K2 0711

86.9

Qwen3 235B A22B

83.0

SimpleBench

Qwen3 235B A22B leads by +5.6

SimpleBench · tests fundamental reasoning capabilities with straightforward problems designed to expose gaps in basic logical and spatial thinking.

Kimi K2 0711

11.6

Qwen3 235B A22B

17.2

WeirdML

Kimi K2 0711 leads by +2.1

WeirdML · tests models on unusual and adversarial machine learning tasks that require creative problem-solving beyond standard patterns.

Kimi K2 0711

39.4

Qwen3 235B A22B

37.3

Full benchmark table

Benchmark	Kimi K2 0711	Qwen3 235B A22B
Aider Polyglot	59.1	59.6
Fiction.LiveBench	61.1	67.7
Lech Mazur Writing	86.9	83.0
SimpleBench	11.6	17.2
WeirdML	39.4	37.3
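The per-benchmark leads and the 3-of-5 tally can be recomputed from the table. A minimal sketch, assuming higher scores are better on every benchmark and that no two scores tie; `head_to_head` is a hypothetical helper name, not something this page exposes:

```python
# Scores copied from the benchmark table: (Kimi K2 0711, Qwen3 235B A22B).
MODELS = ("Kimi K2 0711", "Qwen3 235B A22B")
scores = {
    "Aider Polyglot": (59.1, 59.6),
    "Fiction.LiveBench": (61.1, 67.7),
    "Lech Mazur Writing": (86.9, 83.0),
    "SimpleBench": (11.6, 17.2),
    "WeirdML": (39.4, 37.3),
}


def head_to_head(scores):
    """Return ({benchmark: (leader, margin)}, {model: win count})."""
    leads = {}
    wins = {name: 0 for name in MODELS}
    for bench, (kimi, qwen) in scores.items():
        # Assumes no ties; a tie would be credited to the second model.
        leader = MODELS[0] if kimi > qwen else MODELS[1]
        leads[bench] = (leader, round(abs(kimi - qwen), 1))
        wins[leader] += 1
    return leads, wins


leads, wins = head_to_head(scores)
for bench, (leader, margin) in leads.items():
    print(f"{bench}: {leader} leads by +{margin}")
print(wins)
```

Two of Qwen3 235B A22B's three wins are by more than 5 points, while its Aider Polyglot lead is only 0.5, so the overall tally is sensitive to small score changes on that benchmark.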

Pricing · per 1M tokens · projected $/mo at 10M tokens

Model	Input ($/1M tokens)	Output ($/1M tokens)	Context	Projected $/mo (10M tokens)
Kimi K2 0711	$0.57	$2.30	131K tokens (~66 books)	$10.03
Qwen3 235B A22B	$0.46	$1.82	131K tokens (~66 books)	$7.96
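The input/output mix behind the projected monthly figures is not stated. As a rough sketch, assuming 75% of the 10M tokens are input and 25% are output (an assumption, not a documented formula), the listed prices give 10.025 for Kimi K2 0711, in line with the listed $10.03, and 8.00 for Qwen3 235B A22B, close to the listed $7.96. The small gap for Qwen may come from rounding in the displayed per-token prices. The function name and parameters below are illustrative:

```python
def projected_monthly_cost(input_price, output_price,
                           total_tokens_m=10.0, input_share=0.75):
    """Estimate monthly cost in dollars.

    Prices are in dollars per 1M tokens. total_tokens_m is the monthly
    volume in millions of tokens. input_share is the ASSUMED fraction
    of tokens that are input; the source does not state this split.
    """
    input_tokens_m = total_tokens_m * input_share
    output_tokens_m = total_tokens_m - input_tokens_m
    return input_price * input_tokens_m + output_price * output_tokens_m


kimi_cost = projected_monthly_cost(0.57, 2.30)
qwen_cost = projected_monthly_cost(0.46, 1.82)

print(f"Kimi K2 0711: ${kimi_cost:.3f}")
print(f"Qwen3 235B A22B: ${qwen_cost:.3f}")
```

Changing `input_share` shows how sensitive the projection is to workload shape: output tokens cost roughly four times as much as input tokens for both models, so output-heavy workloads cost noticeably more than these estimates.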
