
Kimi K2 Thinking vs Qwen3 Max

Side by side · benchmarks, pricing, and signals you can act on.


Winner summary

Kimi K2 Thinking wins on 3/5 benchmarks

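Kimi K2 Thinking wins 3 of 5 shared benchmarks (Chess Puzzles, GPQA diamond, OTIS Mock AIME 2024-2025) and leads the knowledge and math categories. Qwen3 Max wins the other two (PostTrainBench, SimpleQA Verified).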

Category leads

knowledge · Kimi K2 Thinking
math · Kimi K2 Thinking

Hype vs Reality

Attention vs performance

Kimi K2 Thinking

#79 by performance · no attention signal

QUIET

Qwen3 Max

#49 by performance · no attention signal

QUIET


Best value

Kimi K2 Thinking

1.4x better value than Qwen3 Max

Kimi K2 Thinking

34.4 pts/$

$1.55/M

Qwen3 Max

24.9 pts/$

$2.34/M
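
The page does not say how the $/M figure or the "1.4x" claim is derived. A minimal sketch, assuming the $/M figure is the unweighted mean of the input and output prices from the pricing table (this assumption reproduces both displayed values) and that the value multiple is simply the ratio of the two pts/$ scores. The function names are illustrative, and the point total behind pts/$ is not stated on the page, so it is taken as given.

```python
# Prices per 1M tokens, copied from the pricing table on this page.
PRICES = {
    "Kimi K2 Thinking": {"input": 0.60, "output": 2.50},
    "Qwen3 Max": {"input": 0.78, "output": 3.90},
}

# Value scores (pts/$) as displayed on the page; their derivation is not given.
VALUE = {"Kimi K2 Thinking": 34.4, "Qwen3 Max": 24.9}


def blended_price(input_price: float, output_price: float) -> float:
    """Assumed definition: unweighted mean of input and output price."""
    return (input_price + output_price) / 2


def value_ratio(model_a: str, model_b: str) -> float:
    """How many times more pts/$ model_a offers than model_b."""
    return VALUE[model_a] / VALUE[model_b]


if __name__ == "__main__":
    for name, price in PRICES.items():
        print(name, round(blended_price(price["input"], price["output"]), 2))
    print(round(value_ratio("Kimi K2 Thinking", "Qwen3 Max"), 1))
```

Under these assumptions the blended prices come out at 1.55 and 2.34, and the value ratio rounds to 1.4.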


Vendor risk

Who is behind the model

moonshotai

private · undisclosed

Unknown

Alibaba (Qwen)

$293.0B · Tier 1

Low risk


Head to head

5 benchmarks · 2 models

Kimi K2 Thinking · Qwen3 Max

Chess Puzzles

Kimi K2 Thinking leads by +16.0

Chess Puzzles · tests strategic and tactical reasoning by having models solve chess puzzle positions, evaluating lookahead and pattern recognition abilities.

Kimi K2 Thinking

20.0

Qwen3 Max

4.0

GPQA diamond

Kimi K2 Thinking leads by +15.5

Graduate-Level Google-Proof QA (Diamond set) · expert-crafted questions in physics, biology, and chemistry that are difficult even for domain PhDs.

Kimi K2 Thinking

79.0

Qwen3 Max

63.5

OTIS Mock AIME 2024-2025

Kimi K2 Thinking leads by +9.7

OTIS Mock AIME 2024-2025 · simulated American Invitational Mathematics Examination problems testing advanced problem-solving skills.

Kimi K2 Thinking

83.0

Qwen3 Max

73.3

PostTrainBench

Qwen3 Max leads by +0.2

Kimi K2 Thinking

7.3

Qwen3 Max

7.4

SimpleQA Verified

Qwen3 Max leads by +35.9

SimpleQA Verified · short factual questions with verified answers, measuring factual accuracy and the tendency to hallucinate or provide incorrect information.

Kimi K2 Thinking

31.6

Qwen3 Max

67.5

Full benchmark table

Benchmark	Kimi K2 Thinking	Qwen3 Max
Chess Puzzles	20.0	4.0
GPQA diamond	79.0	63.5
OTIS Mock AIME 2024-2025	83.0	73.3
PostTrainBench	7.3	7.4
SimpleQA Verified	31.6	67.5
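
The win count and per-benchmark margins can be recomputed from the table. The sketch below is one way to do it, assuming that a higher score is better on every benchmark; the `head_to_head` helper is illustrative and not part of the site.

```python
# Scores copied from the benchmark table: (Kimi K2 Thinking, Qwen3 Max).
SCORES = {
    "Chess Puzzles": (20.0, 4.0),
    "GPQA diamond": (79.0, 63.5),
    "OTIS Mock AIME 2024-2025": (83.0, 73.3),
    "PostTrainBench": (7.3, 7.4),
    "SimpleQA Verified": (31.6, 67.5),
}

MODELS = ("Kimi K2 Thinking", "Qwen3 Max")


def head_to_head(scores):
    """Return (wins per model, {benchmark: (leader, margin)}).

    Assumes higher is better on every benchmark; ties count for neither model.
    """
    wins = {model: 0 for model in MODELS}
    margins = {}
    for benchmark, (kimi, qwen) in scores.items():
        if kimi == qwen:
            margins[benchmark] = (None, 0.0)
            continue
        leader = MODELS[0] if kimi > qwen else MODELS[1]
        wins[leader] += 1
        margins[benchmark] = (leader, round(abs(kimi - qwen), 1))
    return wins, margins


if __name__ == "__main__":
    wins, margins = head_to_head(SCORES)
    print(wins)
    for benchmark, (leader, margin) in margins.items():
        print(f"{benchmark}: {leader} leads by +{margin}")
```

Recomputing from the displayed scores gives a PostTrainBench margin of 0.1, while the page reports +0.2. The page's margin is presumably calculated from unrounded scores.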

Pricing · per 1M tokens · projected $/mo at 10M tokens

Model	Input	Output	Context	Projected $/mo
Kimi K2 Thinking	$0.60	$2.50	262K tokens (~131 books)	$10.75
Qwen3 Max	$0.78	$3.90	262K tokens (~131 books)	$15.60
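
The page does not state how the 10M tokens are divided between input and output. A minimal sketch, assuming a 75% input / 25% output split: that split is inferred because it reproduces both projected figures exactly, not because the page documents it. The function and parameter names are illustrative.

```python
def projected_monthly_cost(
    input_price: float,
    output_price: float,
    total_tokens_m: float = 10.0,  # monthly volume in millions of tokens
    input_share: float = 0.75,     # assumed share of tokens that are input
) -> float:
    """Projected monthly cost in dollars, given prices per 1M tokens."""
    input_tokens_m = total_tokens_m * input_share
    output_tokens_m = total_tokens_m - input_tokens_m
    return input_price * input_tokens_m + output_price * output_tokens_m


if __name__ == "__main__":
    # Prices per 1M tokens, copied from the pricing table.
    print(round(projected_monthly_cost(0.60, 2.50), 2))  # Kimi K2 Thinking
    print(round(projected_monthly_cost(0.78, 3.90), 2))  # Qwen3 Max
```

Changing `input_share` shows how sensitive the projection is to workload mix: output-heavy workloads widen the dollar gap between the two models, because Qwen3 Max's output price is $1.40 higher per 1M tokens while its input price is only $0.18 higher.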
