Qwen2-72B vs Qwen2.5 72B Instruct

Side by side · benchmarks, pricing, and signals you can act on.

Winner summary

Qwen2.5 72B Instruct wins 8 of 12 shared benchmarks; Qwen2-72B wins the other 4. Qwen2.5 72B Instruct leads in coding, general, language, math, and agentic; Qwen2-72B leads in knowledge and reasoning.

Category leads

coding · Qwen2.5 72B Instruct
knowledge · Qwen2-72B
general · Qwen2.5 72B Instruct
language · Qwen2.5 72B Instruct
math · Qwen2.5 72B Instruct
reasoning · Qwen2-72B
agentic · Qwen2.5 72B Instruct

Hype vs Reality

Attention vs performance

Qwen2-72B · #137 by perf · no signal · QUIET

Qwen2.5 72B Instruct · #80 by perf · no signal · QUIET

Best value

Qwen2.5 72B Instruct

Qwen2-72B · no price

Qwen2.5 72B Instruct · 208.6 pts/$ · $0.26/M

Vendor risk

Who is behind the model

Qwen2-72B · Alibaba (Qwen) · $293.0B · Tier 1 · Low risk

Qwen2.5 72B Instruct · Alibaba (Qwen) · $293.0B · Tier 1 · Low risk

Head to head

12 benchmarks · 2 models

Qwen2-72B · Qwen2.5 72B Instruct

Aider · Code Editing

Qwen2.5 72B Instruct leads by +9.8

Qwen2-72B

55.6

Qwen2.5 72B Instruct

65.4

CMMLU

Qwen2-72B leads by +4.0

Qwen2-72B

89.7

Qwen2.5 72B Instruct

85.7

GPQA diamond

Qwen2.5 72B Instruct leads by +11.2

Graduate-Level Google-Proof QA (Diamond set) · expert-crafted questions in physics, biology, and chemistry that are difficult even for domain PhDs.

Qwen2-72B

21.0

Qwen2.5 72B Instruct

32.2

BBH (HuggingFace)

Qwen2.5 72B Instruct leads by +10.0

Qwen2-72B

51.9

Qwen2.5 72B Instruct

61.9

GPQA

Qwen2-72B leads by +2.6

Qwen2-72B

19.2

Qwen2.5 72B Instruct

16.7

IFEval

Qwen2.5 72B Instruct leads by +48.1

Qwen2-72B

38.2

Qwen2.5 72B Instruct

86.4

MATH Level 5

Qwen2.5 72B Instruct leads by +28.7

Qwen2-72B

31.1

Qwen2.5 72B Instruct

59.8

MMLU-PRO

Qwen2-72B leads by +1.2

Qwen2-72B

52.6

Qwen2.5 72B Instruct

51.4

MUSR

Qwen2-72B leads by +8.0

Qwen2-72B

19.7

Qwen2.5 72B Instruct

11.7

MATH level 5

Qwen2.5 72B Instruct leads by +24.1

MATH Level 5 · the hardest tier of the MATH benchmark, featuring competition-level problems from AMC, AIME, and Olympiad-style mathematics.

Qwen2-72B

39.1

Qwen2.5 72B Instruct

63.2

MMLU

Qwen2.5 72B Instruct leads by +3.9

Massive Multitask Language Understanding · 57 subjects spanning STEM, humanities, social sciences, and more. The standard benchmark for broad knowledge.

Qwen2-72B

76.5

Qwen2.5 72B Instruct

80.4

The Agent Company

Qwen2.5 72B Instruct leads by +4.6

The Agent Company · tests AI agents on realistic corporate tasks like email management, code review, data analysis, and cross-tool workflows.

Qwen2-72B

1.1

Qwen2.5 72B Instruct

5.7

Full benchmark table

Benchmark	Qwen2-72B	Qwen2.5 72B Instruct
Aider · Code Editing	55.6	65.4
CMMLU	89.7	85.7
GPQA diamond	21.0	32.2
BBH (HuggingFace)	51.9	61.9
GPQA	19.2	16.7
IFEval	38.2	86.4
MATH Level 5	31.1	59.8
MMLU-PRO	52.6	51.4
MUSR	19.7	11.7
MATH level 5	39.1	63.2
MMLU	76.5	80.4
The Agent Company	1.1	5.7
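
The 8-of-12 tally and the per-benchmark leads can be rechecked directly from the table. The sketch below is illustrative only: the scores are copied from the rows above, and the helper names `tally` and `margins` are hypothetical. Leads recomputed from these rounded scores can differ by 0.1 from the leads shown on the page (IFEval and GPQA, for example), possibly because the page subtracts unrounded values.

```python
# Scores copied from the table: (benchmark, Qwen2-72B, Qwen2.5 72B Instruct).
# The two MATH Level 5 rows stay separate because the page lists them separately.
ROWS = [
    ("Aider · Code Editing", 55.6, 65.4),
    ("CMMLU", 89.7, 85.7),
    ("GPQA diamond", 21.0, 32.2),
    ("BBH (HuggingFace)", 51.9, 61.9),
    ("GPQA", 19.2, 16.7),
    ("IFEval", 38.2, 86.4),
    ("MATH Level 5", 31.1, 59.8),
    ("MMLU-PRO", 52.6, 51.4),
    ("MUSR", 19.7, 11.7),
    ("MATH level 5", 39.1, 63.2),
    ("MMLU", 76.5, 80.4),
    ("The Agent Company", 1.1, 5.7),
]


def tally(rows):
    """Return (wins for Qwen2-72B, wins for Qwen2.5 72B Instruct, ties)."""
    old = sum(1 for _, a, b in rows if a > b)
    new = sum(1 for _, a, b in rows if b > a)
    return old, new, len(rows) - old - new


def margins(rows):
    """Map each benchmark to (leader, lead in points) using the rounded scores.

    A tie would be attributed to Qwen2-72B; the table contains no ties.
    """
    out = {}
    for name, a, b in rows:
        leader = "Qwen2.5 72B Instruct" if b > a else "Qwen2-72B"
        out[name] = (leader, round(abs(b - a), 1))
    return out


if __name__ == "__main__":
    print(tally(ROWS))
    for name, (leader, lead) in margins(ROWS).items():
        print(f"{name}: {leader} +{lead}")
```

Keeping the rows as plain tuples makes it easy to drop a benchmark (for instance one of the two MATH Level 5 rows) and see how the tally changes.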

Pricing · per 1M tokens · projected $/mo at 10M tokens

Model	Input	Output	Context	Projected $/mo
Qwen2-72B	—	—	—	—
Qwen2.5 72B Instruct	$0.12	$0.39	33K tokens	$1.88
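
The page does not say how the 10M tokens are split between input and output when it projects the monthly cost. As a rough sketch, the hypothetical helper `monthly_cost` below shows the arithmetic. The 75% input / 25% output split is an assumption, chosen only because it reproduces the listed $1.88. Likewise, the $0.26/M figure under Best value is consistent with a simple mean of the input and output prices, which is also an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def monthly_cost(total_tokens_m, input_price, output_price, input_share):
    """Projected monthly cost in dollars, rounded to the cent.

    total_tokens_m: millions of tokens per month.
    input_price / output_price: dollars per 1M tokens.
    input_share: fraction of tokens billed as input (the rest is output).
    """
    input_m = total_tokens_m * input_share
    output_m = total_tokens_m - input_m
    cost = input_m * input_price + output_m * output_price
    return cost.quantize(CENT, rounding=ROUND_HALF_UP)


# Prices from the table, in dollars per 1M tokens.
INPUT_PRICE = Decimal("0.12")
OUTPUT_PRICE = Decimal("0.39")

# ASSUMPTION: 75% input / 25% output. The page does not state its split.
projected = monthly_cost(Decimal("10"), INPUT_PRICE, OUTPUT_PRICE, Decimal("0.75"))
print(projected)  # 1.88

# ASSUMPTION: blended price taken as the simple mean of input and output prices.
blended = ((INPUT_PRICE + OUTPUT_PRICE) / 2).quantize(CENT, rounding=ROUND_HALF_UP)
print(blended)  # 0.26
```

`Decimal` is used instead of floats so that values landing exactly on half a cent, such as 1.875, round predictably. With an even 50/50 split the same helper gives $2.55 rather than $1.88, so the projection depends heavily on the split.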