qwen2-72b
Avg score: 62.3
Rank: #58
Better than 75% of all models
Context: N/A
Input $/1M: TBD
Output $/1M: TBD
Type: text
License: Open Source
Benchmarks: 12 tested
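The headline percentile can be sanity-checked against the rank. A minimal sketch, assuming the percentile is simply the share of the 233 ranked models that place below rank #58:

```python
# Sanity check: rank #58 of 233 models means 233 - 58 = 175 models rank below.
RANK = 58
TOTAL_MODELS = 233

percentile = (TOTAL_MODELS - RANK) / TOTAL_MODELS * 100
print(f"Better than {percentile:.0f}% of all models")  # Better than 75% of all models
```

This reproduces the "Better than 75%" figure shown above.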
About
Tested on 12 benchmarks with 41.3% average. Top scores: CMMLU (89.7%), MMLU (76.5%), Aider — Code Editing (55.6%).
Capabilities
- Coding: 55.6 (#51 globally)
- Reasoning: 19.7 (#102 globally)
- Math: 35.1 (#119 globally)
- Knowledge: 51.8 (#91 globally)
- Agentic: 1.1 (#38 globally)
- Language: 38.2 (#116 globally)
- General: 51.9 (#9 globally)
Benchmark Scores
Tested on 12 benchmarks · Ranked across 7 categories
[Score distribution chart: this model's position among all 233 models, 0-100 scale]
Coding
- Aider — Code Editing: 55.6. Code editing benchmark from the Aider project. Measures the ability to apply targeted code changes while maintaining correctness and style.
Reasoning
- MUSR: 19.7. HuggingFace MuSR (Multistep Soft Reasoning). Tests multi-hop reasoning that requires chaining multiple facts together.
Math
- MATH Level 5: 39.1. Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.
- MATH Level 5 (HuggingFace): 31.1. HuggingFace evaluation of MATH Level 5 problems. Competition math requiring advanced reasoning and proof construction.
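The per-category capability scores appear to be simple means of the benchmark results listed under each category. A minimal sketch, assuming that aggregation rule, using the two MATH Level 5 results above:

```python
# Hypothetical reconstruction of the "math" capability score (35.1):
# assume it is the unweighted mean of the two MATH Level 5 results shown.
math_scores = [39.1, 31.1]

category_score = sum(math_scores) / len(math_scores)
print(round(category_score, 1))  # 35.1
```

The same rule trivially holds for single-benchmark categories such as Coding (Aider: 55.6) and Reasoning (MUSR: 19.7).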
Score legend: Excellent (85+) · Good (70-85) · Average (50-70) · Below (<50)
qwen2-72b
Specifications
- Type: text
- Context: N/A
- Released: Jan 2024
- License: Open Source
- Status: benchmark-only
Frequently Asked Questions
What is Qwen2-72B? Qwen2-72B is an open-source text AI model by Alibaba Qwen, released in January 2024. It has an average benchmark score of 62.3.