
Qwen2.5-Max

by Alibaba Qwen · Released Jan 2025

Average score: 36.0
Rank: #161
Better than 31% of all models
Context: N/A
Input $/1M: TBD
Output $/1M: TBD
Type: text
License: Open Source
Benchmarks: 8 tested
About

Tested on 8 benchmarks. The seven percentage-scored benchmarks average 41.0%. Top scores: Chatbot Arena Elo, Overall (1374.2 Elo points, not a percentage), Lech Mazur Writing (72.9%), MATH level 5 (67.2%).

Capabilities
coding: 21.8 (#124 globally)
math: 28.1 (#138 globally)
knowledge: 60.4 (#47 globally)
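
The page does not say how the category scores are derived. As a rough check, the figures are consistent with each category being the unweighted mean of its benchmarks. The sketch below uses the per-benchmark scores listed under Benchmark Scores; the assignment of benchmarks to categories is an assumption inferred from the numbers, and `category_means` is a hypothetical helper name.

```python
from statistics import mean

# Benchmark scores as listed on this page (percent).
scores = {
    "Aider polyglot": 21.8,
    "MATH level 5": 67.2,
    "OTIS Mock AIME 2024-2025": 16.0,
    "FrontierMath-2025-02-28-Private": 1.0,
    "Lech Mazur Writing": 72.9,
    "Fiction.LiveBench": 66.7,
    "GPQA diamond": 41.5,
}

# ASSUMPTION: this grouping is inferred from the numbers, not documented on the page.
categories = {
    "coding": ["Aider polyglot"],
    "math": [
        "MATH level 5",
        "OTIS Mock AIME 2024-2025",
        "FrontierMath-2025-02-28-Private",
    ],
    "knowledge": ["Lech Mazur Writing", "Fiction.LiveBench", "GPQA diamond"],
}


def category_means(scores, categories):
    """Return the unweighted mean score per category, rounded to one decimal."""
    return {
        name: round(mean(scores[b] for b in members), 1)
        for name, members in categories.items()
    }


category_averages = category_means(scores, categories)
overall_percent_mean = round(mean(scores.values()), 1)

print(category_averages)
print(overall_percent_mean)
```

Under this grouping the means come out to 21.8, 28.1, and 60.4, matching the Capabilities figures, and the mean of all seven percentage scores is 41.0, matching the About figure. The headline average of 36.0 is not reproduced by this calculation; the page does not state how it is derived.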
Benchmark Scores
Tested on 8 benchmarks · Ranked across 4 categories
[Chart: Score Distribution (all 233 models), axis 0 to 100, with this model's position marked]
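
The "Better than 31% of all models" figure is consistent with Rank #161 out of 233 models. The page does not give the formula; the sketch below shows one simple formula that reproduces the number, with `percent_outranked` as a hypothetical helper name.

```python
def percent_outranked(rank: int, total: int) -> int:
    """Share of models ranked below `rank`, as a whole percent.

    ASSUMPTION: rank 1 is best and ties are ignored; the page does not
    document how its percentile is computed.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round(100 * (total - rank) / total)


percentile = percent_outranked(161, 233)
print(percentile)
```

With 72 of 233 models ranked lower, the share rounds to 31%.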
Aider polyglot

Multi-language code editing from Aider. Tests editing ability across Python, JavaScript, TypeScript, Java, C++, Go, Rust, and more.

21.8
MATH level 5

Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.

67.2
OTIS Mock AIME 2024-2025

Mock AIME (American Invitational Mathematics Examination) problems from OTIS. Tests mathematical competition performance.

16.0
FrontierMath-2025-02-28-Private

Original research-level math problems created by professional mathematicians. Problems are unpublished and cannot be memorized.

1.0
Lech Mazur Writing

Writing quality evaluation by Lech Mazur. Tests prose quality, coherence, and stylistic ability.

72.9
Fiction.LiveBench

Long-context story comprehension benchmark from Fiction.live. Tests literary comprehension and creative text understanding.

66.7
GPQA diamond

Graduate-level science questions written by PhD-level experts. The Diamond subset keeps the questions that both expert validators answered correctly and most non-experts answered incorrectly, testing deep understanding.

41.5
Score bands: Excellent (85+), Good (70-85), Average (50-70), Below (<50)
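
The score bands in the legend can be applied to any benchmark score on this page. The legend's ranges overlap at 70 and 85, so the sketch below assumes each lower bound is inclusive; that boundary rule and the function name `score_band` are assumptions, not something the page specifies.

```python
def score_band(score: float) -> str:
    """Map a 0-100 score to the legend's band name.

    ASSUMPTION: lower bounds are inclusive (85.0 is Excellent, 70.0 is Good,
    50.0 is Average), since the legend does not say which band owns a boundary.
    """
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Average"
    return "Below"


# Three scores taken from the benchmark list on this page.
bands = {
    "Lech Mazur Writing": score_band(72.9),
    "MATH level 5": score_band(67.2),
    "GPQA diamond": score_band(41.5),
}
print(bands)
```

Under that assumption, Lech Mazur Writing (72.9) falls in Good, MATH level 5 (67.2) in Average, and GPQA diamond (41.5) in Below.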
BenchGecko API: qwen2-5-max
Specifications
  • Type: text
  • Context: N/A
  • Released: Jan 2025
  • License: Open Source
  • Status: benchmark-only
Available On
Alibaba Qwen: TBD
Qwen2.5-Max is a text AI model by Alibaba Qwen, listed here as open source and released in January 2025. It has an average benchmark score of 36.0.