Avg score: 19.6
Rank #207
Better than 11% of all models
Context: 131K tokens (~66 books)
Input $/1M: $0.15
Output $/1M: $0.58
Type: text
License: Open Source
Benchmarks: 8 tested
About
QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks,...
Tested on 8 benchmarks with 13.5% average. Top scores: Chatbot Arena Elo, Overall (1335.7, an Elo rating rather than a percentage), IFEval (39.8%), Aider polyglot (20.9%).
Looking for similar performance at lower cost?
Llama 3.2 1B Instruct scores 19.9 (102% as good) at $0.03/1M input · 82% cheaper
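The comparison figures above can be roughly reproduced from the numbers on this page. The sketch below is only an illustration: the function names are hypothetical, and the page does not say how it computes "82% cheaper". On input price alone the saving works out to 80%, so the page's figure presumably also factors in output pricing, which is not shown here for the Llama model.

```python
def relative_score(candidate: float, baseline: float) -> float:
    """Candidate score as a percentage of the baseline score."""
    return 100.0 * candidate / baseline


def price_saving(candidate_price: float, baseline_price: float) -> float:
    """Percentage saved by paying candidate_price instead of baseline_price."""
    return 100.0 * (baseline_price - candidate_price) / baseline_price


# Figures from this page: scores 19.9 vs 19.6, input prices $0.03 vs $0.15 per 1M tokens.
print(f"{relative_score(19.9, 19.6):.0f}% as good")           # 102% as good
print(f"{price_saving(0.03, 0.15):.0f}% cheaper on input")    # 80% cheaper on input
```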
Capabilities
- coding: 20.9 (#125 globally)
- reasoning: 11.1 (#130 globally)
- math: 16.1 (#180 globally)
- knowledge: 1.8 (#226 globally)
- general: 2.9 (#68 globally)
- language: 39.8 (#113 globally)
Benchmark Scores
Tested on 8 benchmarks · Ranked across 7 categories
[Figure: score distribution across all 233 models on a 0 to 100 scale, with this model's position marked.]
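The "Better than 11% of all models" figure is consistent with rank #207 in a pool of 233 models. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the page does not state its exact formula, so this is one plausible reading.

```python
def percent_better_than(rank: int, total: int) -> float:
    """Percentage of models ranked below `rank` in a pool of `total` models."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * (total - rank) / total


# Figures from this page: rank #207 among 233 models.
share = percent_better_than(207, 233)
print(f"Better than {share:.0f}% of all models")  # Better than 11% of all models
```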
coding
Aider polyglot: 20.9. Multi-language code editing from Aider. Tests editing ability across Python, JavaScript, TypeScript, Java, C++, Go, Rust, and more.
reasoning
MUSR: 11.1. HuggingFace MuSR (Multi-Step Reasoning). Tests multi-hop reasoning requiring chaining multiple facts together.
math
MATH Level 5: 16.1. HuggingFace evaluation of MATH Level 5 problems. Competition math requiring advanced reasoning and proof construction.
Score bands: Excellent (85+), Good (70-85), Average (50-70), Below (<50)
Links
BenchGecko API
qwq-32b
Specifications
- Type: text
- Context: 131K tokens (~66 books)
- Released: Mar 2025
- License: Open Source
- Status: Active
- Cost / Message: ~$0.001
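The per-message cost follows from the per-token prices listed on this page ($0.15 per 1M input tokens, $0.58 per 1M output tokens). The sketch below shows the calculation; the token counts are hypothetical, since the page does not say what message size it assumes, and the function name is illustrative only.

```python
INPUT_USD_PER_M = 0.15   # input price per 1M tokens, from this page
OUTPUT_USD_PER_M = 0.58  # output price per 1M tokens, from this page


def message_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one message with the given token counts."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Assumed message size (not from the page): 1,000 input and 1,500 output tokens.
cost = message_cost(input_tokens=1_000, output_tokens=1_500)
print(f"~${cost:.3f}")  # ~$0.001
```

Because output tokens cost almost four times as much as input tokens, long reasoning traces will dominate the bill for a model of this kind.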
Frequently Asked Questions
QwQ 32B is an open-source text AI model by Alibaba Qwen, released in March 2025. It has an average benchmark score of 19.6. Context window: 131K tokens.