How much does QwQ 32B cost?

QwQ 32B costs $0.15 per million input tokens and $0.58 per million output tokens. For a typical conversation (~2,000 tokens), that's approximately $0.001 per message.
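The per-message estimate can be reproduced with a short calculation. This is only a sketch: the page does not say how the ~2,000 tokens divide between input and output, so the even split below is an assumption, and `message_cost` is a hypothetical helper name.

```python
# Prices quoted above, in USD per one million tokens.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.58


def message_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the quoted prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Assumption: ~2,000 tokens split evenly between prompt and reply.
cost = message_cost(1_000, 1_000)
print(f"Estimated cost per message: ${cost:.5f}")
```

Under that split the cost is about $0.0007, which rounds to the $0.001 quoted above. A reply-heavy message costs more, because output tokens are priced at almost four times the input rate.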

What benchmarks has QwQ 32B been tested on?

QwQ 32B has been evaluated on 8 benchmarks. Top scores: Chatbot Arena Elo — Overall: 1335.7, IFEval: 39.8, Aider polyglot: 20.9.

How does QwQ 32B compare to Llama 3.2 1B Instruct?

QwQ 32B has an average score of 19.6 while Llama 3.2 1B Instruct scores 19.9. Llama 3.2 1B Instruct slightly outperforms QwQ 32B overall. QwQ 32B costs $0.15/1M input vs Llama 3.2 1B Instruct at $0.03/1M input.


QwQ 32B

Name: QwQ 32B
Price: 0.15 USD per 1M input tokens
Author: Alibaba Qwen

by Alibaba Qwen · Released Mar 2025

Open Source

Average score: 19.6
Rank: #246 (better than 10% of all models)
Context: 131K tokens (~66 books)
Input price: $0.15 per 1M tokens
Output price: $0.58 per 1M tokens
Type: text
License: Open Source
Benchmarks: 8 tested
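The "better than 10%" figure appears to follow from the rank and the 274-model total shown in the score distribution section of this page. The sketch below checks that arithmetic. The formula is an assumption about how the page derives the figure, and `percent_outranked` is a hypothetical helper.

```python
def percent_outranked(rank: int, total_models: int) -> float:
    """Share of models ranked below `rank`, as a percentage."""
    if not 1 <= rank <= total_models:
        raise ValueError("rank must lie between 1 and total_models")
    return 100 * (total_models - rank) / total_models


# Rank #246 out of the 274 models listed on the page.
share = percent_outranked(246, 274)
print(f"Better than {share:.0f}% of all models")
```

With 28 models ranked lower out of 274, the share is about 10.2%, which matches the rounded figure on the page.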


About

QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks,...

Tested on 8 benchmarks with 13.5% average. Top scores: Chatbot Arena Elo — Overall (1335.7 Elo), IFEval (39.8%), Aider polyglot (20.9%).

Looking for similar performance at lower cost?
Llama 3.2 1B Instruct scores 19.9 (102% as good) at $0.03/1M input · 82% cheaper
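The two comparison figures can be checked from the numbers shown on this page. This is a rough sketch using only the displayed average scores and input prices. How the page itself computes the figures is not stated, so the formulas are assumptions.

```python
# Figures as displayed on the page.
qwq_score, llama_score = 19.6, 19.9
qwq_input_price, llama_input_price = 0.15, 0.03  # USD per 1M input tokens

# Llama's average score relative to QwQ's, as a percentage.
relative_score = 100 * llama_score / qwq_score

# Saving on input tokens when switching from QwQ to Llama.
input_discount = 100 * (1 - llama_input_price / qwq_input_price)

print(f"Relative score: {relative_score:.0f}%")
print(f"Input price discount: {input_discount:.0f}%")
```

The relative score comes out near 101.5%, consistent with the rounded "102% as good". The input-only discount is 80%, slightly below the 82% shown. The page may be using unrounded or blended input and output prices, but that is a guess.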

Capabilities

coding: 20.9 (#147 globally)
reasoning: 11.1 (#154 globally)
math: 16.1 (#213 globally)
knowledge: 1.8 (#265 globally)
general: 2.9 (#73 globally)
language: 39.8 (#131 globally)

Benchmark Scores


Tested on 8 benchmarks · Ranked across 7 categories

Score Distribution (all 274 models): chart with a 0 to 100 score axis and a marker showing this model's position.

coding

Aider polyglot: 20.9
Multi-language code editing from Aider. Tests editing ability across Python, JavaScript, TypeScript, Java, C++, Go, Rust, and more.

reasoning

MuSR: 11.1
HuggingFace MuSR (Multi-Step Reasoning). Tests multi-hop reasoning requiring chaining multiple facts together.

math

MATH Level 5: 16.1
HuggingFace evaluation of MATH Level 5 problems. Competition math requiring advanced reasoning and proof construction.


Score legend: Excellent (85+), Good (70-85), Average (50-70), Below (<50)
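The legend can be read as a simple threshold lookup. The sketch below is one possible reading: the legend does not say which tier a boundary value such as 85 or 70 falls into, so treating each lower bound as inclusive is an assumption, and `score_tier` is a hypothetical helper.

```python
def score_tier(score: float) -> str:
    """Map a 0-100 benchmark score to the legend's tier label."""
    # Assumption: lower bounds are inclusive (85 counts as Excellent, and so on).
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Average"
    return "Below"


# QwQ 32B's average score from this page.
print(score_tier(19.6))
```

On this reading, the 19.6 average and every capability score listed on this page fall in the "Below" tier.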

Similar Models

Llama 3.2 1B Instruct

Frequently Asked Questions

QwQ 32B is an open-source text AI model by Alibaba Qwen, released in March 2025. It has an average benchmark score of 19.6. Context window: 131K tokens.

