Better than 43% of all models
Context: N/A
Input $/1M: TBD
Output $/1M: TBD
Type: text-generation
License: Proprietary
Benchmarks: 6 tested
About
Qwen text-generation model with about 7.84 million (7,844K) downloads on Hugging Face.
Tested on 6 benchmarks with 27.2% average. Top scores: IFEval (64.8%), MATH Level 5 (36.8%), BBH (HuggingFace) (25.8%).
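As a quick check of the arithmetic, the 27.2% average and the top-three list can be reproduced from the six scores shown on this page. This is a minimal sketch that assumes the average is a plain unweighted mean; the dictionary name `scores` is illustrative and not part of any BenchGecko API.

```python
from statistics import mean

# Scores as listed on this page (percent). The dict name is illustrative.
scores = {
    "IFEval": 64.8,
    "MATH Level 5": 36.8,
    "BBH": 25.8,
    "MMLU-PRO": 25.1,
    "MUSR": 7.6,
    "GPQA": 3.0,
}

# Assumption: the page's average is an unweighted mean, rounded to one decimal.
overall = round(mean(scores.values()), 1)

# Top three benchmarks by score, highest first.
top_three = sorted(scores, key=scores.get, reverse=True)[:3]

print(overall)
print(top_three)
```

Under that assumption the mean matches the 27.2% quoted above, which suggests no per-benchmark weighting is applied.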
Capabilities
reasoning: 7.6 (#149 globally)
math: 36.8 (#115 globally)
knowledge: 14.0 (#205 globally)
language: 64.8 (#79 globally)
general: 25.8 (#34 globally)
Benchmark Scores
Tested on 6 benchmarks · Ranked across 5 categories
Score Distribution (all 231 models, scale 0 to 100)
reasoning
MUSR
7.6: HuggingFace MuSR (Multi-Step Reasoning). Tests multi-hop reasoning requiring chaining multiple facts together.
math
MATH Level 5
36.8: HuggingFace evaluation of MATH Level 5 problems. Competition math requiring advanced reasoning and proof construction.
knowledge
MMLU-PRO
25.1: HuggingFace MMLU-Pro. Harder version of MMLU with 10 answer choices instead of 4 and more challenging questions.
GPQA
3.0: HuggingFace evaluation of GPQA (Graduate-Level Google-Proof Q&A). PhD-level science questions that cannot be easily searched.
Excellent (85+) Good (70-85) Average (50-70) Below (<50)
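The legend above can be read as a simple threshold lookup. The sketch below is one possible interpretation: the function name `rating_band` is hypothetical, and treating each lower bound as inclusive (so 70 counts as Good and 50 as Average) is an assumption, since the legend's ranges overlap at their edges.

```python
def rating_band(score: float) -> str:
    """Map a 0-100 benchmark score to the legend's band.

    Assumption: lower bounds are inclusive (85 -> Excellent, 70 -> Good,
    50 -> Average); the page does not say how edge values are handled.
    """
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Average"
    return "Below"


# Example with two scores listed on this page.
print(rating_band(64.8))  # IFEval
print(rating_band(36.8))  # MATH Level 5
```

With these thresholds, IFEval (64.8) lands in the Average band and the other five benchmarks fall in Below.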
BenchGecko API
qwen-qwen25-3b-instruct
Specifications
- Type: text-generation
- Context: N/A
- Released: Sep 2024
- License: Proprietary
- Status: Active
Frequently Asked Questions
Qwen2.5 3B Instruct is a proprietary text-generation AI model by Alibaba, released in September 2024. It averages 27.2% across the 6 benchmarks tested.