Better than 13% of all models
Context: N/A
Input $/1M: TBD
Output $/1M: TBD
Type: text-generation
License: Open Source
Benchmarks: 6 tested
About
A Qwen text-generation model with roughly 6.2 million (6,219K) downloads on Hugging Face.
Tested on 6 benchmarks with a 47.2% average. Top scores, all from OpenCompass: IFEval (82.4%), MMLU-Pro (63.0%), GPQA-Diamond (52.3%).
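The 47.2% figure is consistent with an unweighted mean of the six scores listed on this page. The sketch below reproduces that arithmetic. It assumes a simple mean; the site's actual aggregation method is not stated, and the dictionary name and short benchmark labels are illustrative.

```python
from statistics import mean

# Scores (percent) as listed on this page; labels are shortened for readability.
scores = {
    "IFEval": 82.4,
    "MMLU-Pro": 63.0,
    "GPQA-Diamond": 52.3,
    "AIME2025": 46.9,
    "LiveCodeBenchV6": 33.5,
    "HLE": 5.1,
}

# Assumption: the page average is an unweighted mean rounded to one decimal.
average = round(mean(scores.values()), 1)

# The three highest-scoring benchmarks, best first.
top_three = sorted(scores, key=scores.get, reverse=True)[:3]

print(average)
print(top_three)
```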
Capabilities
coding: 33.5 (#112 globally)
math: 46.9 (#84 globally)
knowledge: 40.1 (#148 globally)
language: 82.4 (#43 globally)
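Each capability score appears to be the unweighted mean of the benchmark scores in that category: for example, knowledge (40.1) matches the mean of MMLU-Pro, GPQA-Diamond, and HLE as listed under Benchmark Scores. The sketch below checks this under that assumption. Mapping IFEval to the language category is also an assumption, inferred only from the matching 82.4 value.

```python
from statistics import mean

# Benchmark scores grouped by category, as listed on this page.
# Assumption: IFEval is the language benchmark (both show 82.4).
benchmarks_by_category = {
    "coding": {"LiveCodeBenchV6": 33.5},
    "math": {"AIME2025": 46.9},
    "knowledge": {"MMLU-Pro": 63.0, "GPQA-Diamond": 52.3, "HLE": 5.1},
    "language": {"IFEval": 82.4},
}

# Assumption: a category score is the unweighted mean, rounded to one decimal.
category_scores = {
    category: round(mean(results.values()), 1)
    for category, results in benchmarks_by_category.items()
}

for category, score in category_scores.items():
    print(f"{category}: {score}")
```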
Benchmark Scores
Tested on 6 benchmarks · Ranked across 4 categories
[Figure: Score Distribution (all 231 models). Axis from 0 to 100, with this model's position marked.]
coding
OpenCompass LiveCodeBenchV6: 33.5
OpenCompass Live Code Bench v6. Fresh competitive programming problems to evaluate code generation without memorization.
math
OpenCompass AIME2025: 46.9
OpenCompass evaluation on AIME 2025 problems. Tests mathematical reasoning on fresh competition problems.
knowledge
OpenCompass MMLU-Pro: 63.0
OpenCompass MMLU-Pro evaluation. Harder knowledge test with more answer choices.
OpenCompass GPQA-Diamond: 52.3
OpenCompass evaluation of GPQA Diamond. PhD-level science questions from the hardest subset.
OpenCompass HLE: 5.1
OpenCompass evaluation of Humanity's Last Exam. Expert-level cross-discipline knowledge test.
Score bands: Excellent (85+), Good (70-85), Average (50-70), Below (<50)
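The bands above can be applied to any score with a small helper. This is a minimal sketch: the function name `score_band` is hypothetical, and because the legend lists 85 and 70 in two bands each, the code assumes a boundary value belongs to the higher band.

```python
def score_band(score: float) -> str:
    """Map a 0-100 benchmark score to the legend's band label."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    # Assumption: boundary values (85, 70, 50) fall into the higher band.
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Average"
    return "Below"


# Scores taken from this page.
for name, value in [("IFEval", 82.4), ("MMLU-Pro", 63.0), ("HLE", 5.1)]:
    print(f"{name}: {score_band(value)}")
```

Under these assumptions, IFEval (82.4) falls in Good, MMLU-Pro (63.0) and GPQA-Diamond (52.3) in Average, and the remaining three benchmarks in Below.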
BenchGecko API
qwen-qwen3-4b-instruct-2507
Specifications
- Type: text-generation
- Context: N/A
- Released: Aug 2025
- License: Open Source
- Status: Active
Frequently Asked Questions
Qwen3 4B Instruct 2507 is an open-source text-generation AI model by Alibaba, released in August 2025. It has an average benchmark score of 20.3.