Avg score: 24.7
Rank: #233 (better than 15% of all models)
Context: N/A
Input $/1M: TBD
Output $/1M: TBD
Type: text
License: Open Source
Benchmarks: 4 tested
Data updated today
About
Tested on 4 benchmarks with a 23.1% average (the unweighted mean of the four benchmark scores). Top scores: GSM8K (37.7%), MMLU (20.7%), Winogrande (19.6%).
Capabilities
math: 37.7 (#133 globally)
knowledge: 18.2 (#239 globally)
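The category figures and the 23.1% average can be checked against the four benchmark scores listed on this page. The sketch below assumes unweighted means and the grouping shown under Benchmark Scores (GSM8K under math, the other three under knowledge). The names `SCORES`, `category_means`, and `overall_mean` are hypothetical and are not part of any BenchGecko API.

```python
from statistics import mean

# Scores copied from this page. The grouping is an assumption based on
# the category headings under "Benchmark Scores".
SCORES = {
    "math": {"GSM8K": 37.7},
    "knowledge": {"MMLU": 20.7, "Winogrande": 19.6, "ARC AI2": 14.3},
}


def category_means(scores):
    """Unweighted mean of the benchmark scores in each category."""
    return {cat: round(mean(vals.values()), 1) for cat, vals in scores.items()}


def overall_mean(scores):
    """Unweighted mean over every benchmark, ignoring categories."""
    flat = [s for vals in scores.values() for s in vals.values()]
    return round(mean(flat), 1)


print(category_means(SCORES))  # {'math': 37.7, 'knowledge': 18.2}
print(overall_mean(SCORES))    # 23.1
```

Under these assumptions the math and knowledge figures and the 23.1% average are reproduced. The headline 24.7 average is not reproduced by either computation, and the page does not state how it is derived.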
Benchmark Scores
Tested on 4 benchmarks · Ranked across 2 categories
Score Distribution (all 274 models): chart on a 0 to 100 score axis, with this model's position marked.
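The "better than 15% of all models" figure is consistent with rank #233 out of the 274 models in the distribution. A minimal sketch, assuming the percentage is the share of models ranked below this one and that the rank and the model count refer to the same set; `percent_better_than` is a hypothetical helper, not something the page defines.

```python
def percent_better_than(rank: int, total: int) -> float:
    """Share of models ranked below `rank`, as a percentage of `total`."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100 * (total - rank) / total


# Rank #233 of 274 leaves 41 models ranked lower.
print(round(percent_better_than(233, 274)))  # 15
```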
math
GSM8K
37.7: Grade school math word problems. 8,500 problems testing multi-step arithmetic reasoning. A foundational math benchmark.
knowledge
MMLU
20.7: Massive Multitask Language Understanding. 57 subjects from STEM, humanities, and social sciences. The most widely-cited knowledge benchmark.
Winogrande
19.6: Commonsense coreference resolution. Tests understanding of pronoun references in ambiguous sentences.
ARC AI2
14.3: AI2 Reasoning Challenge. Grade-school science questions requiring multi-step reasoning. Easy and Challenge sets test different difficulty levels.
Excellent (85+) Good (70-85) Average (50-70) Below (<50)
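The legend can be read as a simple threshold mapping. The sketch below is one interpretation: the legend's ranges share their endpoints (70-85 and 85+), so it assumes each lower bound is inclusive, which the page does not confirm. `score_band` is a hypothetical name.

```python
def score_band(score: float) -> str:
    """Map a 0-100 score to the legend's bands (lower bounds inclusive, by assumption)."""
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Average"
    return "Below"


# All four benchmark scores on this page fall in the lowest band.
for name, score in [("GSM8K", 37.7), ("MMLU", 20.7), ("Winogrande", 19.6), ("ARC AI2", 14.3)]:
    print(name, score_band(score))
```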
Links
BenchGecko API
codeqwen1-5-7b
Specifications
- Type: text
- Context: N/A
- Released: Jan 2024
- License: Open Source
- Status: benchmark-only
Frequently Asked Questions
CodeQwen1.5-7B is an open-source text AI model by Alibaba Qwen, released in January 2024. It has an average benchmark score of 24.7.