How much does Qwen2.5-Coder-3B cost?

Qwen2.5-Coder-3B is open source and can be self-hosted. Hosted input and output pricing per 1M tokens is listed as TBD.

What benchmarks has Qwen2.5-Coder-3B been tested on?

Qwen2.5-Coder-3B has been evaluated on 4 benchmarks. Top scores: GSM8K (75.7), HellaSwag (61.2), ARC AI2 (37.2).

Is Qwen2.5-Coder-3B open source?

Yes, Qwen2.5-Coder-3B is open source.

How does Qwen2.5-Coder-3B compare to GLM 4.7?

Qwen2.5-Coder-3B has an average score of 54.6, while GLM 4.7 scores 54.9, so GLM 4.7 leads by 0.3 points overall.

Qwen2.5-Coder-3B

Name: Qwen2.5-Coder-3B
Author: Alibaba Qwen

by Alibaba Qwen · Released Jan 2024

Average score: 54.6
Rank: #114 (better than 58% of all models)
Context: N/A
Input $/1M: TBD
Output $/1M: TBD
Type: text
License: Open Source
Benchmarks: 4 tested

About

Tested on 4 benchmarks with a 52.2% average. Top scores: GSM8K (75.7%), HellaSwag (61.2%), ARC AI2 (37.2%).
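The 52.2% figure is consistent with an unweighted mean of the four benchmark scores listed on this page. The sketch below assumes that is how it is derived (the page does not state the method); the variable names are illustrative only.

```python
# Benchmark scores as listed on this page.
scores = {
    "GSM8K": 75.7,
    "HellaSwag": 61.2,
    "ARC AI2": 37.2,
    "Winogrande": 34.8,
}

# Assumption: the "average" is a plain unweighted mean, rounded to one decimal.
mean_score = round(sum(scores.values()) / len(scores), 1)
print(mean_score)
```

This calculation does not reproduce the separate 54.6 "avg score", which presumably uses a different weighting.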

Capabilities

math: 75.7 (#31 globally)
knowledge: 44.4 (#154 globally)

Benchmark Scores

Tested on 4 benchmarks · Ranked across 2 categories

Score Distribution (all 274 models): chart on a 0 to 100 score axis, with a marker at this model's position.
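The "better than 58% of all models" figure can be checked against rank #114 out of 274 models. The sketch below assumes the percentage is the share of models ranked below this one; the page does not document the formula.

```python
# Values taken from this page.
rank = 114
total_models = 274

# Assumption: percentile = share of models with a worse (higher-numbered) rank.
models_below = total_models - rank
pct_better = models_below / total_models * 100
print(round(pct_better))
```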

math

GSM8K: 75.7
Grade school math word problems. 8,500 problems testing multi-step arithmetic reasoning. A foundational math benchmark.

knowledge

HellaSwag: 61.2
Sentence completion requiring commonsense reasoning about physical and social situations. Tests real-world understanding.

ARC AI2: 37.2
AI2 Reasoning Challenge. Grade-school science questions requiring multi-step reasoning. Easy and Challenge sets test different difficulty levels.

Winogrande: 34.8
Commonsense coreference resolution. Tests understanding of pronoun references in ambiguous sentences.
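The category scores shown under Capabilities (math 75.7, knowledge 44.4) match per-category means of the benchmark scores above. A minimal sketch, assuming each category score is the unweighted mean of its benchmarks; the grouping follows the page, while the data structure is hypothetical.

```python
from statistics import mean

# Benchmarks grouped by the categories shown on this page.
by_category = {
    "math": {"GSM8K": 75.7},
    "knowledge": {"HellaSwag": 61.2, "ARC AI2": 37.2, "Winogrande": 34.8},
}

# Assumption: a category score is the unweighted mean of its benchmarks.
category_scores = {
    category: round(mean(benchmarks.values()), 1)
    for category, benchmarks in by_category.items()
}
print(category_scores)
```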

Score bands: Excellent (85+), Good (70-85), Average (50-70), Below (<50)
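The score bands can be applied to the four benchmark results as a quick check. This sketch assumes each band includes its lower bound (the legend leaves scores of exactly 70 or 85 ambiguous), and `score_band` is a hypothetical helper, not part of any BenchGecko API.

```python
def score_band(score: float) -> str:
    """Map a 0-100 score to a band; lower bounds are assumed inclusive."""
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Average"
    return "Below"


# Benchmark scores as listed on this page.
scores = {"GSM8K": 75.7, "HellaSwag": 61.2, "ARC AI2": 37.2, "Winogrande": 34.8}
bands = {name: score_band(value) for name, value in scores.items()}
print(bands)
```

Under these assumptions, GSM8K falls in Good, HellaSwag in Average, and ARC AI2 and Winogrande in Below.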

BenchGecko API: qwen2-5-coder-3b

Specifications

Type: text
Context: N/A
Released: Jan 2024
License: Open Source
Status: benchmark-only

Available On

Alibaba Qwen: TBD

Frequently Asked Questions

Qwen2.5-Coder-3B is an open-source text AI model by Alibaba Qwen, released in January 2024. It has an average benchmark score of 54.6.
