How much does Qwen2.5 Coder 1.5B Instruct cost?

Qwen2.5 Coder 1.5B Instruct is open source and can be self-hosted. Listed input and output prices per 1M tokens are TBD.

What benchmarks has Qwen2.5 Coder 1.5B Instruct been tested on?

Qwen2.5 Coder 1.5B Instruct has been evaluated on 6 benchmarks. Top scores: GSM8K: 65.8, HellaSwag: 49.1, MMLU: 38.1.

Is Qwen2.5 Coder 1.5B Instruct open source?

Yes, Qwen2.5 Coder 1.5B Instruct is open source.

How does Qwen2.5 Coder 1.5B Instruct compare to GPT-4o (2024-11-20)?

On this site's average score, Qwen2.5 Coder 1.5B Instruct scores 36.5 and GPT-4o (2024-11-20) scores 36.3, a 0.2-point lead for Qwen2.5 Coder 1.5B Instruct overall.


Qwen2.5 Coder 1.5B Instruct

Name: Qwen2.5 Coder 1.5B Instruct
Author: Alibaba

by Alibaba · Released Sep 2024

Open Source

Average score: 36.5

Rank #157


Better than 32% of all models

Context: N/A
Input ($/1M tokens): TBD
Output ($/1M tokens): TBD
Type: text-generation
License: Open Source
Benchmarks: 6 tested


About

A Qwen text-generation model with 855K downloads on Hugging Face.

Tested on 6 benchmarks with 38.8% average. Top scores: GSM8K (65.8%), HellaSwag (49.1%), MMLU (38.1%).

Capabilities

coding: 31.6 · #114 globally
math: 65.8 · #37 globally
knowledge: 33.9 · #164 globally

Benchmark Scores


Tested on 6 benchmarks · Ranked across 3 categories

Score Distribution (all 231 models): chart with a score axis running from 0 to 100 and a marker at this model's position.
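The rank and percentile figures reported on this page are consistent with a simple calculation, assuming "better than" means the share of the 231 listed models that rank below #157. The sketch below illustrates that assumption; it is not the site's documented method, and `percent_better_than` is a hypothetical helper name.

```python
def percent_better_than(rank: int, total_models: int) -> float:
    """Share of models ranked below `rank`, as a percentage.

    Assumes rank 1 is the best model and that ranks have no ties.
    """
    if not 1 <= rank <= total_models:
        raise ValueError("rank must be between 1 and total_models")
    return 100.0 * (total_models - rank) / total_models


# Figures taken from the page: rank #157 out of 231 models.
print(round(percent_better_than(157, 231)))  # 32
```

Under this reading, 74 of the 231 models rank lower, which rounds to the 32% shown on the page.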

coding

Aider — Code Editing

Code editing benchmark from the Aider project. Measures ability to apply targeted code changes while maintaining correctness and style.

Score: 31.6

math

GSM8K

Grade school math word problems. 8,500 problems testing multi-step arithmetic reasoning. A foundational math benchmark.

Score: 65.8

knowledge

HellaSwag

Sentence completion requiring commonsense reasoning about physical and social situations. Tests real-world understanding.

Score: 49.1

MMLU

Massive Multitask Language Understanding. 57 subjects from STEM, humanities, and social sciences. The most widely cited knowledge benchmark.

Score: 38.1

ARC AI2

AI2 Reasoning Challenge. Grade-school science questions requiring multi-step reasoning. Easy and Challenge sets test different difficulty levels.

Score: 26.9
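The page reports six benchmarks but lists scores for only five. As a rough cross-check, the sketch below averages the five listed scores and works out what a sixth score would have to be for the 38.8% average quoted in the About section to be an unweighted mean. That the site uses an unweighted mean is an assumption, and the implied sixth value is an inference from the arithmetic, not a reported result.

```python
from statistics import mean

# Scores as listed on the page (five of the six benchmarks).
listed_scores = {
    "Aider (code editing)": 31.6,
    "GSM8K": 65.8,
    "HellaSwag": 49.1,
    "MMLU": 38.1,
    "ARC (AI2)": 26.9,
}

mean_of_listed = mean(listed_scores.values())

# Assumption: the 38.8 average is an unweighted mean over six benchmarks.
reported_average = 38.8
benchmark_count = 6
implied_sixth = reported_average * benchmark_count - sum(listed_scores.values())

print(f"mean of listed scores: {mean_of_listed:.1f}")
print(f"implied sixth score:   {implied_sixth:.1f}")
```

The five listed scores average 42.3, so an unweighted six-benchmark mean of 38.8 would require an unlisted score of roughly 21.3.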


Score bands: Excellent (85+) · Good (70-85) · Average (50-70) · Below (<50)
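The legend above can be read as a simple threshold lookup. The sketch below is one way to express it, assuming each band includes its lower bound (so 85 counts as Excellent and 70 as Good). The legend does not say how boundary values are handled, and `score_band` is a hypothetical name.

```python
def score_band(score: float) -> str:
    """Map a 0-100 benchmark score to the legend's band names.

    Assumption: lower bounds are inclusive (85 -> Excellent, 70 -> Good).
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Average"
    return "Below"


# Scores from this page: GSM8K and Aider code editing.
print(score_band(65.8))  # Average
print(score_band(31.6))  # Below
```

By this reading, GSM8K is the only listed benchmark where the model reaches the Average band; the others fall in Below.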

Model Family · Alibaba Qwen 2.5

Qwen2.5 0.5B Instruct · Sep 2024 · 10.1 · N/A · N/A ctx · 6 benchmarks
Qwen2.5 1.5B Instruct · Sep 2024 · 18.4 (+8.3) · N/A · N/A ctx · 6 benchmarks
Qwen2.5 1.5B Instruct AWQ · Sep 2024 · N/A · N/A ctx
Qwen2.5 1.5B Instruct GGUF · Sep 2024 · N/A · N/A ctx
Qwen2.5 14B Instruct · Sep 2024 · 41.6 (+41.6) · N/A · N/A ctx · 6 benchmarks
Qwen2.5 14B Instruct AWQ · Sep 2024 · N/A · N/A ctx
Qwen2.5 32B Instruct · Sep 2024 · 43.2 (+43.2) · N/A · N/A ctx · 7 benchmarks
Qwen2.5 32B Instruct AWQ · Sep 2024 · N/A · N/A ctx
Qwen2.5 32B Instruct GPTQ Int4 · Sep 2024 · N/A · N/A ctx
Qwen2.5 3B Instruct · Sep 2024 · 27.2 (+27.2) · N/A · N/A ctx · 6 benchmarks
Qwen2.5 3B Instruct GGUF · Sep 2024 · N/A · N/A ctx
Qwen2.5 72B Instruct AWQ · Sep 2024 · N/A · N/A ctx
Qwen2.5 7B Instruct AWQ · Sep 2024 · N/A · N/A ctx
Qwen2.5 Coder 0.5B Instruct · Nov 2024 · 14.3 (+14.3) · N/A · N/A ctx · 1 benchmark
Qwen2.5 Coder 1.5B Instruct · Sep 2024 · 38.8 (+24.5) · N/A · N/A ctx · 6 benchmarks
Qwen2.5 Coder 14B Instruct · Nov 2024 · 37.4 (-1.4) · N/A · N/A ctx · 7 benchmarks
Qwen2.5 Coder 32B Instruct AWQ · Nov 2024 · N/A · N/A ctx
Qwen2.5 Coder 7B Instruct AWQ · Sep 2024 · N/A · N/A ctx
Qwen2.5 Coder 7B Instruct GPTQ Int4 · Sep 2024 · N/A · N/A ctx
Qwen2.5 Math 1.5B · Sep 2024 · N/A · N/A ctx
Qwen2.5 VL 3B Instruct · Jan 2025 · N/A · N/A ctx
Qwen2.5 VL 7B Instruct · Jan 2025 · N/A · N/A ctx
Qwen2.5 VL 7B Instruct AWQ · Feb 2025 · N/A · N/A ctx


Similar Models

Llama 3.2 3B Instruct

Frequently Asked Questions

Qwen2.5 Coder 1.5B Instruct is an open-source text-generation AI model by Alibaba, released in September 2024. It has an average benchmark score of 36.5.
