How much does StarCoder 2 15B cost?

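No hosted pricing is listed for StarCoder 2 15B: input and output prices per 1M tokens are both shown as TBD. The model is open source and can be self-hosted.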

What benchmarks has StarCoder 2 15B been tested on?

StarCoder 2 15B has been evaluated on 10 benchmarks. Top scores: GSM8K: 57.7, MMLU: 52.1, ARC AI2: 29.6.

Is StarCoder 2 15B open source?

Yes, StarCoder 2 15B is open source.

How does StarCoder 2 15B compare to Meta Llama 3 8B Instruct?

StarCoder 2 15B and Meta Llama 3 8B Instruct both show an average score of 30.9 at the displayed precision. The listing places Meta Llama 3 8B Instruct slightly ahead overall, which implies a gap smaller than the rounding shown.

StarCoder 2 15B

by Unknown · Released Jan 2024

Open Source

30.9

avg score

Rank #214

Better than 22% of all models

Context: N/A
Input $/1M: TBD
Output $/1M: TBD
Type: text
License: Open Source
Benchmarks: 10 tested

About

Tested on 10 benchmarks with 24.3% average. Top scores: GSM8K (57.7%), MMLU (52.1%), ARC AI2 (29.6%).

Capabilities

reasoning: 2.9 (#201 globally)
math: 31.8 (#150 globally)
knowledge: 25.7 (#221 globally)
language: 27.8 (#146 globally)
general: 20.4 (#44 globally)

Benchmark Scores

Tested on 10 benchmarks · Ranked across 5 categories

[Figure: score distribution across all 274 models on a 0 to 100 axis, with this model's position marked.]

reasoning

MUSR

HuggingFace MuSR (Multi-Step Reasoning). Tests multi-hop reasoning requiring chaining multiple facts together.

2.9

math

GSM8K

Grade school math word problems. 8,500 problems testing multi-step arithmetic reasoning. A foundational math benchmark.

57.7

MATH Level 5

HuggingFace evaluation of MATH Level 5 problems. Competition math requiring advanced reasoning and proof construction.

6.0
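
As a rough consistency check, the math category score of 31.8 shown under Capabilities is close to the plain mean of the two math benchmarks listed here. The sketch below assumes the category score is an unweighted mean of its benchmarks; the page does not state how categories are aggregated, and the variable names are illustrative.

```python
from statistics import mean

# Math benchmark scores as listed on this page.
math_scores = {"GSM8K": 57.7, "MATH Level 5": 6.0}

# Assumption: a category score is the unweighted mean of its benchmarks.
# The page does not document its aggregation method.
math_category = mean(math_scores.values())

print(f"math category mean: {math_category:.2f}")
```

The mean comes out near 31.85, which agrees with the displayed 31.8 up to rounding or truncation. The same check cannot be repeated for every category from this page alone, because only six of the ten benchmark scores are listed.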

knowledge

MMLU

Massive Multitask Language Understanding. 57 subjects from STEM, humanities, and social sciences. The most widely-cited knowledge benchmark.

52.1

ARC AI2

AI2 Reasoning Challenge. Grade-school science questions requiring multi-step reasoning. Easy and Challenge sets test different difficulty levels.

29.6

Winogrande

Commonsense coreference resolution. Tests understanding of pronoun references in ambiguous sentences.

28.6

Score bands: Excellent (85+), Good (70-85), Average (50-70), Below (<50)
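
The score bands in the legend above can be applied to the listed benchmark scores with a small helper. This is a minimal sketch: `score_band` is a hypothetical function name, and because the legend leaves the shared boundaries (70 and 85) ambiguous, the code assumes a boundary score falls into the higher band.

```python
def score_band(score: float) -> str:
    """Map a 0-100 benchmark score to a band name from the legend."""
    # Assumption: a score exactly on a shared boundary (70 or 85)
    # is assigned to the higher band; the legend does not say.
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Average"
    return "Below"


# Benchmark scores as listed on this page.
scores = {
    "MUSR": 2.9,
    "GSM8K": 57.7,
    "MATH Level 5": 6.0,
    "MMLU": 52.1,
    "ARC AI2": 29.6,
    "Winogrande": 28.6,
}

bands = {name: score_band(value) for name, value in scores.items()}

for name, band in bands.items():
    print(f"{name}: {scores[name]} ({band})")
```

Under these assumptions, GSM8K and MMLU fall in the Average band and the other four listed benchmarks fall in Below.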

Similar Models

Meta Llama 3 8B Instruct

Frequently Asked Questions

StarCoder 2 15B is an open-source text AI model by Unknown, released in January 2024. It has an average benchmark score of 30.9.

Benchmarks

GSM8K, MMLU, ARC AI2, Winogrande, IFEval
