
Llama 3.2 3B Instruct

by Meta · Released Sep 2024

Open Source
35.9
avg score
Rank #162
Better than 30% of all models
Context
80K tokens (~60,000 words)
Input $/1M
$0.05
Output $/1M
$0.34
Type
text
License
Open Source
Benchmarks
7 tested
About

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. Designed with the latest transformer architecture, it...

Tested on 7 benchmarks with a 24.2% average. Top scores: Chatbot Arena Elo (Overall): 1165.7, IFEval: 73.9%, MMLU-PRO: 24.4%.

Looking for similar performance at lower cost?
Gemma 3 27B (free) scores 35.0 (97% as good) at $0.00/1M input · 100% cheaper
Capabilities
reasoning
1.4
#180 globally
math
17.7
#173 globally
knowledge
14.1
#206 globally
language
73.9
#61 globally
general
24.1
#36 globally
Benchmark Scores
Tested on 7 benchmarks · Ranked across 6 categories
Score Distribution (all 233 models): histogram on a 0–100 scale, with this model's position marked (chart not shown).
MUSR

HuggingFace MuSR (Multi-Step Reasoning). Tests multi-hop reasoning requiring chaining multiple facts together.

1.4
MATH Level 5

HuggingFace evaluation of MATH Level 5 problems. Competition math requiring advanced reasoning and proof construction.

17.7
MMLU-PRO

HuggingFace MMLU-Pro. Harder version of MMLU with 10 answer choices instead of 4 and more challenging questions.

24.4
GPQA

HuggingFace evaluation of GPQA (Graduate-Level Google-Proof Q&A). PhD-level science questions that cannot be easily searched.

3.8
Legend: Excellent (85+) · Good (70–85) · Average (50–70) · Below (<50)
Links
Documentation
Community
BenchGecko API
llama-3-2-3b-instruct
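The API identifier above looks like a slug derived from the display name "Llama 3.2 3B Instruct". As an illustration only (the site's actual slug scheme is an assumption), a minimal derivation that reproduces the listed identifier:

```python
import re

def to_slug(name: str) -> str:
    # Hypothetical slug rule: lowercase, collapse any run of
    # non-alphanumeric characters (spaces, dots) into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

print(to_slug("Llama 3.2 3B Instruct"))  # -> llama-3-2-3b-instruct
```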
Specifications
  • Type: text
  • Context: 80K tokens (~60,000 words)
  • Released: Sep 2024
  • License: Open Source
  • Status: Active
  • Cost / Message: ~$0.000
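The "~$0.000" per-message figure follows from the listed per-million-token rates ($0.05 input, $0.34 output). A minimal sketch of that arithmetic, assuming a typical chat turn of roughly 1,000 input and 500 output tokens (the message sizes are assumptions, not site data):

```python
INPUT_PER_M = 0.05   # $ per 1M input tokens (from the listing)
OUTPUT_PER_M = 0.34  # $ per 1M output tokens (from the listing)

def message_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PER_M

cost = message_cost(1_000, 500)
print(f"${cost:.5f}")  # $0.00022, which rounds to ~$0.000
```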
Available On
Meta ($0.05/1M input tokens)
Share & Export
Llama 3.2 3B Instruct is an open-source text AI model by Meta, released in September 2024. It has an average benchmark score of 35.9. Context window: 80K tokens.