
Llama 3.3 70B Instruct

by Meta · Released Dec 2024

Open Source · 75.9 avg score · Rank #30
Better than 87% of all models
Context: 131K tokens (~66 books)
Input $/1M: $0.10
Output $/1M: $0.32
Type: text
License: Open Source
Benchmarks: 8 tested
Data updated today
About

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model in 70B (text in/text out). The Llama 3.3 instruction-tuned text-only model...

Tested on 8 benchmarks with a 46.9% average. Top scores: Chatbot Arena Elo — Overall (1318.0), IFEval (90.0%), Aider — Code Editing (59.4%).

Looking for similar performance at lower cost?
gpt-oss-120b (free) scores 74.2 (98% as good) at $0.00/1M input · 100% cheaper
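The comparison figures above follow from simple arithmetic on the numbers listed on this page; a minimal sketch:

```python
# Relative performance and input-cost savings, computed from the
# scores and prices shown on this page.
llama_score = 75.9        # Llama 3.3 70B Instruct, avg benchmark score
alt_score = 74.2          # gpt-oss-120b, avg benchmark score
llama_input_price = 0.10  # $ per 1M input tokens
alt_input_price = 0.00    # free

relative = alt_score / llama_score
print(f"{relative:.0%} as good")          # -> 98% as good

savings = 1 - alt_input_price / llama_input_price
print(f"{savings:.0%} cheaper on input")  # -> 100% cheaper on input
```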
Capabilities
  • coding: 59.4 (#42 globally)
  • reasoning: 15.6 (#110 globally)
  • math: 48.3 (#79 globally)
  • knowledge: 29.3 (#177 globally)
  • language: 90.0 (#13 globally)
  • general: 56.6 (#3 globally)
Benchmark Scores
Tested on 8 benchmarks · Ranked across 7 categories
Score Distribution (all 233 models): [histogram over the 0–100 score range, with a marker at this model's position]
Aider — Code Editing: 59.4

Code editing benchmark from the Aider project. Measures ability to apply targeted code changes while maintaining correctness and style.

MUSR: 15.6

HuggingFace MuSR (Multi-Step Reasoning). Tests multi-hop reasoning requiring chaining multiple facts together.

MATH Level 5: 48.3

HuggingFace evaluation of MATH Level 5 problems. Competition math requiring advanced reasoning and proof construction.
Links
Documentation
Community
BenchGecko API
llama-3-3-70b-instruct
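The slug above identifies this model in the BenchGecko API. A minimal sketch of fetching the model record over HTTP; the endpoint URL and response field names are assumptions for illustration, not documented API details:

```python
# Fetch this model's record by slug from the BenchGecko API.
# NOTE: the base URL and the "avg_score" field are hypothetical;
# check the API documentation for the real endpoint and schema.
import json
import urllib.request

SLUG = "llama-3-3-70b-instruct"
URL = f"https://api.benchgecko.example/v1/models/{SLUG}"  # hypothetical endpoint

def fetch_model(url: str) -> dict:
    """Download and decode one model record as JSON."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# Example usage (requires network access):
# model = fetch_model(URL)
# print(model["avg_score"])
```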
Specifications
  • Type: text
  • Context: 131K tokens (~66 books)
  • Released: Dec 2024
  • License: Open Source
  • Status: Active
  • Cost / Message: ~$0.001
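The ~$0.001 cost-per-message figure can be sanity-checked from the listed per-token prices. The token counts below are an assumption about a typical chat turn, not data from this page:

```python
# Rough cost-per-message estimate from the listed prices:
# $0.10 per 1M input tokens, $0.32 per 1M output tokens.
INPUT_PRICE = 0.10 / 1_000_000   # $ per input token
OUTPUT_PRICE = 0.32 / 1_000_000  # $ per output token

def message_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request/response pair."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# An assumed chat turn of ~3K prompt tokens and ~2K completion tokens
# lands near the listed ~$0.001 per message:
cost = message_cost(3000, 2000)
print(f"${cost:.4f}")  # -> $0.0009
```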
Available On
  • Meta: $0.10 / 1M input tokens