How much does Llama 3.3 70B Instruct cost?

Llama 3.3 70B Instruct costs $0.10 per million input tokens and $0.32 per million output tokens. For a typical conversation (~2,000 tokens), that's approximately $0.001 per message.
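The per-message figure above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the page does not say how the ~2,000 tokens divide between input and output, so both splits shown are assumptions, and `message_cost` is a hypothetical helper rather than part of any published API.

```python
# Listed prices, in USD per one million tokens.
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.32


def message_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed rates."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Assumed splits of a ~2,000-token message (not stated on the page).
even_split = message_cost(1_000, 1_000)  # half input, half output
all_output = message_cost(0, 2_000)      # worst case: every token billed as output

print(f"even split: ${even_split:.5f}")
print(f"all output: ${all_output:.5f}")
```

Under these assumptions the cost falls between roughly $0.0004 and $0.0006 per message, so the quoted ~$0.001 reads as a rounded, order-of-magnitude figure.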

What benchmarks has Llama 3.3 70B Instruct been tested on?

Llama 3.3 70B Instruct has been evaluated on 8 benchmarks. Top scores: Chatbot Arena Elo — Overall: 1318.0, IFEval: 90.0, Aider — Code Editing: 59.4.

Is Llama 3.3 70B Instruct open source?

Yes, Llama 3.3 70B Instruct is listed as open source: Meta publishes the model weights under the Llama 3.3 Community License.

How does Llama 3.3 70B Instruct compare to GPT-5.2 Pro?

Llama 3.3 70B Instruct has an average score of 75.9, while GPT-5.2 Pro scores 76.2, so GPT-5.2 Pro slightly outperforms it overall. On price, Llama 3.3 70B Instruct costs $0.10 per 1M input tokens versus $21.00 per 1M input tokens for GPT-5.2 Pro.
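To put the two gaps side by side, the comparison can be reduced to a score difference and a price ratio. This is a minimal sketch using only the figures quoted above; the `compare` function and the dictionary keys are hypothetical names chosen for illustration.

```python
# Figures quoted in the comparison above.
llama = {"avg_score": 75.9, "input_price_per_m": 0.10}
gpt = {"avg_score": 76.2, "input_price_per_m": 21.00}


def compare(cheaper: dict, pricier: dict) -> tuple[float, float]:
    """Return (score gap in points, input price ratio) between two models."""
    score_gap = pricier["avg_score"] - cheaper["avg_score"]
    price_ratio = pricier["input_price_per_m"] / cheaper["input_price_per_m"]
    return score_gap, price_ratio


score_gap, price_ratio = compare(llama, gpt)
print(f"score gap: {score_gap:.1f} points")
print(f"input price ratio: {price_ratio:.0f}x")
```

On these numbers the average-score gap is about 0.3 points, while the input price differs by a factor of about 210.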

Llama 3.3 70B Instruct

Name: Llama 3.3 70B Instruct
Price: 0.10 USD per 1M input tokens
Author: Meta

by Meta · Released Dec 2024

Open Source

Average score: 75.9
Rank: #30
Better than 87% of all models
Context: 131K tokens (~66 books)
Input $/1M: $0.10
Output $/1M: $0.32
Type: text
License: Open Source
Benchmarks: 8 tested

About

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model...

Tested on 8 benchmarks, with a 46.9% average. Top scores: Chatbot Arena Elo — Overall (1318.0, an Elo rating rather than a percentage), IFEval (90.0%), Aider — Code Editing (59.4%).

Looking for similar performance at lower cost?
gpt-oss-120b (free) scores 74.2 (98% as good) at $0.00/1M input · 100% cheaper

Capabilities

coding: 59.4 (#42 globally)
reasoning: 15.6 (#110 globally)
math: 48.3 (#79 globally)
knowledge: 29.3 (#177 globally)
language: 90.0 (#13 globally)
general: 56.6 (#3 globally)

Benchmark Scores

Tested on 8 benchmarks · Ranked across 7 categories

[Figure: score distribution across all 233 models on a 0-100 axis, with this model's position marked.]

coding

Aider — Code Editing: 59.4
Code editing benchmark from the Aider project. Measures the ability to apply targeted code changes while maintaining correctness and style.

reasoning

MuSR: 15.6
HuggingFace MuSR (Multistep Soft Reasoning). Tests multi-hop reasoning that requires chaining multiple facts together.

math

MATH Level 5: 48.3
HuggingFace evaluation of MATH Level 5 problems. Competition math requiring advanced reasoning and proof construction.

Score bands: Excellent (85+) · Good (70-85) · Average (50-70) · Below (<50)
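The band legend above can be expressed as a small lookup. The legend's ranges overlap at 70 and 85, so the sketch below assumes each band includes its lower bound (85 counts as Excellent, 70 as Good, 50 as Average). That boundary rule and the `score_band` helper are assumptions for illustration, not something the page specifies.

```python
def score_band(score: float) -> str:
    """Map a 0-100 benchmark score to a band, assuming inclusive lower bounds."""
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Average"
    return "Below"


# Capability scores listed on this page.
capabilities = {
    "coding": 59.4,
    "reasoning": 15.6,
    "math": 48.3,
    "knowledge": 29.3,
    "language": 90.0,
    "general": 56.6,
}

for name, score in capabilities.items():
    print(f"{name}: {score} -> {score_band(score)}")
```

Under this reading, language falls in Excellent, coding and general in Average, and reasoning, math, and knowledge in Below.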

Model Family · Meta Llama 3.3

Llama 3.3 70B Instruct · Dec 2024
Average: 46.9
$0.10/M input · 131K context · 8 benchmarks

Llama 3.3 70B Instruct (free) · Dec 2024
Average: 29.1 (-17.8)
$0.00/M input (-$0.10) · 66K context (-66K) · 8 benchmarks


BenchGecko API: llama-3-3-70b-instruct

Specifications

Type: text
Context: 131K tokens (~66 books)
Released: Dec 2024
License: Open Source
Status: Active
Cost / Message: ~$0.001

Available On

Meta: $0.10

Frequently Asked Questions

Llama 3.3 70B Instruct is an open-source text AI model by Meta, released in December 2024. It has an average benchmark score of 75.9. Context window: 131K tokens.

Benchmarks

Chatbot Arena Elo — Overall · IFEval · Aider — Code Editing · BBH (HuggingFace) · MATH Level 5
