Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input.
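The "17B active of 109B total" figure comes from MoE routing: a gating network selects a small subset of expert weights per token, so only a fraction of the model runs on any forward pass. A minimal sketch of the idea, assuming simple top-1 routing over 16 experts (illustrative only, not Meta's actual implementation):

```python
# Toy mixture-of-experts layer: a gate picks one expert per token, so
# most expert parameters stay inactive on each forward pass.
# Sizes and top-1 routing are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts = 8, 16          # toy width; 16 experts as in Scout (16E)
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
gate = rng.standard_normal((d_model, n_experts))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its highest-scoring expert (top-1 gating)."""
    scores = x @ gate                   # (tokens, n_experts) routing scores
    chosen = scores.argmax(axis=-1)     # index of the winning expert per token
    out = np.empty_like(x)
    for i, e in enumerate(chosen):
        out[i] = x[i] @ experts[e]      # only one expert's weights run per token
    return out

tokens = rng.standard_normal((4, d_model))
y = moe_layer(tokens)
print(y.shape)  # (4, 8)
```

With top-1 gating, each token touches one of the 16 expert matrices, which is why active parameters (17B) are far below total parameters (109B).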
Tested on 11 benchmarks with an 18.9% average score. Top scores: MATH Level 5 (62.3%), Fiction.LiveBench (36.0%), GPQA Diamond (35.8%).
Llama 3.2 3B Instruct (free) scores 14.7 (96% as good) at $0.00/1M input tokens (100% cheaper).
SWE-bench Verified solved using only bash commands, no specialized frameworks. Tests raw terminal-based problem solving.
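A bash-only harness of this kind reduces to a loop that executes model-proposed shell commands and feeds back their output. A hypothetical sketch of the execution step (the helper name and structure are assumptions, not the benchmark's actual harness):

```python
# Hypothetical bash-only evaluation step: run one shell command the model
# proposed and capture everything it printed, with a timeout as a guard.
import subprocess

def run_bash(command: str, timeout: int = 60) -> str:
    """Execute a single bash command and return its combined stdout/stderr."""
    result = subprocess.run(
        ["bash", "-c", command],
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout + result.stderr

# Example: the model might first inspect the repository before editing.
print(run_bash("echo inspecting repo"))
```

The harness would call `run_bash` once per model turn, appending the output to the conversation, until the model declares the task done or a step budget runs out.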
Abstraction and Reasoning Corpus. Tests fluid intelligence through novel visual pattern recognition puzzles. Core measure of general intelligence.
ARC-AGI 2, the harder sequel to ARC. More complex abstract reasoning patterns that test generalization ability beyond training data.
MATH Level 5: competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.
Mock AIME (American Invitational Mathematics Exam) problems from OTIS. Tests mathematical competition performance.
Original research-level math problems created by professional mathematicians. Problems are unpublished and cannot be memorized.
- Type: multimodal
- Context: 328K tokens (~164 books)
- Released: Apr 2025
- License: Open Source
- Status: Active
- Cost / Message: ~$0.000