
Llama

Meta's open-weight model family · the most-deployed open LLM on earth, with 1B+ downloads.


Level 1

Llama (Large Language Model Meta AI) has powered the open-weight AI ecosystem since 2023. Versions: Llama 1 (Feb 2023), Llama 2 (Jul 2023), Llama 3 (Apr 2024), Llama 3.1 (Jul 2024), Llama 4 (Apr 2025). Meta ships the weights under a custom license (commercial use allowed with some restrictions). The open release has made Llama the foundation layer for thousands of downstream models and products.

Level 2

Llama architecture: decoder-only transformer with RoPE positional encoding, SwiGLU activations, and RMSNorm. Modern variants (Llama 3 and later) use Grouped-Query Attention (GQA) and a 128K-token vocabulary. Llama 4 moves to a mixture-of-experts (MoE) architecture, with a 10M-token context in its Scout variant (experimental). Meta trains on its own infrastructure · 24K-GPU H100 clusters for Llama 3, 70K+ GPUs for Llama 4. Tier structure: 8B (mobile/edge), 70B (production workhorse), 405B+ (frontier). The ecosystem includes Llama Guard (safety), Code Llama (code-specialized), and numerous third-party fine-tunes (Dolphin, Hermes, OpenChat, etc.).
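The building blocks named above are small enough to sketch directly. A minimal NumPy version of RMSNorm and a SwiGLU feed-forward · shapes and names are illustrative, not Meta's actual code:

```python
import numpy as np

def rms_norm(x, gain, eps=1e-6):
    # RMSNorm: rescale by root-mean-square only (no mean-centering,
    # unlike LayerNorm), then apply a learned per-channel gain.
    rms = np.sqrt(np.mean(x**2, axis=-1, keepdims=True) + eps)
    return x / rms * gain

def swiglu(x, w_gate, w_up, w_down):
    # SwiGLU feed-forward: a SiLU-gated linear unit with three
    # projections (gate, up, down) instead of the classic two.
    gate = x @ w_gate
    silu = gate / (1.0 + np.exp(-gate))  # SiLU (swish) activation
    return (silu * (x @ w_up)) @ w_down
```

With the gain set to ones, the output of `rms_norm` has unit RMS per token, which is the whole trick: cheaper than LayerNorm, stable at scale.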

Level 3

Llama 3 architecture details (publicly disclosed): 8B has 32 layers, 4096 hidden dim, 32 attention heads, 8 KV heads (GQA), a 128K vocab with a tiktoken-style tokenizer, and 8K context (extended to 128K via RoPE scaling in 3.1). 70B has 80 layers, 8192 hidden. 405B has 126 layers, 16,384 hidden. Training: 15T tokens for Llama 3/3.1, an estimated 30T+ for Llama 4. Post-training: a blend of SFT, DPO, and RLHF, with iterative quality filtering. License: the "Llama Community License" · commercial use allowed, with special terms for services above 700M monthly active users and limits on using model outputs to improve other models. The open release has had outsized impact on research and deployment.
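Those disclosed numbers roughly add up to the advertised model size. A back-of-envelope parameter count for the 8B config · FFN width 14336 and head dim 128 come from the published config; norm parameters are omitted:

```python
# Rough parameter count for Llama 3 8B from the disclosed config.
VOCAB, HIDDEN, LAYERS = 128_256, 4096, 32
HEADS, KV_HEADS, HEAD_DIM = 32, 8, 128
FFN = 14_336  # SwiGLU intermediate width (published config)

attn = HIDDEN * HEADS * HEAD_DIM           # Q projection
attn += 2 * HIDDEN * KV_HEADS * HEAD_DIM   # K and V (GQA: only 8 KV heads)
attn += HEADS * HEAD_DIM * HIDDEN          # output projection
ffn = 3 * HIDDEN * FFN                     # gate, up, down projections (SwiGLU)
total = LAYERS * (attn + ffn) + 2 * VOCAB * HIDDEN  # + input/output embeddings
print(f"{total/1e9:.2f}B parameters")  # prints "8.03B parameters"
```

Note the GQA term: K and V projections are 4x smaller than Q (8 KV heads vs 32 query heads), which is also what shrinks the KV cache at inference time.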

Why this matters now

Llama 4's 10M context and MoE architecture closed most of the gap to closed frontier models · open-weight deployment accelerated in 2026.

The takeaway for you
If you are a
Researcher
  • RoPE, SwiGLU, RMSNorm, GQA · the open-source transformer stack
  • Training scale 15-30T tokens · an estimated 5-10% of GPT-5-class compute
  • Meta's release cadence drives open-source research tempo
If you are a
Builder
  • Llama 3.1 70B for self-hosted production · Llama 4 MoE if you can afford the VRAM
  • Quantized variants (GGUF, AWQ) run on consumer hardware
  • Fine-tune via LoRA · a massive community of ready-made adapters
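A sketch of why LoRA fine-tunes are cheap: only a low-rank update to each frozen weight matrix is trained, and it can be merged back at serve time. A NumPy toy with illustrative dimensions (not any particular library's API):

```python
import numpy as np

# LoRA: train a low-rank update B @ A beside a frozen weight W.
# At inference the update merges into W, so serving cost is unchanged.
d_out, d_in, r, alpha = 4096, 4096, 8, 16
rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in)).astype(np.float32)     # frozen base weight
A = rng.standard_normal((r, d_in)).astype(np.float32) * 0.01  # trainable, small init
B = np.zeros((d_out, r), dtype=np.float32)                    # trainable, zero init

W_merged = W + (alpha / r) * (B @ A)  # zero-init B => adapter starts as a no-op

trainable = A.size + B.size
ratio = trainable / W.size
print(f"trainable fraction: {ratio:.4%}")  # rank 8 on a 4096x4096 layer
```

At rank 8 on a 4096x4096 layer, the adapter is under half a percent of the layer's parameters · which is why a single consumer GPU can fine-tune a model it could never full-finetune.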
If you are a
Investor
  • Llama's open release commoditizes the base-model layer
  • Upside: Meta retains research talent and accelerates internal AI product launches
  • Downside: reduces closed-model API pricing power long-term
If you are a
Curious · Normie
  • AI from Meta (Facebook) · you can download and run it yourself
  • Powers open-source AI apps all over the internet
  • Free to use, with some conditions
Gecko's take

Llama is the reason open-source AI isn't stuck at 2023-level quality. Meta's bet pays off every time a researcher ships on top of it.

Can you use it commercially? Yes, with conditions. The Llama Community License allows commercial use for most applications, with special terms for services exceeding 700M monthly active users.