Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. Designed with the latest transformer architecture, it...
Tested on 7 benchmarks, averaging 24.2%. Top scores: Chatbot Arena Elo (Overall) 1166.1, IFEval 73.9%, MMLU-Pro 24.4%.
Llama 3.1 8B Instruct scores 35.2 (98% as good) at $0.02/1M input · 61% cheaper
HuggingFace MuSR (Multi-Step Reasoning). Tests multi-hop reasoning requiring chaining multiple facts together.
HuggingFace evaluation of MATH Level 5 problems. Competition math requiring advanced reasoning and proof construction.
HuggingFace MMLU-Pro. Harder version of MMLU with 10 answer choices instead of 4 and more challenging questions.
HuggingFace evaluation of GPQA (Graduate-Level Google-Proof Q&A). PhD-level science questions that cannot be easily searched.
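The MMLU-Pro description notes 10 answer choices instead of 4. A minimal sketch of why that matters: assuming uniform random guessing, the expected accuracy drops from 25% to 10%. The helper name `chance_accuracy` is hypothetical and used only for illustration; it is not part of any benchmark tooling.

```python
def chance_accuracy(num_choices: int) -> float:
    """Expected accuracy of uniform random guessing on a multiple-choice question."""
    if num_choices < 1:
        raise ValueError("num_choices must be at least 1")
    return 1.0 / num_choices


# Assumption: every question has the same number of choices.
mmlu_chance = chance_accuracy(4)       # 4-option MMLU format
mmlu_pro_chance = chance_accuracy(10)  # 10-option MMLU-Pro format

print(f"MMLU chance level:     {mmlu_chance:.0%}")
print(f"MMLU-Pro chance level: {mmlu_pro_chance:.0%}")
```

The lower guessing floor is one reason a given percentage on MMLU-Pro is not directly comparable to the same percentage on MMLU.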
- Type: text
- Context: 131K tokens (~66 books)
- Released: Sep 2024
- License: Open Source
- Status: Active
- Cost / Message: ~$0.000