DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwen2.5-32B), fine-tuned on outputs from [DeepSeek R1](/deepseek/deepseek-r1). It outperforms OpenAI's o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.
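Like the full DeepSeek R1, the distills emit their chain-of-thought inside `<think>…</think>` tags before the final answer. A minimal sketch of stripping that reasoning block when only the answer is wanted (`strip_reasoning` is an illustrative helper name, not part of any official SDK):

```python
import re

def strip_reasoning(text: str) -> str:
    """Remove the <think>...</think> block that R1-style distilled
    models emit before their final answer."""
    # DOTALL lets the reasoning block span multiple lines
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

raw = "<think>\nThe user asks for 2+2; that is 4.\n</think>\n4"
print(strip_reasoning(raw))  # → 4
```

In practice you may want to keep the reasoning trace for logging while showing the user only the stripped answer.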
Tested on 6 benchmarks with a 23.0% average score. Top scores: IFEval (41.9%), MMLU-PRO (41.0%), BBH (HuggingFace) (17.1%).
Llama 3 8B Instruct scores 41.7 (99% as good) at $0.03/1M input tokens · 90% cheaper
HuggingFace MuSR (Multistep Soft Reasoning). Tests multi-hop reasoning that requires chaining multiple facts together.
HuggingFace evaluation of MATH Level 5 problems. Competition math requiring advanced reasoning and proof construction.
HuggingFace MMLU-Pro. Harder version of MMLU with 10 answer choices instead of 4 and more challenging questions.
HuggingFace evaluation of GPQA (Graduate-Level Google-Proof Q&A). PhD-level science questions that cannot be easily searched.
- Type: text
- Context: 33K tokens (~16 books)
- Released: Jan 2025
- License: Open Source
- Status: Active
- Cost / Message: ~$0.001
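The per-message figure above follows from token counts and per-million-token prices. A sketch of the arithmetic, with hypothetical prices (the document does not state this model's rates):

```python
def message_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the cost of one message from token counts and
    per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical prices ($0.30/1M in, $0.60/1M out); a few hundred tokens
# of prompt plus a long reasoning trace lands near ~$0.001 per message.
print(round(message_cost(300, 1200, 0.30, 0.60), 5))  # → 0.00081
```

Reasoning models tend to produce long output traces, so output pricing dominates the per-message cost.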