Phi-4-mini-instruct is a lightweight open model built on synthetic data and filtered publicly available websites, with a focus on high-quality, reasoning-dense data. The model belongs to the Phi-4...
Evaluated on 7 benchmarks, with an average score of 29.4%. Top scores: IFEval (73.8%), BBH (HuggingFace) (38.7%), MMLU-Pro (32.6%).
HuggingFace MuSR (Multistep Soft Reasoning). Tests multi-hop reasoning that requires chaining multiple facts together.
HuggingFace evaluation of MATH Level 5 problems. Competition math at the hardest difficulty level, requiring advanced multi-step reasoning.
HuggingFace MMLU-Pro. Harder version of MMLU with 10 answer choices instead of 4 and more challenging questions.
HuggingFace evaluation of GPQA (Graduate-Level Google-Proof Q&A). PhD-level science questions that cannot be easily searched.
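The change from 4 to 10 answer choices in MMLU-Pro also lowers the score that random guessing would achieve, which matters when reading raw multiple-choice accuracies. The following is a minimal sketch of that arithmetic only; the helper name `chance_accuracy` is hypothetical, and it assumes uniform guessing with exactly one correct option per question. It does not describe how the scores listed here were computed or normalized.

```python
def chance_accuracy(num_choices: int) -> float:
    """Expected accuracy of uniform random guessing on one multiple-choice item.

    Assumes exactly one correct option among `num_choices` (an illustrative
    simplification, not a description of any leaderboard's scoring).
    """
    if num_choices < 1:
        raise ValueError("num_choices must be at least 1")
    return 1.0 / num_choices


# 4 options (MMLU-style) versus 10 options (MMLU-Pro-style)
mmlu_baseline = chance_accuracy(4)
mmlu_pro_baseline = chance_accuracy(10)

print(f"4 choices:  {mmlu_baseline:.0%}")      # 25%
print(f"10 choices: {mmlu_pro_baseline:.0%}")  # 10%
```

Under these assumptions, a raw accuracy on a 10-choice benchmark sits above a lower guessing floor than the same accuracy on a 4-choice benchmark.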
- Type: text
- Context: 131K tokens (~66 books)
- Released: Oct 2025
- License: Open Source
- Status: Active
- Cost / Message: ~$0.001