Qwen2.5 7B is a 7B-parameter model from Qwen2.5, the latest series of Qwen large language models. Qwen2.5 brings the following improvements over Qwen2: - Significantly more knowledge and greatly improved capabilities in coding and...
Tested on 6 benchmarks, with an average score of 35.2%. Top scores: IFEval (75.8%), MATH Level 5 (50.0%), MMLU-PRO (36.5%).
HuggingFace MuSR (Multistep Soft Reasoning). Tests multi-hop reasoning that requires chaining multiple facts together.
HuggingFace evaluation of MATH Level 5 problems. Competition math requiring advanced reasoning and proof construction.
HuggingFace MMLU-Pro. Harder version of MMLU with 10 answer choices instead of 4 and more challenging questions.
HuggingFace evaluation of GPQA (Graduate-Level Google-Proof Q&A). PhD-level science questions that cannot be easily searched.
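The move from 4 to 10 answer choices in MMLU-Pro changes what random guessing would score. The sketch below makes that arithmetic concrete. It assumes every question has exactly the stated number of options and that guesses are uniform; the function name `chance_accuracy` is hypothetical, and the sketch does not attempt to reproduce how the scores on this page were computed.

```python
def chance_accuracy(num_choices: int) -> float:
    """Expected accuracy of uniform random guessing on a multiple-choice question."""
    if num_choices < 1:
        raise ValueError("num_choices must be at least 1")
    return 1.0 / num_choices


# Assumption: every question has exactly this many answer options.
mmlu_baseline = chance_accuracy(4)       # original MMLU format
mmlu_pro_baseline = chance_accuracy(10)  # MMLU-Pro format

print(f"MMLU chance baseline:     {mmlu_baseline:.1%}")
print(f"MMLU-Pro chance baseline: {mmlu_pro_baseline:.1%}")
```

A lower guessing floor means a given score on MMLU-Pro reflects less luck than the same score on a 4-choice test.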
- Type: text
- Context: 33K tokens (~16 books)
- Released: Oct 2024
- License: Open Source
- Status: Active
- Cost / Message: ~$0.000