Text generation model from Mistral AI. 467K downloads on HuggingFace.
Evaluated on 16 benchmarks, with an average score of 41.6%. Top scores: TriviaQA (75.2%), HellaSwag (74.7%), OpenBookQA (73.1%).
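To put the 41.6% average in context, here is a minimal sketch of the arithmetic. It assumes the reported figure is an unweighted mean over all 16 benchmarks, which the card does not state. The helper names `mean` and `implied_mean_of_rest` are hypothetical, and only the three scores listed above are used.

```python
# Scores (in percent) that the card reports explicitly.
top_scores = {"TriviaQA": 75.2, "HellaSwag": 74.7, "OpenBookQA": 73.1}

N_BENCHMARKS = 16        # from the card
OVERALL_AVERAGE = 41.6   # from the card; ASSUMED to be an unweighted mean


def mean(values):
    """Unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def implied_mean_of_rest(overall, n_total, known):
    """Mean of the unlisted scores implied by an unweighted overall mean."""
    known = list(known)
    return (overall * n_total - sum(known)) / (n_total - len(known))


top_mean = mean(top_scores.values())
rest_mean = implied_mean_of_rest(OVERALL_AVERAGE, N_BENCHMARKS, top_scores.values())

print(f"Mean of the three listed scores: {top_mean:.1f}%")
print(f"Implied mean of the other {N_BENCHMARKS - len(top_scores)}: {rest_mean:.1f}%")
```

Under that assumption, the three listed scores average about 74.3% and the remaining 13 benchmarks about 34.0%. Because 41.6% is itself rounded, the second figure is approximate.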
BIG-Bench Hard. 23 challenging tasks from BIG-Bench where prior language models fell below average human performance.
HuggingFace MuSR (Multi-Step Reasoning). Tests multi-hop reasoning requiring chaining multiple facts together.
GSM8K. Grade school math word problems. 8,500 problems testing multi-step arithmetic reasoning. A foundational math benchmark.
MATH Level 5. HuggingFace evaluation of the hardest (Level 5) problems in the MATH dataset. Competition math requiring advanced reasoning and proof construction.
TriviaQA. Trivia questions sourced from trivia enthusiasts and quiz websites. Tests breadth of general knowledge.
HellaSwag. Sentence completion requiring commonsense reasoning about physical and social situations. Tests real-world understanding.
OpenBookQA. Elementary science questions with access to a small book of core science facts. Tests reasoning beyond memorization.
- Type: text-generation
- Context: N/A
- Released: Sep 2023
- License: Open Source
- Status: Active