Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22b). It uses 39B active parameters out of 141B, offering unparalleled cost efficiency for its size. Its strengths include: - strong math, coding,...
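As a quick illustration of the active-versus-total parameter figures quoted above, the sketch below computes the share of the model's weights used per token. Only the two parameter counts come from the description; the function and variable names are hypothetical.

```python
# Parameter counts quoted in the description, in billions.
ACTIVE_PARAMS_B = 39
TOTAL_PARAMS_B = 141


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of total parameters that are active per token."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active share of parameters: {fraction:.1%}")
```

Under these figures, a little over a quarter of the weights take part in each forward pass, which is the basis of the cost-efficiency claim.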
Tested on 5 benchmarks with an average score of 23.5%. Top scores: MMLU (70.4%), MATH level 5 (24.2%), GPQA diamond (12.1%).
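The average and the three listed scores constrain the two unlisted ones. The sketch below works through that arithmetic, assuming the 23.5% figure is an unweighted mean over all 5 benchmarks and that the published numbers are rounded (both are assumptions, not stated on the page).

```python
# Figures quoted on the page.
NUM_BENCHMARKS = 5
AVERAGE_SCORE = 23.5
listed_scores = {"MMLU": 70.4, "MATH level 5": 24.2, "GPQA diamond": 12.1}

# Assumption: the average is a plain unweighted mean of all five scores.
total_points = AVERAGE_SCORE * NUM_BENCHMARKS
listed_points = sum(listed_scores.values())

# Points left over for the benchmarks whose scores are not listed.
remaining_points = round(total_points - listed_points, 1)
remaining_count = NUM_BENCHMARKS - len(listed_scores)
remaining_mean = round(remaining_points / remaining_count, 1)

print(f"Unlisted benchmarks: {remaining_count}")
print(f"Combined score of unlisted benchmarks: {remaining_points}")
print(f"Mean score of unlisted benchmarks: {remaining_mean}")
```

Because the inputs are rounded, the result is only approximate, but it suggests the two unlisted benchmarks score in the single digits.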
Gemma 3 27B scores 25.1 (102% as good) at $0.08 per 1M input tokens, which is 96% cheaper.
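The "96% cheaper" figure can be inverted to estimate the input price it is being compared against. This is a minimal sketch, assuming the percentage refers to the price per 1M input tokens and is rounded to the nearest whole percent; the function name is hypothetical.

```python
def implied_baseline_price(cheaper_price: float, percent_cheaper: float) -> float:
    """Return the baseline price, given a price that is `percent_cheaper` percent lower."""
    if not 0 <= percent_cheaper < 100:
        raise ValueError("percent_cheaper must be in [0, 100)")
    return cheaper_price / (1 - percent_cheaper / 100)


# Figures quoted on the page: $0.08 per 1M input tokens, 96% cheaper.
baseline = implied_baseline_price(0.08, 96)
print(f"Implied baseline input price: ${baseline:.2f} per 1M tokens")
```

Since the percentage is rounded, the true baseline could sit somewhat above or below this estimate.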
Capture-the-flag cybersecurity challenges. Tests vulnerability analysis, reverse engineering, cryptography, and exploitation skills.
Unusual and adversarial machine learning challenges. Tests robustness of reasoning about edge cases in ML systems.
Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.
Massive Multitask Language Understanding. Covers 57 subjects from STEM, humanities, and social sciences. One of the most widely cited knowledge benchmarks.
Graduate-level science questions written by PhD experts. The Diamond subset keeps the questions that expert validators answered correctly and most non-experts got wrong, testing deep understanding.
- Type: text
- Context: 66K tokens (~33 books)
- Released: Apr 2024
- License: Open Source
- Status: Active
- Cost / Message: ~$0.010