Mistral Medium 3 is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost. It balances state-of-the-art reasoning and multimodal performance with 8× lower cost...
Tested on 4 benchmarks, averaging 40.0%. Top scores: MATH level 5 (81.6%), GPQA diamond (46.0%), OTIS Mock AIME 2024-2025 (32.1%).
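The relationship between the reported average and the three listed scores can be checked with a little arithmetic. The sketch below assumes the 40.0% figure is an unweighted mean over all 4 benchmarks, which the page does not state; the function name `implied_missing_score` is hypothetical and the result is an inference, not a published score.

```python
def implied_missing_score(listed_scores, reported_average, total_benchmarks):
    """Average score of the unlisted benchmarks, assuming an unweighted mean."""
    missing_count = total_benchmarks - len(listed_scores)
    if missing_count <= 0:
        raise ValueError("no unlisted benchmarks to solve for")
    # total of all scores implied by the mean, minus the scores we know
    missing_total = reported_average * total_benchmarks - sum(listed_scores)
    return missing_total / missing_count


# Scores listed on the page: MATH level 5, GPQA diamond, OTIS Mock AIME 2024-2025
listed = [81.6, 46.0, 32.1]
fourth = implied_missing_score(listed, reported_average=40.0, total_benchmarks=4)
print(f"Implied fourth score: about {fourth:.1f}%")
```

Because the average is rounded to one decimal place, the implied value is only approximate; under the same assumption it could lie a few tenths of a point either side.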
For comparison, Llama 3 8B Instruct scores 41.7 (101% as good) at $0.03 per 1M input tokens, which is 93% cheaper.
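The "93% cheaper" figure can be read as a relative saving on input-token price. The sketch below shows one way to compute such a saving and to back out the baseline price it implies; both helper names are hypothetical, and the implied baseline is an inference from the rounded 93% figure, not a price quoted on the page.

```python
def percent_cheaper(baseline_price, alternative_price):
    """Relative saving of the alternative over the baseline, in percent."""
    if baseline_price <= 0:
        raise ValueError("baseline price must be positive")
    return (1 - alternative_price / baseline_price) * 100


def implied_baseline_price(alternative_price, percent_saving):
    """Baseline price that would make the alternative `percent_saving`% cheaper."""
    if not 0 <= percent_saving < 100:
        raise ValueError("percent_saving must be in [0, 100)")
    return alternative_price / (1 - percent_saving / 100)


llama_input_price = 0.03  # USD per 1M input tokens, from the comparison line
baseline = implied_baseline_price(llama_input_price, 93)
print(f"Implied baseline input price: about ${baseline:.2f} per 1M tokens")
```

Since 93% is itself a rounded value, the implied baseline should be treated as a rough estimate rather than an exact list price.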
Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.
Mock AIME (American Invitational Mathematics Examination) problems from OTIS. Tests performance on competition mathematics.
Original research-level math problems created by professional mathematicians. Problems are unpublished and cannot be memorized.
Graduate-level science questions written by PhD experts. The Diamond subset keeps only the questions that expert validators answered correctly and most non-expert validators got wrong, testing deep understanding.
- Type: multimodal
- Context: 131K tokens (~66 books)
- Released: May 2025
- License: Open Source
- Status: Active
- Cost / Message: ~$0.003