This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-2407`). It's a proprietary, weights-available model that excels at reasoning, code generation, JSON, and multilingual chat. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/).
Tested on 8 benchmarks with a 30.0% average. Top scores: Lech Mazur Writing (69.0%), Aider — Code Editing (60.2%), MMLU (58.4%).
For comparison: Voxtral Small 24B 2507 scores 26.3 (98% as good) at $0.10/1M input tokens — about 95% cheaper.
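As a sanity check on the cost claim, the implied Mistral Large 2 input price can be backed out from the figures above (a minimal sketch; Mistral Large 2's own $/1M input price is not stated on this page and is derived here, not quoted):

```python
voxtral_input_price = 0.10  # $ per 1M input tokens, stated above
savings = 0.95              # "95% cheaper", stated above

# If Voxtral's price is (1 - savings) of the larger model's price,
# the implied Mistral Large 2 input price follows directly:
implied_large_price = voxtral_input_price / (1 - savings)
print(f"Implied Mistral Large 2 input price: ${implied_large_price:.2f}/1M tokens")
```

This is just the stated discount inverted; the real list price may differ by tier or date.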
Code editing benchmark from the Aider project. Measures the ability to apply targeted code changes while maintaining correctness and style.
Deceptively simple questions that humans find easy but AI models often get wrong. Tests common sense and reasoning gaps.
Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.
Mock AIME (American Invitational Mathematics Exam) problems from OTIS. Tests mathematical competition performance.
Original research-level math problems created by professional mathematicians. Problems are unpublished and cannot be memorized.
- Type: Text
- Context: 128K tokens (~64 books)
- Released: Jul 2024
- License: Proprietary (weights available)
- Status: Active
- Cost / Message: ~$0.010
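The ~$0.010-per-message figure above follows from per-token pricing. A minimal sketch — the token counts and the $2/$6 per-1M prices below are illustrative assumptions (not stated on this page), chosen to land near that figure:

```python
def message_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Dollar cost of one message, given prices per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Assumed example: 2,000 input tokens, 1,000 output tokens,
# at $2.00/1M input and $6.00/1M output (both hypothetical here).
print(round(message_cost(2_000, 1_000, 2.00, 6.00), 4))
```

Real per-message cost scales linearly with prompt and completion length, so long-context use can cost far more than this estimate.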