This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a proprietary model with weights available under the Mistral Research License, and it excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/).
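As a minimal sketch of how the model is typically queried, here is a Python example of a chat call with JSON-mode output against Mistral's `https://api.mistral.ai/v1/chat/completions` endpoint. The request shape follows Mistral's OpenAI-compatible chat API, but treat the exact fields as assumptions to verify against the current API docs.

```python
import os
import requests

# Minimal sketch: one chat completion against mistral-large-2407.
# Assumes MISTRAL_API_KEY is set in the environment; verify field
# names against Mistral's current API documentation.
resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-large-2407",
        "messages": [
            {"role": "user",
             "content": "Reply as a JSON object with keys 'capital' and 'country' for France."}
        ],
        # JSON mode: constrains the model to emit valid JSON.
        "response_format": {"type": "json_object"},
        "temperature": 0.2,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```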
Tested on 7 benchmarks with a 39.1% average. Top scores: Chatbot Arena Elo (Overall): 1313.3, MMLU: 73.3%, Lech Mazur Writing: 69.0%.
Gemma 3 27B (free) scores 35.0 (about 90% as good) at $0.00/1M input · 100% cheaper
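The relative-quality figure is just the ratio of the two benchmark averages, and the cost saving is the ratio of input prices. A quick sketch of that arithmetic, where the Mistral input price is an assumed placeholder rather than a published rate:

```python
# Relative quality: ratio of benchmark averages (figures from this page).
mistral_avg = 39.1          # Mistral Large 2, 7-benchmark average (%)
gemma_avg = 35.0            # Gemma 3 27B (free), same suite (%)
print(f"{gemma_avg / mistral_avg:.0%} as good")       # -> 90% as good

# Cost: a free model is 100% cheaper than any nonzero rate.
mistral_input_price = 2.00  # $/1M input tokens (assumed; check current pricing)
gemma_input_price = 0.00
print(f"{1 - gemma_input_price / mistral_input_price:.0%} cheaper")  # -> 100% cheaper
```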
Deceptively simple questions that humans find easy but AI models often get wrong. Tests common sense and reasoning gaps.
Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.
Mock AIME (American Invitational Mathematics Exam) problems from OTIS. Tests mathematical competition performance.
Massive Multitask Language Understanding. 57 subjects from STEM, humanities, and social sciences. The most widely-cited knowledge benchmark.
Writing quality evaluation by Lech Mazur. Tests prose quality, coherence, and stylistic ability.
Graduate-level science questions written by PhD experts. The Diamond subset keeps only questions that experts answer correctly but skilled non-experts get wrong, testing deep understanding.
- Type: text
- Context: 131K tokens (~100K words, about one novel)
- Released: Jul 2024
- License: Mistral Research License (weights available)
- Status: Active
- Cost / Message: ~$0.010 (see the sketch below)
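The per-message figure is token counts times per-token prices, summed over input and output. A back-of-envelope sketch, where all token counts and prices below are illustrative assumptions rather than published rates:

```python
# Back-of-envelope cost per message: tokens * price, input + output.
# All figures are illustrative assumptions, not published rates.
input_price = 2.00 / 1_000_000    # $ per input token (assumed $2 / 1M)
output_price = 6.00 / 1_000_000   # $ per output token (assumed $6 / 1M)

input_tokens = 2_000     # typical prompt + context per message (assumed)
output_tokens = 1_000    # typical completion length (assumed)

cost = input_tokens * input_price + output_tokens * output_price
print(f"~${cost:.3f} per message")   # -> ~$0.010
```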