Mistral Large 2411 vs Gemini 2.5 Flash

Side by side · benchmarks, pricing, and signals you can act on.

Winner summary

Gemini 2.5 Flash wins 7 of 8 shared benchmarks and leads in the arena, math, language, and reasoning categories. Mistral Large 2411 leads only in knowledge.

Category leads
arena · Gemini 2.5 Flash
math · Gemini 2.5 Flash
knowledge · Mistral Large 2411
language · Gemini 2.5 Flash
reasoning · Gemini 2.5 Flash
Hype vs Reality
Mistral Large 2411 · #112 by performance · no attention signal · QUIET
Gemini 2.5 Flash · #144 by performance · #14 by attention · OVERHYPED
Best value
Gemini 2.5 Flash · 2.5x better value than Mistral Large 2411
Mistral Large 2411 · 11.4 pts/$ · $4.00/M
Gemini 2.5 Flash · 28.6 pts/$ · $1.40/M
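The $/M figures above are consistent with a simple average of the input and output prices listed in the pricing table ($2.00 and $6.00 for Mistral Large 2411, $0.30 and $2.50 for Gemini 2.5 Flash). That is an inference from the numbers, not something the page states. A minimal Python sketch under that assumption, with hypothetical function names, also reproduces the 2.5x value ratio from the two pts/$ figures:

```python
def blended_price(input_per_m: float, output_per_m: float) -> float:
    """Average of input and output price per 1M tokens.

    Assumption: the page's $/M figure is an unweighted mean of the two
    prices. The page does not document how it is derived.
    """
    return (input_per_m + output_per_m) / 2


def value_ratio(pts_per_dollar_a: float, pts_per_dollar_b: float) -> float:
    """How many times better value model A is than model B."""
    return pts_per_dollar_a / pts_per_dollar_b


mistral_blended = blended_price(2.00, 6.00)
gemini_blended = blended_price(0.30, 2.50)

# pts/$ values copied from the page: Gemini 28.6, Mistral 11.4.
ratio = value_ratio(28.6, 11.4)

print(f"Mistral Large 2411: ${mistral_blended:.2f}/M")
print(f"Gemini 2.5 Flash:   ${gemini_blended:.2f}/M")
print(f"Value ratio:        {ratio:.1f}x")
```

The score that feeds pts/$ is not shown on the page, so the sketch takes the pts/$ values as given instead of trying to recompute them.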
Vendor risk
Mistral AI · $14.0B · Tier 1 · Medium risk
Google DeepMind · $4.00T · Tier 1 · Low risk
Head to head
Mistral Large 2411 · Gemini 2.5 Flash
Chatbot Arena Elo · Overall
Gemini 2.5 Flash leads by +106.4
Mistral Large 2411
1304.7
Gemini 2.5 Flash
1411.0
FrontierMath-2025-02-28-Private
Gemini 2.5 Flash leads by +4.5
FrontierMath (Feb 2025) · original research-level math problems created by mathematicians, testing capabilities at the boundary of current AI mathematical reasoning.
Mistral Large 2411
0.3
Gemini 2.5 Flash
4.8
HELM · GPQA
Mistral Large 2411 leads by +4.5
Mistral Large 2411
43.5
Gemini 2.5 Flash
39.0
HELM · IFEval
Gemini 2.5 Flash leads by +2.2
Mistral Large 2411
87.6
Gemini 2.5 Flash
89.8
HELM · MMLU-Pro
Gemini 2.5 Flash leads by +4.0
Mistral Large 2411
59.9
Gemini 2.5 Flash
63.9
HELM · Omni-MATH
Gemini 2.5 Flash leads by +10.3
Mistral Large 2411
28.1
Gemini 2.5 Flash
38.4
HELM · WildBench
Gemini 2.5 Flash leads by +1.6
Mistral Large 2411
80.1
Gemini 2.5 Flash
81.7
OTIS Mock AIME 2024-2025
Gemini 2.5 Flash leads by +65.3
OTIS Mock AIME 2024-2025 · simulated American Invitational Mathematics Examination problems testing advanced problem-solving skills.
Mistral Large 2411
7.7
Gemini 2.5 Flash
73.0
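The "leads by" margins and the 7-of-8 tally can be recomputed from the paired scores above. The following Python sketch does that. The dictionary layout and function name are illustrative choices, not the site's own method. Note that the displayed Arena scores differ by 106.3, while the page reports +106.4, presumably because the page subtracts unrounded scores.

```python
# Scores copied from the head-to-head cards: (Mistral Large 2411, Gemini 2.5 Flash).
scores = {
    "Chatbot Arena Elo · Overall": (1304.7, 1411.0),
    "FrontierMath-2025-02-28-Private": (0.3, 4.8),
    "HELM · GPQA": (43.5, 39.0),
    "HELM · IFEval": (87.6, 89.8),
    "HELM · MMLU-Pro": (59.9, 63.9),
    "HELM · Omni-MATH": (28.1, 38.4),
    "HELM · WildBench": (80.1, 81.7),
    "OTIS Mock AIME 2024-2025": (7.7, 73.0),
}


def leads(paired_scores):
    """Return {benchmark: (leader, margin)} with margins rounded to 0.1.

    Ties are not handled specially because none occur in this data.
    """
    result = {}
    for name, (mistral, gemini) in paired_scores.items():
        leader = "Gemini 2.5 Flash" if gemini > mistral else "Mistral Large 2411"
        result[name] = (leader, round(abs(gemini - mistral), 1))
    return result


summary = leads(scores)
gemini_wins = sum(1 for leader, _ in summary.values() if leader == "Gemini 2.5 Flash")

for name, (leader, margin) in summary.items():
    print(f"{name}: {leader} leads by +{margin}")
print(f"Gemini 2.5 Flash wins {gemini_wins} of {len(summary)}")
```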
Full benchmark table
Benchmark                          Mistral Large 2411   Gemini 2.5 Flash
Chatbot Arena Elo · Overall        1304.7               1411.0
FrontierMath-2025-02-28-Private    0.3                  4.8
HELM · GPQA                        43.5                 39.0
HELM · IFEval                      87.6                 89.8
HELM · MMLU-Pro                    59.9                 63.9
HELM · Omni-MATH                   28.1                 38.4
HELM · WildBench                   80.1                 81.7
OTIS Mock AIME 2024-2025           7.7                  73.0
Pricing · per 1M tokens · projected $/mo at 10M tokens
Model                Input    Output   Context                     Projected $/mo
Mistral Large 2411   $2.00    $6.00    131K tokens (~66 books)     $30.00
Gemini 2.5 Flash     $0.30    $2.50    1.0M tokens (~524 books)    $8.50
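The page does not say how the 10M tokens are divided between input and output. Both projected figures are reproduced exactly by a 75% input / 25% output split, so the Python sketch below uses that split as an assumption. The function name and the `input_share` parameter are hypothetical.

```python
def projected_monthly_cost(
    input_per_m: float,
    output_per_m: float,
    total_tokens_m: float = 10.0,
    input_share: float = 0.75,
) -> float:
    """Estimate monthly spend in dollars.

    input_per_m / output_per_m: price per 1M tokens.
    total_tokens_m: monthly volume in millions of tokens.
    input_share: fraction of tokens that are input. The 0.75 default is
    an assumption inferred from the page's figures, not a documented value.
    """
    input_cost = total_tokens_m * input_share * input_per_m
    output_cost = total_tokens_m * (1 - input_share) * output_per_m
    return input_cost + output_cost


mistral_cost = projected_monthly_cost(2.00, 6.00)
gemini_cost = projected_monthly_cost(0.30, 2.50)

print(f"Mistral Large 2411: ${mistral_cost:.2f}/mo")
print(f"Gemini 2.5 Flash:   ${gemini_cost:.2f}/mo")
```

Changing `input_share` shows how sensitive the comparison is to workload shape: output-heavy workloads widen the absolute dollar gap, because Mistral Large 2411's output price is $3.50 per 1M tokens higher than Gemini 2.5 Flash's.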