Benchmark · MATH Level 5 · Settled

MATH Level 5

MATH Level 5 · the hardest tier of the MATH benchmark, featuring competition-level problems drawn from AMC, AIME, and Olympiad-style mathematics.

Updated 2025-10-15
Models tested · 72
Top score · 98.1 (GPT-5)
Median · 62.7 · min 3.3
Top-5 spread · σ 0.1
Status · Settled
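The summary statistics above can be reproduced from a plain list of scores. A minimal sketch using Python's standard library; the `scores` list here is an illustrative subset (top five plus median and minimum from the table), not the full 72-model data, so only the top score, minimum, and top-5 spread match the card:

```python
import statistics

# Illustrative subset of the 72 scores: the top five, plus the
# reported median (62.7) and minimum (3.3) from the table.
scores = [98.1, 97.8, 97.8, 97.8, 97.7, 62.7, 3.3]

top = max(scores)                        # top score: 98.1
low = min(scores)                        # minimum: 3.3
mid = statistics.median(scores)          # median of the FULL list is 62.7;
                                         # this subset's median differs

top5 = sorted(scores, reverse=True)[:5]  # five best scores
spread = statistics.pstdev(top5)         # population std dev of the top 5
print(top, low, round(spread, 1))        # spread rounds to 0.1
```

`pstdev` (population standard deviation) rather than `stdev` (sample) is assumed here, since the top five is treated as the whole population of interest; with these five scores it rounds to the σ 0.1 shown on the card.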

Best score over time · one chart, every benchmark

[Chart: MATH Level 5 · 45 models · frontier running max · score (0 to 100) vs. release date, Jun 2024 to Oct 2025 · benchgecko.ai/benchmark/math-level-5]
Frontier on MATH Level 5 rose from 52.6 to 98.1 in 13 months · +45.5 points · latest leader GPT-5 from OpenAI.
Pink dots = frontier records · 8 total.
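The "frontier running max" line plots, for each release date, the best score seen so far; a frontier record is any model that raises that running max. A minimal sketch of the computation, using illustrative (release date, model, score) tuples with approximate dates rather than the full dataset:

```python
# Illustrative releases (date, model, score); dates are approximate
# and this is a small subset, not the full 45-model chart data.
releases = [
    ("2024-07", "GPT-4o-mini", 52.6),
    ("2024-09", "o1-preview", 81.7),
    ("2024-12", "o1", 94.7),
    ("2025-01", "R1", 93.0),   # below the running max: not a record
    ("2025-04", "o3", 97.8),
    ("2025-08", "GPT-5", 98.1),
]

frontier = []            # models that set a new record (the pink dots)
best = float("-inf")
for date, model, score in sorted(releases):
    if score > best:     # strictly better than every earlier release
        best = score
        frontier.append((date, model, score))

print(len(frontier), frontier[-1])
```

Only strictly improving releases land on the frontier, so R1 is skipped here while the other five become records; on the real chart the same rule yields 8 records out of 45 models.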

72 models tested · sorted by score

# · Model · Score
1 · GPT-5 (OpenAI) · 98.1
2 · GPT-5 Mini (OpenAI) · 97.8
3 · o4 Mini (OpenAI) · 97.8
4 · o3 (OpenAI) · 97.8
5 · Claude Sonnet 4.5 (Anthropic) · 97.7
6 · Qwen3 Max (Alibaba Qwen) · 97.1
7 · R1 0528 (DeepSeek) · 96.6
8 · o3 Mini (OpenAI) · 96.5
9 · Claude Haiku 4.5 (Anthropic) · 96.4
10 · Gemini 2.5 Pro (Google DeepMind) · 95.6
11 · GPT-5 Nano (OpenAI) · 95.2
12 · o1 (OpenAI) · 94.7
13 · R1 (DeepSeek) · 93.0
14 · Claude 3.7 Sonnet (Anthropic) · 91.2
15 · Grok 3 Mini (xAI) · 90.9
16 · o1-mini (OpenAI) · 89.2
17 · Grok 3 (xAI) · 88.8
18 · GPT-4.1 Mini (OpenAI) · 87.3
19 · Claude Opus 4 (Anthropic) · 85.0
20 · Claude Sonnet 4 (Anthropic) · 84.4
21 · Gemini 2.0 Pro (Google DeepMind) · 83.5
22 · GPT-4.1 (OpenAI) · 83.0
23 · Gemini 2.0 Flash (Google DeepMind) · 82.2
24 · o1-preview (OpenAI) · 81.7
25 · Mistral Medium 3 (Mistral AI) · 81.6
26 · GPT-4.5 (OpenAI) · 78.6
27 · Gemma 3 27B (Google DeepMind) · 74.0
28 · Gemma 3 27B (free) (Google DeepMind) · 74.0
29 · Llama 4 Maverick (Meta) · 73.0
30 · GPT-4.1 Nano (OpenAI) · 70.0
31 · Qwen3 235B A22B (Alibaba Qwen) · 68.9
32 · Qwen2.5-Max (Alibaba Qwen) · 67.2
33 · Phi 4 (Microsoft) · 64.9
34 · DeepSeek V3 (DeepSeek) · 64.8
35 · Grok-2 (Dec 2024) (xAI) · 63.5
36 · Qwen2.5 72B Instruct (Alibaba Qwen) · 63.2
37 · Llama 4 Scout (Meta) · 62.3
38 · GPT-4o (2024-08-06) (OpenAI) · 53.3
39 · GPT-4o (2024-11-20) (OpenAI) · 53.3
40 · GPT-4o-mini (OpenAI) · 52.6
41 · GPT-4o-mini (2024-07-18) (OpenAI) · 52.6
42 · Claude 3.5 Sonnet (Anthropic) · 51.7
43 · GPT-4o (2024-05-13) (OpenAI) · 51.0
44 · Mistral Large 2411 (Mistral AI) · 50.3
45 · Llama 3.1 405B (Meta) · 49.8
46 · Claude 3.5 Haiku (Anthropic) · 46.4
47 · Mistral Large 2407 (Mistral AI) · 44.8
48 · Llama 3.3 70B Instruct (free) (Meta) · 41.6
49 · Gemini 1.5 Pro (Feb 2024) (Google DeepMind) · 40.8
50 · Llama 3.2 90B (Meta) · 39.4
51 · Qwen2-72B (Alibaba Qwen) · 39.1
52 · Claude 3 Opus (Anthropic) · 37.5
53 · Llama 3.1 70B Instruct (Meta) · 36.7
54 · Gemma 2 27B (Google DeepMind) · 27.9
55 · Gemini 1.5 Flash (May 2024) (Google DeepMind) · 25.1
56 · Mistral Large (Mistral AI) · 24.5
57 · Mixtral 8x22B Instruct (Mistral AI) · 24.2
58 · GPT-4 Turbo (OpenAI) · 23.0
59 · Llama 3.1 8B Instruct (Meta) · 22.9
60 · Llama 3 70B Instruct (Meta) · 22.6
61 · Gemma 2 9B (Google DeepMind) · 21.0
62 · Claude 3 Sonnet (Anthropic) · 18.2
63 · phi-3-medium 14B (Microsoft) · 17.6
64 · Claude 3 Haiku (Anthropic) · 14.9
65 · Claude 2 (Anthropic) · 11.7
66 · GPT-3.5 Turbo (older v0613) (OpenAI) · 11.6
67 · Gemini 1.0 Pro (Google DeepMind) · 11.2
68 · Mistral Nemo (Mistral AI) · 10.8
69 · Mixtral 8x7B Instruct (Mistral AI) · 9.9
70 · Llama 3 8B Instruct (Meta) · 6.1
71 · Yi 6B · 5.2
72 · Llama 2-13B (Meta) · 3.3
