Which model leads on FrontierMath-Tier-4-2025-07-01-Private?

GPT-5.4 Pro from OpenAI leads FrontierMath-Tier-4-2025-07-01-Private with a score of 37.5. The median score across 37 tested models is 4.2.

Is FrontierMath-Tier-4-2025-07-01-Private saturated?

No · the top score is 37.5 out of 100, which leaves meaningful room for improvement on FrontierMath-Tier-4-2025-07-01-Private.

Does FrontierMath-Tier-4-2025-07-01-Private predict performance on other benchmarks?

Yes · FrontierMath-Tier-4-2025-07-01-Private scores correlate 0.86 with PostTrainBench across 12 shared models. Models that do well on FrontierMath-Tier-4-2025-07-01-Private tend to do well on PostTrainBench.
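A correlation over shared models can be sketched as follows. This is a minimal illustration, not BenchGecko's actual method: the page does not say which correlation coefficient is used, so Pearson is an assumption here, and the function name `pearson_on_shared` and the toy scores are hypothetical.

```python
from math import sqrt
from typing import Dict, Tuple


def pearson_on_shared(a: Dict[str, float], b: Dict[str, float]) -> Tuple[float, int]:
    """Pearson correlation over the models that appear in both score tables.

    Returns (correlation, number_of_shared_models).
    """
    shared = sorted(set(a) & set(b))
    n = len(shared)
    if n < 2:
        raise ValueError("need at least two shared models")

    xs = [a[m] for m in shared]
    ys = [b[m] for m in shared]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one side is constant")

    return cov / sqrt(var_x * var_y), n


# Toy scores for illustration only: not real benchmark results.
bench_a = {"model-a": 1.0, "model-b": 2.0, "model-c": 3.0, "model-d": 9.0}
bench_b = {"model-a": 10.0, "model-b": 20.0, "model-c": 30.0, "model-e": 5.0}

r, n_shared = pearson_on_shared(bench_a, bench_b)
print(f"r = {r:.2f} across {n_shared} shared models")
```

Only models present in both tables enter the calculation, which is why the figure above is quoted together with the number of shared models.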

How often is FrontierMath-Tier-4-2025-07-01-Private data refreshed?

BenchGecko pulls updates daily, so new model scores on FrontierMath-Tier-4-2025-07-01-Private appear within a day of being published by Epoch AI or the model provider.

Benchmark · Math · Competitive

FrontierMath-Tier-4-2025-07-01-Private

Name: FrontierMath-Tier-4-2025-07-01-Private Benchmark
Creator: BenchGecko
License: https://creativecommons.org/licenses/by/4.0/

FrontierMath Tier 4 (Jul 2025) · the most challenging tier of frontier mathematics, containing problems that push the absolute limits of AI mathematical reasoning.

Updated 2026-03-05

Models tested: 37
Top score: 37.5 (GPT-5.4 Pro)
Median: 4.2 (min 0.1)
Top-5 spread: σ 6.5 (wide open)
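The summary figures above can be reproduced from the rankings table on this page. The sketch below copies the 37 scores from that table and uses Python's standard `statistics` module. Treating σ as a population standard deviation over the top five scores is an assumption: it is the reading that matches the 6.5 shown (a sample standard deviation would give roughly 7.3).

```python
from statistics import median, pstdev

# Scores copied from the rankings table on this page, highest first.
scores = [
    37.5, 31.3, 27.1, 22.9, 18.8, 18.8, 16.7, 14.6, 14.6, 12.5,
    12.5, 8.3, 6.3, 6.3, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2,
    4.2, 4.2, 4.2, 2.1, 2.1, 2.1, 2.1, 2.1, 2.1, 2.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
]

models_tested = len(scores)
top_score = max(scores)
median_score = median(scores)
min_score = min(scores)

# Assumption: "Top-5 spread" is the population standard deviation
# of the five highest scores.
top5 = sorted(scores, reverse=True)[:5]
top5_sigma = pstdev(top5)

print(models_tested, top_score, median_score, min_score, round(top5_sigma, 1))
```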

The Frontier

Best score over time · one chart, every benchmark

Frontier on FrontierMath-Tier-4-2025-07-01-Private rose from 4.2 to 37.5 in 13 months · +33.3 points · latest leader GPT-5.4 Pro from OpenAI.

Frontier records: 7 total.
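A frontier record is a result that beats every earlier score on the benchmark. A minimal sketch of that running-maximum logic follows. The function name `frontier_records`, the `(date, model, score)` tuple layout, and the example entries are all hypothetical; the page does not publish per-model dates, so the data below is placeholder only.

```python
from typing import Iterable, List, Tuple

Result = Tuple[str, str, float]  # (ISO date, model name, score)


def frontier_records(results: Iterable[Result]) -> List[Result]:
    """Return the results that set a new best score, in date order."""
    records: List[Result] = []
    best = float("-inf")
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    for date, model, score in sorted(results, key=lambda r: r[0]):
        if score > best:
            best = score
            records.append((date, model, score))
    return records


# Placeholder data for illustration only: not real benchmark results.
example = [
    ("2025-01-10", "model-a", 1.0),
    ("2025-02-15", "model-b", 0.5),  # below the current best, not a record
    ("2025-03-20", "model-c", 2.0),
]

records = frontier_records(example)
# Frontier gain: latest record minus first record.
gain = records[-1][2] - records[0][2]
print(len(records), gain)
```

Applied to the endpoints quoted above, the same subtraction gives 37.5 − 4.2 = 33.3 points.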

Full rankings

37 models tested · sorted by score

#	Model	Provider	Score	Price
1	GPT-5.4 Pro	OpenAI	37.5	$30.00
2	GPT-5.2 Pro	OpenAI	31.3	$21.00
3	GPT-5.4	OpenAI	27.1	$2.50
4	Claude Opus 4.6	Anthropic	22.9	$5.00
5	GPT-5.2	OpenAI	18.8	$1.75
6	Gemini 3 Pro	Google DeepMind	18.8	—
7	Gemini 3.1 Pro Preview	Google DeepMind	16.7	$2.00
8	GPT-5 Pro	OpenAI	14.6	$15.00
9	U Muse Spark	Unknown	14.6	—
10	GPT-5	OpenAI	12.5	$1.25
11	GPT-5.1	OpenAI	12.5	$1.25
12	Claude Sonnet 4.6	Anthropic	8.3	$3.00
13	GPT-5 Mini	OpenAI	6.3	$0.25
14	o4 Mini	OpenAI	6.3	$1.10
15	Kimi K2.5	moonshotai	4.2	$0.44
16	Claude Opus 4	Anthropic	4.2	$15.00
17	Claude Opus 4.1	Anthropic	4.2	$15.00
18	Claude Opus 4.5	Anthropic	4.2	$5.00
19	Claude Sonnet 4.5	Anthropic	4.2	$3.00
20	Gemini 2.5 Flash	Google DeepMind	4.2	$0.30
21	Gemini 2.5 Pro	Google DeepMind	4.2	$1.25
22	Gemini 3 Flash Preview	Google DeepMind	4.2	$0.50
23	o3 Mini	OpenAI	4.2	$1.10
24	GLM 4.6	z-ai	2.1	$0.39
25	DeepSeek V3.2	DeepSeek	2.1	$0.25
26	GLM 5	z-ai	2.1	$0.60
27	Claude Haiku 4.5	Anthropic	2.1	$1.00
28	GPT-5 Nano	OpenAI	2.1	$0.05
29	Grok 4	xAI	2.1	$3.00
30	o3	OpenAI	2.1	$2.00
31	Claude 3.5 Sonnet	Anthropic	0.1	—
32	Claude Sonnet 4	Anthropic	0.1	$3.00
33	GLM 4.7	z-ai	0.1	$0.38
34	GPT-4.1	OpenAI	0.1	$2.00
35	Grok 3	xAI	0.1	$3.00
36	Kimi K2 Thinking	moonshotai	0.1	$0.60
37	Qwen3 235B A22B Thinking 2507	Alibaba Qwen	0.1	$0.15