FrontierMath-2025-02-28-Private

FrontierMath (Feb 2025): original research-level math problems created by mathematicians, testing capabilities at the boundary of current AI mathematical reasoning.

Models Tested: 60
Top Score: 50.0
Average Score: 14.1
Rank  Developer         Score
1     OpenAI            50.0
2     OpenAI            47.6
3     Anthropic         40.7
4     OpenAI            40.7
5     OpenAI            40.7
6     Google            37.6
7     Google DeepMind   36.9
8     Google DeepMind   35.6
9     OpenAI            32.4
10    OpenAI            32.4
11    Anthropic         32.4
12    OpenAI            31.0
13    OpenAI            31.0
14    moonshotai        27.9
15    OpenAI            27.2
16    OpenAI            24.8
17    DeepSeek          22.1
18    moonshotai        21.4
19    Anthropic         20.7
20    xAI               19.7
21    OpenAI            18.7
22    z-ai              16.4
23    z-ai              16.4
24    Anthropic         15.2
25    Google DeepMind   14.1
26    OpenAI            12.4
27    OpenAI            9.3
28    Alibaba Qwen      8.5
29    OpenAI            8.3
30    Anthropic         7.2
31    Anthropic         5.9
32    xAI               5.9
33    xAI               5.9
34    OpenAI            5.5
35    Anthropic         4.5
36    OpenAI            4.5
37    Anthropic         4.1
38    Anthropic         4.1
39    Anthropic        4.1
40    z-ai              3.8
41    xAI               3.8
42    xAI               3.8
43    z-ai              2.4
44    DeepSeek          1.7
45    OpenAI            1.7
46    OpenAI            1.0
47    Anthropic         1.0
48    Alibaba           1.0
49    Meta              0.7
50    xAI               0.7
51    Mistral AI        0.3
52    OpenAI            0.3
53    Mistral AI        0.3
54    Anthropic         0.3
55    OpenAI            0.3
56    OpenAI            0.3
57    OpenAI            0.3
58    OpenAI            0.3
59    Meta              0.1
60    Google            0.1
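As a quick consistency check, the three summary statistics can be recomputed from the table. A minimal Python sketch, with the scores transcribed from the rows above:

```python
# Scores transcribed from the 60 leaderboard rows above.
scores = [
    50.0, 47.6, 40.7, 40.7, 40.7, 37.6, 36.9, 35.6, 32.4, 32.4,
    32.4, 31.0, 31.0, 27.9, 27.2, 24.8, 22.1, 21.4, 20.7, 19.7,
    18.7, 16.4, 16.4, 15.2, 14.1, 12.4, 9.3, 8.5, 8.3, 7.2,
    5.9, 5.9, 5.9, 5.5, 4.5, 4.5, 4.1, 4.1, 4.1, 3.8,
    3.8, 3.8, 2.4, 1.7, 1.7, 1.0, 1.0, 1.0, 0.7, 0.7,
    0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.1, 0.1,
]

print(len(scores))                          # Models Tested: 60
print(max(scores))                          # Top Score: 50.0
print(round(sum(scores) / len(scores), 1))  # Average Score: 14.1
```

The mean of the 60 scores is 14.13, which rounds to the 14.1 shown in the summary, so the stats are an unweighted average over all listed models.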