Name: SWE-Bench Pro (Public) Benchmark
Creator: BenchGecko
License: https://creativecommons.org/licenses/by/4.0/

Question 1

What does SWE-Bench Pro (Public) measure?

Accepted Answer

SWE-Bench Pro (Public) evaluates AI models on software engineering tasks; the BenchGecko catalog lists it as a knowledge benchmark. Two AI models have been tested on it so far, with scores ranging from 41.0 to 45.9 out of 100.

Question 2

Which model leads on SWE-Bench Pro (Public)?

Accepted Answer

Claude Opus 4.5 from Anthropic leads SWE-Bench Pro (Public) with a score of 45.9. With only two models tested, the median is simply the midpoint of their two scores: 43.5 (43.45 before rounding).

Question 3

Is SWE-Bench Pro (Public) saturated?

Accepted Answer

No. The top score is 45.9 out of 100 (46%), so there is still meaningful room for improvement on SWE-Bench Pro (Public).
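
The figures quoted in these answers can be reproduced from the two listed scores. The snippet below is a minimal sketch, not BenchGecko's actual method: the 90-point saturation threshold and all variable names are assumptions chosen for illustration, since the page does not state how saturation is defined.

```python
from decimal import Decimal, ROUND_HALF_UP
from statistics import median

# Scores as listed in the rankings; Decimal strings keep the arithmetic exact.
scores = [Decimal("45.9"), Decimal("41.0")]
MAX_SCORE = Decimal("100")

# Hypothetical cutoff: the page does not say when a benchmark counts as saturated.
SATURATION_THRESHOLD = Decimal("90")

top = max(scores)
mid = median(scores)  # with two values, this is the mean of the pair
mid_rounded = mid.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)

# Top score as a whole-number percentage of the maximum, and the remaining headroom.
top_percent = (top / MAX_SCORE * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
headroom = MAX_SCORE - top
is_saturated = top >= SATURATION_THRESHOLD

print(f"top={top} ({top_percent}%), median={mid_rounded}, headroom={headroom}")
print(f"saturated under the assumed threshold: {is_saturated}")
```

With only two data points the median and the mean coincide, so the summary statistics will become more informative as further models are added.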

Question 4

What makes SWE-Bench Pro (Public) distinctive?

Accepted Answer

SWE-Bench Pro (Public) is a knowledge benchmark that overlaps little with the rest of the catalog: it measures capabilities that the other benchmarks we track do not cover well.

Question 5

How often is SWE-Bench Pro (Public) data refreshed?

Accepted Answer

BenchGecko pulls updates daily, so new model scores on SWE-Bench Pro (Public) appear within a day of being published by Epoch AI or the model provider.

Rank	Model	Provider	Score (out of 100)	Price
1	Claude Opus 4.5	Anthropic	45.9	$5.00
2	GPT-5.2-Codex	OpenAI	41.0	$1.75
