Name: SWE-Bench Pro (Private) Benchmark
Creator: BenchGecko
License: https://creativecommons.org/licenses/by/4.0/

Question 1

What does SWE-Bench Pro (Private) measure?

Accepted Answer

SWE-Bench Pro (Private) is a knowledge benchmark in the BenchGecko catalog. Two AI models have been tested on it so far, with scores ranging from 10.1 to 23.4 out of 100.

Question 2

Which model leads on SWE-Bench Pro (Private)?

Accepted Answer

Claude Opus 4.5 from Anthropic leads SWE-Bench Pro (Private) with a score of 23.4. The median score across the two tested models is 16.8; with only two results, the median is simply the midpoint between them.
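
As a rough check on the figures above, the leader and the median can be recomputed from the two published scores. This is only a sketch: the dictionary layout and variable names are illustrative assumptions, not BenchGecko's own data format.

```python
from statistics import median

# Scores copied from the rankings table for SWE-Bench Pro (Private).
# The dict structure is an assumption made for this example.
scores = {
    "Claude Opus 4.5": 23.4,
    "Gemini 2.5 Pro Preview 06-05": 10.1,
}

# The leader is the model with the highest score.
leader = max(scores, key=scores.get)

# With an even number of values, the median is the mean of the two middle ones.
median_score = median(scores.values())

print(f"Leader: {leader} ({scores[leader]})")
print(f"Median: {median_score:.2f}")
```

The midpoint of 10.1 and 23.4 is 16.75, which matches the 16.8 reported above once rounded to one decimal place.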

Question 3

Is SWE-Bench Pro (Private) saturated?

Accepted Answer

No. The top score is 23.4 out of 100 (23%), so there is still meaningful room for improvement on SWE-Bench Pro (Private).
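
The saturation check comes down to simple arithmetic on the top score. The sketch below assumes a hypothetical cutoff of 90 out of 100 for calling a benchmark saturated; that threshold and the function name are illustrative choices, not a definition published by BenchGecko.

```python
MAX_SCORE = 100.0

# Hypothetical cutoff, chosen only for illustration.
SATURATION_THRESHOLD = 90.0


def is_saturated(top_score: float, threshold: float = SATURATION_THRESHOLD) -> bool:
    """Return True if the best score has reached the assumed cutoff."""
    return top_score >= threshold


top_score = 23.4                      # best score on SWE-Bench Pro (Private)
share = top_score / MAX_SCORE         # fraction of the maximum achieved
headroom = MAX_SCORE - top_score      # points left before a perfect score

print(f"Top score covers {share:.0%} of the scale; headroom is {headroom:.1f} points")
print(f"Saturated: {is_saturated(top_score)}")
```

Under any reasonable threshold, a top score of 23.4 leaves most of the scale unclaimed.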

Question 4

What makes SWE-Bench Pro (Private) distinctive?

Accepted Answer

SWE-Bench Pro (Private) is a knowledge benchmark with limited overlap with the rest of the catalog: it measures capabilities that are not well covered by the other benchmarks we track.

Question 5

How often is SWE-Bench Pro (Private) data refreshed?

Accepted Answer

BenchGecko pulls updates daily. New model scores on SWE-Bench Pro (Private) appear as soon as they are published by Epoch AI or the model provider.

#	Model	Provider	Score	Price
1	Claude Opus 4.5	Anthropic	23.4	$5.00
2	Gemini 2.5 Pro Preview 06-05	Google DeepMind	10.1	$1.25
