Benchmark · Knowledge

OpenCompass · HLE

Updated: 2026-02-16
Models tested: 32
Top score: 28.6 (DeepSeek V3.2 Speciale)
Median: 13.9 (min 4.2)
Top-5 spread: σ 1.2 (settled)

Best score over time · one chart, every benchmark

[Chart: OpenCompass · HLE · 32 models · frontier running max. Y-axis: score (0 to 100, higher is better). X-axis: release date (Mar 25 to Feb 26). Source: benchgecko.ai/benchmark/oc-hle]
Frontier on OpenCompass · HLE rose from 4.2 to 28.6 in 9 months · +24.4 points · latest leader DeepSeek V3.2 Speciale from DeepSeek.
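The "frontier running max" can be read as the sequence of models that each set a new best score when results are ordered by release date. The following is a minimal sketch under that assumption, not BenchGecko's actual pipeline. The function name `frontier_records`, the model names, the dates, and the intermediate scores are all hypothetical; only the 4.2 and 28.6 endpoints come from the page.

```python
from datetime import date


def frontier_records(results):
    """Return the entries that set a new best score, in release order.

    `results` is an iterable of (release_date, model_name, score) tuples.
    """
    records = []
    best = float("-inf")
    for release, model, score in sorted(results, key=lambda r: r[0]):
        if score > best:  # strictly better than every earlier release
            best = score
            records.append((release, model, score))
    return records


# Hypothetical data: only the first and last scores appear on the page.
results = [
    (date(2025, 3, 1), "model-a", 4.2),
    (date(2025, 6, 1), "model-b", 9.0),
    (date(2025, 8, 1), "model-c", 7.5),   # below the running max, not a record
    (date(2025, 10, 1), "model-d", 21.3),
    (date(2025, 12, 1), "model-e", 28.6),
]

records = frontier_records(results)
gain = round(records[-1][2] - records[0][2], 1)  # frontier improvement in points
print(f"{len(records)} frontier records, +{gain} points")
```

With real data, the length of `records` would correspond to the count of frontier records shown on the chart.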
Pink dots = frontier records · 8 total

Where models cluster

Score distribution (models per score bucket, 0 to 100 scale):
0–10: 11 models
10–20: 10 models
20–30: 11 models
30–100: no models
Median: 13.9

Pearson r · original research

32 models tested · sorted by score

Pulled from the OpenCompass · HLE dataset · updated daily

What does OpenCompass · HLE measure?

OpenCompass · HLE is a knowledge benchmark in the BenchGecko catalog. 32 AI models have been tested on it. Scores range from 4.2 to 28.6 out of 100.

Which model leads on OpenCompass · HLE?

DeepSeek V3.2 Speciale from DeepSeek leads OpenCompass · HLE with a score of 28.6. The median score across 32 tested models is 13.9.

Is OpenCompass · HLE saturated?

No · the top score is 28.6 out of 100 (29%). There is still meaningful room for improvement on OpenCompass · HLE.

Does OpenCompass · HLE predict performance on other benchmarks?

Yes · OpenCompass · HLE scores have a Pearson correlation of 0.98 with Artificial Analysis · Coding Index across 11 shared models. Models that score well on OpenCompass · HLE tend to score well on Artificial Analysis · Coding Index, although 11 models is a small sample.
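A correlation like this is computed only over the models that appear on both benchmarks. The sketch below shows one way to do that: intersect the two score tables by model name, then apply the standard Pearson formula. The helper `pearson_r`, the model names, and all scores are hypothetical illustrations, not BenchGecko data or code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical score tables keyed by model name.
hle = {"a": 5.0, "b": 10.0, "c": 20.0, "d": 28.0}
coding = {"b": 30.0, "c": 48.0, "d": 66.0, "e": 40.0}

# Only models present in both tables contribute to the correlation.
shared = sorted(hle.keys() & coding.keys())
r = pearson_r([hle[m] for m in shared], [coding[m] for m in shared])
print(f"{len(shared)} shared models, r = {r:.2f}")
```

Because the result depends entirely on which models are shared, the shared-model count is worth reporting alongside the coefficient, as the page does.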

How often is OpenCompass · HLE data refreshed?

BenchGecko pulls updates daily, so new model scores on OpenCompass · HLE appear within a day of being published by Epoch AI or the model provider.
