OpenCompass · IFEval

Updated 2026-02-16
Models tested: 32
Top score: 93.9 (Kimi K2.5)
Median: 89.2 (min 60.3)
Top-5 spread: σ 0.8 (settled)
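
As a rough illustration of how summary figures like these could be derived from a list of scores, here is a minimal sketch. It assumes the top-5 spread is a population standard deviation over the five highest scores; the page does not say which estimator it uses. The function name `summarize` and the sample scores are hypothetical.

```python
import statistics

def summarize(scores, top_n=5):
    """Summarize benchmark scores: count, top, median, min, and top-N spread."""
    ranked = sorted(scores, reverse=True)
    top = ranked[:top_n]
    return {
        "models": len(ranked),
        "top": ranked[0],
        "median": statistics.median(ranked),
        "min": ranked[-1],
        # Assumption: spread = population standard deviation of the top N scores.
        "top_spread": statistics.pstdev(top),
    }

# Hypothetical scores for illustration only; not the leaderboard data.
sample_scores = [90.0, 88.0, 92.0, 60.0, 91.0, 89.0, 93.0]
summary = summarize(sample_scores)
print(summary)
```

A small spread among the top five means the leaders are nearly tied, which is presumably what the "settled" label signals.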

Best score over time · one chart, every benchmark

[Chart: OpenCompass · IFEval · 32 models · frontier running max. Y-axis: score (0 to 100). X-axis: release date (Mar 25 to Feb 26). Source: benchgecko.ai/benchmark/oc-ifeval]
Frontier on OpenCompass · IFEval rose from 81.0 to 93.9 in 11 months · +12.9 points · latest leader Kimi K2.5 from moonshotai.
Pink dots = frontier records · 7 total
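
The "frontier running max" in the chart can be sketched as follows: walk through models in release order and record each one that beats every earlier score. This is a sketch under stated assumptions, not the site's actual method. The function name `frontier_running_max` and the sample entries are hypothetical; only the endpoints 81.0 and 93.9 echo the figures quoted above.

```python
from datetime import date

def frontier_running_max(releases):
    """Return the (date, model, score) entries that set a new best score.

    `releases` is an iterable of (release_date, model_name, score) tuples.
    """
    records = []
    best = float("-inf")
    for released, model, score in sorted(releases, key=lambda r: r[0]):
        if score > best:  # strictly better than everything released earlier
            best = score
            records.append((released, model, score))
    return records

# Hypothetical releases for illustration only; not actual leaderboard entries.
sample = [
    (date(2025, 3, 1), "model-a", 81.0),
    (date(2025, 5, 1), "model-b", 79.5),
    (date(2025, 8, 1), "model-c", 88.0),
    (date(2025, 11, 1), "model-d", 87.0),
    (date(2026, 2, 1), "model-e", 93.9),
]
records = frontier_running_max(sample)
gain = round(records[-1][2] - records[0][2], 1)  # frontier gain in points
print([r[1] for r in records], gain)
```

In this sample only three of the five releases are frontier records, because two scored below an earlier model.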

Where models cluster

Score distribution (models per score bucket, 0 to 100) · median 89.2
0–60: 0
60–70: 1
70–80: 0
80–90: 19
90–100: 12
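
A distribution like this can be produced by counting scores in 10-point buckets. The sketch below assumes each bucket includes its lower edge and that a perfect 100 falls in the last bucket; the page does not state its edge rules. The function name `bucket_counts` and the sample scores are hypothetical.

```python
from collections import Counter

def bucket_counts(scores, width=10, top=100):
    """Count scores per bucket of `width` points on a 0..top scale."""
    n_buckets = top // width
    counts = Counter()
    for s in scores:
        if not 0 <= s <= top:
            raise ValueError(f"score out of range: {s}")
        # Assumption: lower edge inclusive; a score equal to `top` joins the last bucket.
        idx = min(int(s // width), n_buckets - 1)
        counts[idx] += 1
    return {f"{i * width}-{(i + 1) * width}": counts.get(i, 0) for i in range(n_buckets)}

# Hypothetical scores for illustration only.
hist = bucket_counts([60.3, 85.0, 89.2, 93.9, 100.0])
print(hist)
```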

Pearson r · original research

Correlation analysis

Benchmarks that track with OpenCompass · IFEval

Pearson correlation across models scored on both benchmarks. Closer to 1 = strongly predictive.
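
To make the description above concrete, here is a minimal sketch of a Pearson correlation restricted to models scored on both benchmarks. It assumes each benchmark is a mapping from model name to score; the function name `pearson_on_shared` and the sample data are hypothetical, and the site's own pipeline may differ.

```python
import math

def pearson_on_shared(scores_a, scores_b):
    """Pearson r over models present in both score mappings.

    Returns (r, n), where n is the number of shared models.
    """
    shared = sorted(set(scores_a) & set(scores_b))
    n = len(shared)
    if n < 2:
        raise ValueError("need at least two shared models")
    xs = [scores_a[m] for m in shared]
    ys = [scores_b[m] for m in shared]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one benchmark has constant scores")
    return cov / math.sqrt(var_x * var_y), n

# Hypothetical scores for illustration only; m4 and m5 are not shared, so they are ignored.
bench_a = {"m1": 60.0, "m2": 80.0, "m3": 90.0, "m4": 95.0}
bench_b = {"m1": 30.0, "m2": 40.0, "m3": 45.0, "m5": 99.0}
r, n = pearson_on_shared(bench_a, bench_b)
print(round(r, 3), n)
```

Returning the shared-model count alongside r matters because a correlation computed on a handful of models is a much weaker signal than one computed on dozens.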

32 models tested · sorted by score

Pulled from the OpenCompass · IFEval dataset · updated daily

What does OpenCompass · IFEval measure?

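OpenCompass · IFEval is listed as a knowledge benchmark in the BenchGecko catalog. IFEval (Instruction-Following Eval) tests whether a model follows explicit, verifiable instructions in a prompt. 32 AI models have been tested on it, with scores ranging from 60.3 to 93.9 out of 100.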

Which model leads on OpenCompass · IFEval?

Kimi K2.5 from moonshotai leads OpenCompass · IFEval with a score of 93.9. The median score across 32 tested models is 89.2.

Is OpenCompass · IFEval saturated?

No · the top score is 93.9 out of 100, leaving 6.1 points of headroom on OpenCompass · IFEval. Scores are tightly packed near the top, though: the median is 89.2 and the top-5 spread is σ 0.8.

Does OpenCompass · IFEval predict performance on other benchmarks?

Yes · OpenCompass · IFEval scores have a Pearson correlation of 0.90 with LiveBench · Overall across the 10 models scored on both benchmarks. Models that do well on OpenCompass · IFEval tend to do well on LiveBench · Overall.

How often is OpenCompass · IFEval data refreshed?

BenchGecko pulls updates daily. New model scores on OpenCompass · IFEval appear as soon as they are published by Epoch AI or the model provider.
