Benchmark · Knowledge · Competitive

JNLI

Updated 2025-01-20
Models tested: 11
Top score: 82.4 (DeepSeek R1 Distill Qwen 14B)
Median: 60.9 (min 35.6)
Top-5 spread: σ 7.9 (wide open)

Best score over time · one chart, every benchmark

[Chart: JNLI · 7 models · frontier running max. Y-axis: score, 0 to 100, higher is better. X-axis: release date, Jun 24 to Jan 25. Source: benchgecko.ai/benchmark/jp-jnli]
Frontier on JNLI rose from 81.3 to 82.4 in 8 months · +1.1 points · latest leader DeepSeek R1 Distill Qwen 14B from DeepSeek.
Pink dots = frontier records · 2 total
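
The "frontier running max" in the chart can be read as: walk the models in release order and record each score that beats every earlier one. The sketch below is a minimal illustration of that idea, not the site's actual implementation. The function name `frontier_records`, the model names, the dates, and the non-record scores are all hypothetical placeholders; only the two frontier values (81.3 and 82.4) come from the page.

```python
from datetime import date

def frontier_records(results):
    """Return the (release_date, model, score) rows that set a new best score.

    `results` is an iterable of (release_date, model, score) tuples.
    Rows are processed in release order; a row is a frontier record
    only if its score is strictly higher than every earlier score.
    """
    records = []
    best = float("-inf")
    for released, model, score in sorted(results, key=lambda row: row[0]):
        if score > best:
            best = score
            records.append((released, model, score))
    return records

# Hypothetical rows for illustration. Only 81.3 and 82.4 are taken from
# the page; names, dates, and the other scores are made-up placeholders.
results = [
    (date(2024, 6, 1), "placeholder-model-a", 81.3),
    (date(2024, 9, 1), "placeholder-model-b", 70.0),
    (date(2024, 11, 1), "placeholder-model-c", 55.0),
    (date(2025, 1, 20), "placeholder-model-d", 82.4),
]

records = frontier_records(results)
# Frontier gain: latest record minus first record, rounded to one decimal.
gain = round(records[-1][2] - records[0][2], 1)
```

With these placeholder rows, `records` holds two entries and `gain` is the difference between the last and first frontier scores, matching the "2 total" and "+1.1 points" figures reported above.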

Same category · related evaluations