Benchmark · Knowledge · Competitive

MUSR

Updated: 2025-07-24
Models tested: 73
Top score: 28.7 (DeepSeek R1 Distill Qwen 14B)
Median: 9.7 (min 0.5)
Top-5 spread: σ 3.7 (Competitive)

Best score over time · one chart, every benchmark

[Chart: MUSR · 44 models · frontier running max. Y axis: score (0 to 100, higher is better). X axis: release date, May 24 to Jul 25. Source: benchgecko.ai/benchmark/hf-musr]
Frontier on MUSR rose from 23.4 to 28.7 in 5 months · +5.3 points · latest leader DeepSeek R1 Distill Qwen 14B from DeepSeek.
Pink dots = frontier records · 2 total
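The "frontier running max" in the chart can be read as follows: walk the models in release order and keep only those that beat every earlier score. The sketch below illustrates that idea. The function name `frontier_records` is hypothetical, and the release order of the three sample entries is an assumption made for illustration, since this page does not list a release date per model. Only the scores and the two frontier values (23.4 and 28.7) come from the page.

```python
def frontier_records(releases):
    """Return the (model, score) entries that set a new best score.

    `releases` must already be ordered by release date (oldest first).
    """
    records = []
    best = float("-inf")
    for model, score in releases:
        if score > best:  # strictly better than everything released before
            best = score
            records.append((model, score))
    return records


# ASSUMPTION: this ordering is illustrative only; the page gives no per-model dates.
releases = [
    ("Hermes 3 70B Instruct", 23.4),
    ("Llama 3.3 70B Instruct", 15.6),
    ("DeepSeek R1 Distill Qwen 14B", 28.7),
]

records = frontier_records(releases)
# Gain between the first and the latest frontier record, rounded to one decimal.
gain = round(records[-1][1] - records[0][1], 1)
print(records, gain)
```

Under that assumed ordering the sketch yields two frontier records and a gain of 5.3 points, matching the caption. A model that scores below the current best, like the middle entry, never appears as a record.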

73 models tested · sorted by score

| # | Model | Organization | Score |
|---|-------|--------------|-------|
| 1 | DeepSeek R1 Distill Qwen 14B | DeepSeek | 28.7 |
| 2 | Hermes 3 70B Instruct | nousresearch | 23.4 |
| 3 | Llama 3 8B Instruct | Meta | 19.9 |
| 4 | Qwen2-72B | Alibaba Qwen | 19.7 |
| 5 | Stable Beluga 2 | | 18.6 |
| 6 | Llama 3.1 70B Instruct | Meta | 17.7 |
| 7 | Dolphin 2.9.1 Yi 1.5 34b | | 17.0 |
| 8 | R1 Distill Qwen 32B | DeepSeek | 16.1 |
| 9 | Meta Llama 3 8B | Meta | 16.0 |
| 10 | Llama 3.3 70B Instruct | Meta | 15.6 |
| 11 | Gpt2 | OpenAI | 15.3 |
| 12 | WizardLM-2 8x22B | Microsoft | 14.5 |
| 13 | GLM 4 32B | z-ai | 14.2 |
| 14 | Phi 2 | Microsoft | 13.8 |
| 15 | Qwen2.5 Coder 32B Instruct | Alibaba Qwen | 13.7 |
| 16 | Qwen2 VL 7B Instruct | Alibaba | 13.6 |
| 17 | Qwen2.5 32B Instruct | Alibaba | 13.5 |
| 18 | Magnum v4 72B | anthracite-org | 13.4 |
| 19 | R1 Distill Llama 70B | DeepSeek | 13.3 |
| 20 | Phi 3 Mini 4k Instruct | Microsoft | 13.1 |
| 21 | Qwen2.5 72B Instruct Abliterated | | 12.3 |
| 22 | Qwen2 1.5B Instruct | Alibaba | 12.0 |
| 23 | Qwen2.5 72B Instruct | Alibaba Qwen | 11.7 |
| 24 | Vicuna 7b V1.5 | | 11.4 |
| 25 | Gemma 2 2b | Google DeepMind | 11.3 |
| 26 | Hermes 2 Pro - Llama-3 8B | nousresearch | 11.3 |
| 27 | Distilgpt2 | | 11.2 |
| 28 | QwQ 32B | Alibaba Qwen | 11.1 |
| 29 | Gemma 2B | Google DeepMind | 11.0 |
| 30 | Mistral 7B V0.1 | Mistral AI | 10.7 |
| 31 | Pythia 160m | eleutherai | 10.7 |
| 32 | Qwen2.5 14B Instruct | Alibaba | 10.6 |
| 33 | Phi 4 | Microsoft | 10.1 |
| 34 | Phi 3.5 Mini Instruct | Microsoft | 10.1 |
| 35 | SmolLM2 135M | | 10.0 |
| 36 | Gemma 2 9B | Google DeepMind | 9.7 |
| 37 | Yi 6B | | 9.7 |
| 38 | Qwen2.5 Coder 7B Instruct | Alibaba Qwen | 9.5 |
| 39 | Gemma 2 27B | Google DeepMind | 9.1 |
| 40 | Llama 3.1 8B Instruct | Meta | 8.5 |
| 41 | Qwen2.5 7B Instruct | Alibaba Qwen | 8.4 |
| 42 | Mistral 7B Instruct V0.2 | Mistral AI | 7.6 |
| 43 | Qwen2.5 3B Instruct | Alibaba | 7.6 |
| 44 | Falcon-180B | TII | 7.5 |
| 45 | Qwen2 7B Instruct | Alibaba | 7.4 |
| 46 | Gemma 2 2b It | Google DeepMind | 7.1 |
| 47 | Qwen2.5 Coder 14B Instruct | Alibaba | 7.0 |
| 48 | Phi 4 Mini Instruct | Microsoft | 6.5 |
| 49 | Gpt2 Medium | OpenAI | 6.2 |
| 50 | Mistral 7B Instruct v0.1 | Mistral AI | 6.1 |
| 51 | Gpt2 Large | OpenAI | 5.7 |
| 52 | Qwen2 0.5B | Alibaba | 4.6 |
| 53 | TinyLlama 1.1B Chat V1.0 | | 4.3 |
| 54 | INTELLECT-1 | | 4.1 |
| 55 | Llama 3.2 3B Instruct (free) | Meta | 3.8 |
| 56 | Llama 2 7b Hf | Meta | 3.8 |
| 57 | SmolLM2 135M Instruct | | 3.7 |
| 58 | DeepSeek R1 Distill Qwen 7B | DeepSeek | 3.5 |
| 59 | Phi-1.5 | Microsoft | 3.4 |
| 60 | Llama 2 7b Chat Hf | Meta | 3.3 |
| 61 | Qwen2.5 1.5B Instruct | Alibaba | 3.2 |
| 62 | DeepSeek R1 Distill Qwen 1.5B | DeepSeek | 3.0 |
| 63 | StarCoder 2 15B | | 2.9 |
| 64 | MPT-30B | | 2.9 |
| 65 | Gpt Neo 125m | eleutherai | 2.6 |
| 66 | Qwen2 0.5B Instruct | Alibaba | 2.4 |
| 67 | Llama 3.1 405B | Meta | 2.2 |
| 68 | LLaMA-13B | Meta | 2.0 |
| 69 | Llama 3.2 1B Instruct | Meta | 1.9 |
| 70 | Meta Llama 3 8B Instruct | Meta | 1.6 |
| 71 | Llama 3.2 3B Instruct | Meta | 1.4 |
| 72 | Qwen2.5 0.5B Instruct | Alibaba | 1.4 |
| 73 | DeepSeek R1 Distill Llama 8B | DeepSeek | 0.5 |
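
The summary figures at the top of this page (models tested, top score, median, min, top-5 spread) can be checked against the 73 scores listed above. The sketch below does that with Python's standard `statistics` module. One assumption is labeled in the code: the page does not say how "σ" is computed, and the population standard deviation of the five highest scores is used here because it rounds to the published 3.7 (the sample standard deviation would give about 4.1).

```python
from statistics import median, pstdev

# The 73 scores from the table above, in rank order.
scores = [
    28.7, 23.4, 19.9, 19.7, 18.6, 17.7, 17.0, 16.1, 16.0, 15.6,
    15.3, 14.5, 14.2, 13.8, 13.7, 13.6, 13.5, 13.4, 13.3, 13.1,
    12.3, 12.0, 11.7, 11.4, 11.3, 11.3, 11.2, 11.1, 11.0, 10.7,
    10.7, 10.6, 10.1, 10.1, 10.0, 9.7, 9.7, 9.5, 9.1, 8.5,
    8.4, 7.6, 7.6, 7.5, 7.4, 7.1, 7.0, 6.5, 6.2, 6.1,
    5.7, 4.6, 4.3, 4.1, 3.8, 3.8, 3.7, 3.5, 3.4, 3.3,
    3.2, 3.0, 2.9, 2.9, 2.6, 2.4, 2.2, 2.0, 1.9, 1.6,
    1.4, 1.4, 0.5,
]

top5 = sorted(scores, reverse=True)[:5]

summary = {
    "models_tested": len(scores),
    "top_score": max(scores),
    "median": median(scores),
    "min": min(scores),
    # ASSUMPTION: "σ" is taken to be the population standard deviation of the top 5.
    "top5_spread": round(pstdev(top5), 1),
}
print(summary)
```

With an odd number of entries (73), the median is simply the 37th score in rank order, which is why it coincides with a value that appears in the table.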

Same category · related evaluations