OTIS Mock AIME 2024-2025

The OTIS Mock AIME 2024–2025 benchmark consists of simulated American Invitational Mathematics Examination (AIME) problems that test advanced problem-solving skills.

Models Tested: 105
Top Score: 96.1
Average Score: 42.9

Rank  Organization     Score
1     OpenAI           96.1
2     OpenAI           96.1
3     Google DeepMind  95.6
4     OpenAI           95.3
5     Anthropic        94.4
6     Google DeepMind  92.8
7     moonshotai       92.2
8     OpenAI           91.4
9     OpenAI           91.4
10    Google           91.4
11    OpenAI           88.9
12    OpenAI           88.9
13    OpenAI           88.9
14    OpenAI           88.9
15    OpenAI           88.6
16    OpenAI           88.6
17    DeepSeek         87.8
18    OpenAI           86.7
19    Alibaba Qwen     86.7
20    Anthropic        86.1
21    Anthropic        85.8
22    Google DeepMind  84.7
23    xAI              84.0
24    OpenAI           83.9
25    z-ai             83.3
26    moonshotai       83.0
27    OpenAI           81.7
28    OpenAI           81.1
29    z-ai             80.0
30    z-ai             80.0
31    Anthropic        77.8
32    xAI              77.8
33    xAI              77.8
34    OpenAI           76.9
35    Alibaba Qwen     73.3
36    OpenAI           73.3
37    Google DeepMind  73.0
38    Anthropic        71.1
39    Anthropic        68.9
40    Anthropic        66.6
41    DeepSeek         66.4
42    Anthropic        64.4
43    Anthropic        57.7
44    Anthropic        57.7
45    Google           57.7
46    xAI              55.5
47    xAI              55.5
48    DeepSeek         53.3
49    OpenAI           46.9
50    OpenAI           44.7
51    OpenAI           38.3
52    OpenAI           37.7
53    Mistral AI       32.1
54    OpenAI           31.0
55    OpenAI           28.8
56    Google           23.0
57    Meta             20.5
58    Google DeepMind  19.6
59    Google DeepMind  19.6
60    Google DeepMind  19.6
61    Google DeepMind  19.6
62    Google DeepMind  19.6
63    Google DeepMind  19.6
64    Google           16.2
65    Alibaba          16.0
66    DeepSeek         15.8
67    Microsoft        13.7
68    xAI              11.4
69    Meta             9.6
70    Mistral AI       8.4
71    Alibaba Qwen     8.0
72    Meta             7.7
73    Mistral AI       7.7
74    OpenAI           6.8
75    OpenAI           6.8
76    Google           6.7
77    OpenAI           6.6
78    Anthropic        6.4
79    OpenAI           6.3
80    OpenAI           6.3
81    OpenAI           6.3
82    OpenAI           6.3
83    OpenAI           6.2
84    Meta             5.0
85    Meta             5.0
86    Anthropic        4.6
87    Anthropic        4.2
88    Meta             4.2
89    Google           3.8
90    Meta             3.5
91    Meta             2.5
92    Meta             2.4
93    Anthropic        2.4
94    Anthropic        2.4
95    Mistral AI       1.9
96    Anthropic        1.9
97    Anthropic        1.7
98    Google DeepMind  1.3
99    OpenAI           1.0
100   OpenAI           1.0
101   OpenAI           1.0
102   OpenAI           1.0
103   Google           1.0
104   Meta             0.7
105   Google DeepMind  0.5