| 1 | GPT-5 Chat🇺🇸 OpenAI | 81.9 | 88.0 | - | - | - | - | $1.25 | 128K |
| 2 | Claude Mythos Preview🇺🇸 Anthropic | 81.8 | - | - | 93.9 | - | 82.0 | N/A | 1.0M |
| 3 | Gemini 2.5 Pro Preview 05-06🇺🇸 Google DeepMind | 76.9 | 76.9 | - | - | - | - | $1.25 | 1.0M |
| 4 | o4 Mini High🇺🇸 OpenAI | 72.0 | 72.0 | - | - | - | - | $1.10 | 200K |
| 5 | Grok 3 Beta🇺🇸 xAI | 69.5 | 53.3 | - | - | - | - | $3.00 | 131K |
| 6 | gpt-oss-120b (free)🇺🇸 OpenAI (Open) | 68.7 | 41.8 | - | - | - | - | Free | 131K |
| 7 | Grok 3 Mini Beta🇺🇸 xAI | 64.8 | 49.3 | - | - | - | - | $0.30 | 131K |
| 8 | o3 Pro🇺🇸 OpenAI | 61.2 | 84.9 | - | - | - | - | $20.00 | 200K |
| 9 | Gemini 3.1 Pro Preview🇺🇸 Google DeepMind | 60.6 | - | - | 75.6 | - | 78.4 | $2.00 | 1.0M |
| 10 | Gemini 3 Pro🇺🇸 Google DeepMind | 60.5 | - | 18.6 | 72.9 | - | 69.4 | N/A | N/A |
| 11 | o3 Mini High🇺🇸 OpenAI | 60.4 | 60.4 | - | - | - | - | $1.10 | 200K |
| 12 | DeepSeek V3🇨🇳 DeepSeek (Open) | 59.0 | 48.4 | - | - | - | - | $0.32 | 164K |
| 13 | GPT-5.4🇺🇸 OpenAI | 59.0 | - | - | 76.9 | - | - | $2.50 | 1.1M |
| 14 | Qwen3 32B🇨🇳 Alibaba Qwen (Open) | 58.2 | 40.0 | - | - | - | - | $0.08 | 41K |
| 15 | R1 0528🇨🇳 DeepSeek (Open) | 57.9 | 71.4 | - | - | - | - | $0.50 | 164K |
| 16 | GLM 5🇨🇳 z-ai (Open) | 57.6 | - | - | 72.1 | - | 52.4 | $0.72 | 80K |
| 17 | Claude Opus 4.6🇺🇸 Anthropic | 57.5 | - | 33.3 | 78.7 | - | 74.7 | $5.00 | 1.0M |
| 18 | o1🇺🇸 OpenAI | 56.4 | 61.7 | - | - | - | - | $15.00 | 200K |
| 19 | Qwen3 235B A22B🇨🇳 Alibaba Qwen (Open) | 56.4 | 59.6 | - | - | - | - | $0.46 | 131K |
| 20 | Gemini 2.5 Pro🇺🇸 Google DeepMind | 56.2 | 83.1 | 3.9 | 57.6 | - | 32.6 | $1.25 | 1.0M |
| 21 | Kimi K2 0711🇨🇳 moonshotai (Open) | 56.2 | 59.1 | 4.9 | - | - | 27.8 | $0.57 | 131K |
| 22 | GPT-5 Mini🇺🇸 OpenAI | 56.0 | - | - | 64.7 | 59.8 | 34.8 | $0.25 | 400K |
| 23 | o3🇺🇸 OpenAI | 55.2 | 81.3 | 8.8 | 62.3 | 58.4 | - | $2.00 | 200K |
| 24 | DeepSeek V3 0324🇨🇳 DeepSeek (Open) | 55.1 | 55.1 | - | - | - | - | $0.20 | 164K |
| 25 | MiniMax M2.5🇨🇳 minimax (Open) | 55.1 | - | - | - | - | 42.2 | $0.12 | 197K |
| 26 | Grok 4🇺🇸 xAI | 54.8 | 79.6 | - | - | - | 27.2 | $3.00 | 256K |
| 27 | GPT-5🇺🇸 OpenAI | 54.4 | 88.0 | 6.9 | 73.5 | 65.0 | 49.6 | $1.25 | 400K |
| 28 | GPT-5.2🇺🇸 OpenAI | 54.0 | - | 27.4 | 73.8 | 71.8 | 64.9 | $1.75 | 400K |
| 29 | Gemini 2.0 Pro🇺🇸 Google DeepMind | 53.7 | 35.6 | - | - | - | - | N/A | N/A |
| 30 | Kimi K2 Thinking🇨🇳 moonshotai (Open) | 53.3 | - | - | - | 63.4 | 35.7 | $0.60 | 262K |
| 31 | DeepSeek V3.2 Exp🇨🇳 DeepSeek (Open) | 53.2 | 74.2 | - | - | - | - | $0.27 | 164K |
| 32 | o4 Mini🇺🇸 OpenAI | 53.2 | 72.0 | 3.6 | - | 45.0 | - | $1.10 | 200K |
| 33 | Qwen2.5 Coder 32B Instruct🇨🇳 Alibaba Qwen (Open) | 53.1 | 16.4 | - | - | - | - | $0.66 | 33K |
| 34 | DeepSeek V3.2🇨🇳 DeepSeek (Open) | 53.0 | 74.2 | - | - | - | 39.6 | $0.26 | 164K |
| 35 | GPT-5.3-Codex🇺🇸 OpenAI | 52.2 | - | - | 74.8 | - | 77.3 | $1.75 | 400K |
| 36 | Kimi K2.5🇨🇳 moonshotai (Open) | 52.0 | - | - | 73.8 | - | 43.2 | $0.38 | 262K |
| 37 | Gemini 2.5 Pro Preview 06-05🇺🇸 Google DeepMind | 50.9 | 83.1 | - | - | - | - | $1.25 | 1.0M |
| 38 | GLM 4.6🇨🇳 z-ai (Open) | 50.8 | - | - | - | - | 24.5 | $0.39 | 205K |
| 39 | GLM 4.7🇨🇳 z-ai (Open) | 50.5 | - | - | - | - | 33.4 | $0.39 | 203K |
| 40 | Grok 4 Fast🇺🇸 xAI | 50.4 | - | - | - | - | - | $0.20 | 2.0M |
| 41 | GPT-5.1🇺🇸 OpenAI | 49.6 | - | 13.7 | 68.0 | 66.0 | 47.6 | $1.25 | 400K |
| 42 | Gemini 3 Flash Preview🇺🇸 Google DeepMind | 49.1 | - | 9.8 | 75.4 | - | 64.3 | $0.50 | 1.0M |
| 43 | Qwen3 235B A22B Instruct 2507🇨🇳 Alibaba Qwen (Open) | 48.5 | 59.6 | - | - | - | - | $0.07 | 262K |
| 44 | Gemini 2.0 Flash🇺🇸 Google DeepMind | 48.0 | 38.2 | - | - | - | - | $0.10 | 1.0M |
| 45 | Claude 3.7 Sonnet🇺🇸 Anthropic | 47.7 | 64.9 | 3.8 | 61.0 | 52.8 | - | $3.00 | 200K |
| 46 | Claude Sonnet 4.6🇺🇸 Anthropic | 47.6 | - | - | 75.2 | - | - | $3.00 | 1.0M |
| 47 | gpt-oss-120b🇺🇸 OpenAI (Open) | 46.9 | 41.8 | - | - | 26.0 | 18.7 | $0.04 | 131K |
| 48 | Grok 3 Mini🇺🇸 xAI | 46.6 | 49.3 | - | - | - | - | $0.30 | 131K |
| 49 | Claude Opus 4.5🇺🇸 Anthropic | 45.4 | - | 26.5 | 76.7 | 74.4 | 63.1 | $5.00 | 200K |
| 50 | GPT-5 Nano🇺🇸 OpenAI | 45.3 | - | - | - | 34.8 | 11.5 | $0.05 | 400K |
| 51 | R1🇨🇳 DeepSeek (Open) | 45.1 | 56.9 | - | - | - | - | $0.70 | 64K |
| 52 | Claude Sonnet 4🇺🇸 Anthropic | 44.6 | 61.3 | 4.9 | - | 64.9 | - | $3.00 | 1.0M |
| 53 | GPT-4.1 Mini🇺🇸 OpenAI | 44.5 | 32.4 | - | - | 23.9 | - | $0.40 | 1.0M |
| 54 | GPT-4.1🇺🇸 OpenAI | 43.3 | 52.4 | - | 48.5 | 39.6 | - | $2.00 | 1.0M |
| 55 | GPT-4o-mini (2024-07-18)🇺🇸 OpenAI | 43.2 | 3.6 | - | - | - | - | $0.15 | 128K |
| 56 | Claude 3.5 Sonnet🇺🇸 Anthropic | 42.3 | 51.6 | 4.6 | - | - | - | N/A | N/A |
| 57 | Gemma 3 27B🇺🇸 Google DeepMind (Open) | 42.2 | 4.9 | - | - | - | - | $0.08 | 131K |
| 58 | Gemma 3 27B (free)🇺🇸 Google DeepMind (Open) | 42.2 | 4.9 | - | - | - | - | Free | 131K |
| 59 | Claude Sonnet 4.5🇺🇸 Anthropic | 42.1 | - | 14.7 | 71.3 | 70.6 | 46.5 | $3.00 | 1.0M |
| 60 | Claude Opus 4🇺🇸 Anthropic | 41.7 | 72.0 | 6.9 | 70.7 | 67.6 | - | $15.00 | 200K |
| 61 | o1-preview🇺🇸 OpenAI | 41.5 | - | - | - | - | - | N/A | N/A |
| 62 | Claude Opus 4.1🇺🇸 Anthropic | 41.3 | - | - | 73.3 | - | 38.0 | $15.00 | 200K |
| 63 | Gemini 1.5 Pro (Feb 2024)🇺🇸 Google DeepMind | 41.3 | - | - | - | - | - | N/A | N/A |
| 64 | Qwen2.5-Max🇨🇳 Alibaba Qwen (Open) | 41.0 | 21.8 | - | - | - | - | N/A | N/A |
| 65 | Gemini 2.5 Flash🇺🇸 Google DeepMind | 40.0 | 47.1 | - | - | - | 17.1 | $0.30 | 1.0M |
| 66 | GPT-4o-mini🇺🇸 OpenAI | 39.6 | 3.6 | - | - | - | - | $0.15 | 128K |
| 67 | Grok 3🇺🇸 xAI | 38.4 | 53.3 | - | - | - | - | $3.00 | 131K |
| 68 | o3 Mini🇺🇸 OpenAI | 38.4 | 60.4 | 1.3 | - | - | - | $1.10 | 200K |
| 69 | Llama 3.1 405B🇺🇸 Meta (Open) | 38.0 | - | - | - | - | - | N/A | N/A |
| 70 | Gemini 2.0 Flash Thinking (Jan 2025)🇺🇸 Google DeepMind | 37.7 | 18.2 | - | - | - | - | N/A | N/A |
| 71 | GPT-4o (2024-11-20)🇺🇸 OpenAI | 37.7 | 23.1 | 0.1 | 31.0 | 21.6 | - | $2.50 | 128K |
| 72 | Claude 3.5 Haiku🇺🇸 Anthropic | 37.2 | 28.0 | - | - | - | - | $0.80 | 200K |
| 73 | Claude Haiku 4.5🇺🇸 Anthropic | 37.1 | - | - | - | - | 35.5 | $1.00 | 200K |
| 74 | GPT-4.5🇺🇸 OpenAI | 35.9 | 44.9 | - | - | - | - | N/A | N/A |
| 75 | GPT-4o (2024-08-06)🇺🇸 OpenAI | 35.6 | 23.1 | - | - | - | - | $2.50 | 128K |
| 76 | GPT-4.1 Nano🇺🇸 OpenAI | 35.2 | 8.9 | - | - | - | - | $0.10 | 1.0M |
| 77 | o1-mini🇺🇸 OpenAI | 34.9 | 32.9 | - | - | - | - | N/A | N/A |
| 78 | Claude 3 Opus🇺🇸 Anthropic | 33.7 | - | - | - | - | - | N/A | N/A |
| 79 | Llama 3 70B Instruct🇺🇸 Meta (Open) | 32.4 | - | - | - | - | - | $0.51 | 8K |
| 80 | Claude 3 Haiku🇺🇸 Anthropic | 28.7 | - | - | - | - | - | $0.25 | 200K |
| 81 | Llama 4 Maverick🇺🇸 Meta (Open) | 28.0 | 15.6 | - | - | 21.0 | - | $0.15 | 1.0M |
| 82 | Mixtral 8x22B Instruct🇫🇷 Mistral AI (Open) | 23.5 | - | - | - | - | - | $2.00 | 66K |
| 83 | Llama 4 Scout🇺🇸 Meta (Open) | 18.9 | - | - | - | 9.1 | - | $0.08 | 328K |
| 84 | QwQ 32B🇨🇳 Alibaba Qwen (Open) | 13.5 | 20.9 | - | - | - | - | $0.15 | 131K |