Kimi K2.5
by Moonshot AI
Best Score: 70.8%
Best Leaderboard: bash-only
Models Used: 1
Open Source: Yes
Score History
| Entry | Score |
|---|---|
| Kimi K2.5 (high reasoning) | 70.8% |
| mini-SWE-agent + Kimi K2.5 (high reasoning) | 70.8% |
| Kimi K2.5 | 67.3% |
| Kimi K2 Thinking | 63.4% |
| Kimi K2 Instruct | 43.8% |