Agentic Tasks
Autonomous task execution, tool use, and multi-step planning: the frontier of AI agents working independently.
Models Ranked: 33
Top Score: 62.9
Average Score: 20.5
Benchmarks: 3
Benchmarks in This Skill
Rankings
| # | Model | Avg Score |
|---|---|---|
| 1 | | 62.9 |
| 2 | | 42.9 |
| 3 | | 42.3 |
| 4 | Kimi K2.5 (moonshotai) | 38.9 |
| 5 | | 38.5 |
| 6 | | 35.9 |
| 7 | | 34.3 |
| 8 | | 34.3 |
| 9 | | 33.5 |
| 10 | | 33.3 |
| 11 | | 33.3 |
| 12 | | 31.7 |
| 13 | | 24.0 |
| 14 | | 23.0 |
| 15 | | 18.4 |
| 16 | | 18.3 |
| 17 | | 18.3 |
| 18 | | 17.5 |
| 19 | | 17.5 |
| 20 | | 15.2 |
| 21 | | 8.6 |
| 22 | | 7.4 |
| 23 | | 6.9 |
| 24 | | 5.3 |
| 25 | | 4.7 |
| 26 | | 4.7 |
| 27 | | 4.7 |
| 28 | | 4.7 |
| 29 | Kimi K2 Thinking (moonshotai) | 4.0 |
| 30 | | 3.4 |
| 31 | GLM 4.7 (z-ai) | 3.1 |
| 32 | GLM 4.6 (z-ai) | 3.0 |
| 33 | | 1.1 |
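
The summary figures at the top of the page (models ranked, top score, average score) can be cross-checked against the Rankings table. The sketch below assumes the page's Average Score is the unweighted mean of the per-model Avg Score values, rounded to one decimal place. That is an assumption about how the page computes it, not something the page states. The variable names are illustrative.

```python
from statistics import mean

# Avg Score column copied from the Rankings table, in rank order (1 to 33).
scores = [
    62.9, 42.9, 42.3, 38.9, 38.5, 35.9, 34.3, 34.3, 33.5, 33.3, 33.3,
    31.7, 24.0, 23.0, 18.4, 18.3, 18.3, 17.5, 17.5, 15.2, 8.6, 7.4,
    6.9, 5.3, 4.7, 4.7, 4.7, 4.7, 4.0, 3.4, 3.1, 3.0, 1.1,
]

models_ranked = len(scores)             # number of rows in the table
top_score = max(scores)                 # score of the rank 1 entry
average_score = round(mean(scores), 1)  # assumed: unweighted mean, one decimal

print(f"Models Ranked: {models_ranked}")
print(f"Top Score: {top_score}")
print(f"Average Score: {average_score}")
```

Under that assumption the three values come out as 33, 62.9, and 20.5, which agrees with the figures shown above the table.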