SWE-agent

by SWE-agent

Best Score: 66.6%
Best Leaderboard: verified
Models Used: 8
Open Source: Yes

Entry | Score
mini-SWE-agent + Gemini 3 Pro | 69.6%
mini-SWE-agent + GPT-5-2 Codex | 72.8%
mini-SWE-agent + Claude 4.5 Opus (high reasoning) | 76.8%
mini-SWE-agent + Gemini 3 Flash (high reasoning) | 75.8%
mini-SWE-agent + MiniMax M2.5 (high reasoning) | 75.8%
mini-SWE-agent + Claude Opus 4.6 | 75.6%
mini-SWE-agent + GLM-5 (high reasoning) | 72.8%
mini-SWE-agent + GPT-5-2 (high reasoning) | 72.8%
mini-SWE-agent + Claude 4.5 Sonnet (high reasoning) | 71.4%
mini-SWE-agent + Kimi K2.5 (high reasoning) | 70.8%
mini-SWE-agent + DeepSeek V3.2 (high reasoning) | 70.0%
mini-SWE-agent + Claude 4.5 Haiku (high reasoning) | 66.6%
mini-SWE-agent + GPT-5 Mini | 56.2%
live-SWE-agent + Claude 4.5 Opus medium (20251101) | 79.2%
mini-SWE-agent + GPT-5.2 (2025-12-11) (high reasoning) | 71.8%
mini-SWE-agent + GPT-5.2 (2025-12-11) | 69.0%
mini-SWE-agent + Kimi K2 Thinking | 63.4%
mini-SWE-agent + Devstral small (2512) | 56.4%
mini-SWE-agent + Devstral (2512) | 53.8%
mini-SWE-agent + DeepSeek V3.2 Reasoner | 60.0%
mini-SWE-agent + GLM-4.6 (T=1) | 55.4%
mini-SWE-agent + Claude 4.5 Opus medium (20251101) | 74.4%
mini-SWE-agent + GPT-5.1-codex (medium reasoning) | 66.0%
mini-SWE-agent + Minimax M2 | 61.0%
live-SWE-agent + Gemini 3 Pro Preview (2025-11-18) | 77.4%
mini-SWE-agent + GPT-5.1 (2025-11-13) (medium reasoning) | 66.0%
mini-SWE-agent + Gemini 3 Pro Preview (2025-11-18) | 74.2%
mini-SWE-agent + Claude 4.5 Sonnet (20250929) | 70.6%
mini-SWE-agent + GLM-4.5 (2025-08-22) | 54.2%
mini-SWE-agent + GPT-5 (2025-08-07) (medium reasoning) | 65.0%
mini-SWE-agent + GPT-5 mini (2025-08-07) (medium reasoning) | 59.8%
mini-SWE-agent + Kimi K2 Instruct | 43.8%
mini-SWE-agent + GPT-5 nano (2025-08-07) (medium reasoning) | 34.8%
mini-SWE-agent + gpt-oss-120b | 26.0%
CodeSweep - SWE-agent - Kimi K2 Instruct | 53.4%
mini-SWE-agent + Qwen2.5-Coder 32B Instruct | 9.0%
mini-SWE-agent + Claude 4 Opus (20250514) | 67.6%
mini-SWE-agent + Qwen3-Coder 480B/A35B Instruct | 55.4%
mini-SWE-agent + Claude 4 Sonnet (20250514) | 64.9%
mini-SWE-agent + o3 (2025-04-16) | 58.4%
mini-SWE-agent + Gemini 2.5 Pro (2025-05-06) | 53.6%
mini-SWE-agent + o4-mini (2025-04-16) | 45.0%
mini-SWE-agent + GPT-4.1 (2025-04-14) | 39.6%
mini-SWE-agent + Gemini 2.5 Flash (2025-04-17) | 28.7%
mini-SWE-agent + Gemini 2.0 flash | 13.5%
SWE-agent + DevStral Small 2507 | 38.0%
mini-SWE-agent + Claude 3.7 Sonnet (20250219) | 52.8%
mini-SWE-agent + GPT-4.1-mini (2025-04-14) | 23.9%
mini-SWE-agent + GPT-4o (2024-11-20) | 21.6%
mini-SWE-agent + Llama 4 Maverick Instruct | 21.0%
mini-SWE-agent + Llama 4 Scout Instruct | 9.1%
SWE-agent + Claude 4 Sonnet | 56.7%
SWE-agent + Claude 4 Sonnet | 66.6%
SWE-agent + SWE-agent-LM-32B | 40.2%
SWE-agent 1.0 (Claude 3.7 Sonnet) | 33.8%
SWE-agent + Claude 3.7 Sonnet | 48.0%
SWE-agent + Claude 3.7 Sonnet w/ Review Heavy | 62.4%
SWE-agent Multimodal + GPT 4o (2024-08-06) | 12.2%
SWE-agent + Claude Sonnet 3.5 | 12.2%
SWE-agent JavaScript + Claude Sonnet 3.5 | 12.0%
SWE-agent + GPT 4o (2024-08-06) | 12.0%
SWE-agent Multimodal + Claude 3.5 Sonnet | 11.4%
SWE-agent JavaScript + GPT 4o (2024-08-06) | 9.3%
Bytedance AutoSE (based on SWE-Agent) + GPT4/GPT4o Mixed (20240828) | 21.7%
SWE-agent + GPT 4o (2024-05-13) | 12.0%
SWE-agent + GPT 4o (2024-05-13) | 23.2%
SWE-agent + GPT 4o (2024-05-13) | 18.3%
SWE-agent + Claude 3.5 Sonnet | 18.1%
SWE-agent + Claude 3.5 Sonnet | 33.6%
SWE-agent + Claude 3.5 Sonnet | 23.0%
SWE-agent + GPT 4 (1106) | 12.5%
SWE-agent + Claude 3 Opus | 10.5%
SWE-agent + GPT 4 (1106) | 22.4%
SWE-agent + Claude 3 Opus | 15.8%
SWE-agent + GPT 4 (1106) | 18.0%
SWE-agent + Claude 3 Opus | 11.7%
RAG + SWE-Llama 13B | 0.7%
RAG + SWE-Llama 7B | 0.7%
RAG + SWE-Llama 7B | 1.4%
RAG + SWE-Llama 13B | 1.2%
RAG + SWE-Llama 7B | 1.3%
RAG + SWE-Llama 13B | 1.0%