Best Open-weight AI Models
A ranked view of models whose weights are publicly available. The page says open-weight rather than open-source because licensing terms vary across providers, and missing license metadata is shown as missing instead of assumed.
Top three: R1 0528, R1, and DeepSeek V3, all from DeepSeek.
Ranked model table
Scores are based on the visible benchmark set and available metadata.
| Rank | Model | Provider | Score | Evidence | Input price | Context |
|---|---|---|---|---|---|---|
| #1 | R1 0528 | DeepSeek | 83.6 | 3 benchmarks · Medium | $0.50/M | 164K |
| #2 | R1 | DeepSeek | 73.9 | 3 benchmarks · Medium | $0.70/M | 64K |
| #3 | DeepSeek V3 | DeepSeek | 71.1 | 5 benchmarks · High | $0.32/M | 164K |
| #4 | Qwen2.5 72B Instruct | Alibaba Qwen | 69.7 | 6 benchmarks · High | $0.36/M | 33K |
| #5 | Phi 4 | Microsoft | 69.4 | 4 benchmarks · High | $0.07/M | 16K |
| #6 | Qwen3 235B A22B | Alibaba Qwen | 66.5 | 3 benchmarks · Medium | $0.46/M | 131K |
| #7 | Qwen2.5 Coder 32B Instruct | Alibaba Qwen | 61.4 | 4 benchmarks · High | $0.66/M | 33K |
| #8 | Llama 3.1 405B | Meta | 59.4 | 5 benchmarks · High | Not listed | Not listed |
| #9 | Qwen2.5 Coder 7B Instruct | Alibaba Qwen | 54.7 | 3 benchmarks · Medium | $0.03/M | 33K |
| #10 | Llama 3.1 70B Instruct | Meta | 54.2 | 4 benchmarks · High | $0.40/M | 131K |
| #11 | Llama 3.3 70B Instruct (free) | Meta | 51.7 | 3 benchmarks · Medium | Not listed | 66K |
| #12 | Mistral Large 2407 | Mistral AI | 47.8 | 3 benchmarks · Medium | $2.00/M | 131K |
Open-weight does not automatically mean free, unrestricted, or commercially usable. Check the listed license and provider terms before deployment.
BenchGecko ranks models from published benchmark scores and model metadata. Scores do not measure every use case, and missing data can affect rankings.
Best AI Models for Coding
Coding models ranked from published coding benchmark scores, listed prices, and model metadata tracked by BenchGecko.
Best AI Models for Reasoning
Reasoning models ranked from public benchmark scores across GPQA Diamond, BBH, ARC-AGI, SimpleBench, and related tests.
Best AI Models for Math
Math models ranked from public benchmark scores across GSM8K, MATH-level tests, AIME-style tasks, and FrontierMath where available.
Questions
Why does this page say open-weight instead of open-source?
Model weights can be publicly available while the license terms still restrict how they are used, so open-weight is the more precise public label unless a license has been verified as genuinely open-source.
Are open-weight models free to run?
The weights may be free to download, but running inference still costs compute. BenchGecko shows hosted-pricing metadata when a provider lists it.
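As a rough illustration of what a listed "$/M" input price means in practice, the sketch below converts a token count and a per-million-token rate into a dollar cost. The function name and the example token count are assumptions for illustration; only the $0.32/M rate comes from the table above.

```python
def input_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Rough cost of sending `tokens` input tokens at a listed $/M rate.

    This is a hypothetical helper, not a BenchGecko API; output-token
    pricing, caching discounts, and provider minimums are ignored.
    """
    return tokens / 1_000_000 * price_per_million_usd

# Example: a 50K-token prompt at the $0.32/M listed for DeepSeek V3
print(round(input_cost_usd(50_000, 0.32), 4))  # 0.016
```

At these rates, input cost only becomes material at high volume: the same 50K-token prompt would cost about $0.10 at Mistral Large 2407's listed $2.00/M.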
How are open-weight models ranked?
A model needs multiple published benchmark scores to be ranked. Scores are normalized across benchmarks so different scales are comparable, then adjusted slightly upward or downward for evidence coverage.
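The two-step idea above can be sketched as follows. This is a minimal illustration, not BenchGecko's published formula: the min-max normalization, the 5% coverage weight, and all function names are assumptions chosen to show the shape of the computation.

```python
def normalize(scores: dict[str, float]) -> dict[str, float]:
    """Min-max normalize one benchmark's raw scores onto a 0-100 scale."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all scores tie
    return {model: 100 * (s - lo) / span for model, s in scores.items()}

def rank(benchmarks: dict[str, dict[str, float]],
         coverage_weight: float = 0.05) -> list[tuple[str, float]]:
    """Rank models by mean normalized score, nudged by benchmark coverage.

    `benchmarks` maps benchmark name -> {model: raw score}. A model scored
    on more of the tracked benchmarks gets a small coverage bonus.
    """
    per_bench = {b: normalize(s) for b, s in benchmarks.items()}
    models = {m for scores in benchmarks.values() for m in scores}
    final = {}
    for m in models:
        vals = [norm[m] for norm in per_bench.values() if m in norm]
        mean = sum(vals) / len(vals)
        coverage = len(vals) / len(benchmarks)  # fraction of benchmarks covered
        final[m] = mean * (1 - coverage_weight) + 100 * coverage * coverage_weight
    return sorted(final.items(), key=lambda kv: kv[1], reverse=True)
```

A model with a strong average but scores on only a few benchmarks is nudged below one with a similar average and broader coverage, which matches the "3 benchmarks · Medium" versus "5 benchmarks · High" evidence labels in the table.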