Open source LLM pricing

Every open-source LLM with its license, cheapest API provider, and a rough self-host cost. Weights are on HuggingFace; API access is available through OpenRouter, DeepInfra, Together, or Fireworks, or you can self-host.

Models: 212
Cheapest API: $0.00
Licenses: Apache · MIT · Custom
What this page is
This page lists every open-source LLM with priced API access. For each model we show the license, the cheapest API price on the market, and a rough self-host cost estimate based on parameter count. Weights are downloadable from HuggingFace. APIs are offered by OpenRouter, DeepInfra, Together, Fireworks, Groq, Cerebras, and others. Self-hosting on reserved GPUs becomes cheaper only at very high volume; for anything under 100M tokens per day, API access usually wins on total cost.
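As a rough illustration of how a self-host estimate can be derived from parameter count, here is a minimal sketch. It is not this page's actual formula, which is not stated. The function name `estimate_self_host` and every default (16-bit weights, 20% memory headroom, 80 GB GPUs, a $2.00 hourly rate) are assumptions chosen for illustration only.

```python
import math


def estimate_self_host(
    params_billion: float,
    bytes_per_param: float = 2.0,  # assumption: 16-bit weights (fp16/bf16)
    overhead: float = 1.2,         # assumption: 20% headroom for KV cache and activations
    gpu_memory_gb: float = 80.0,   # assumption: 80 GB accelerators
    gpu_hourly_usd: float = 2.0,   # assumption: placeholder rental rate, not a quote
) -> dict:
    """Rough memory footprint, GPU count, and daily cost for one model replica."""
    # 1B parameters at 2 bytes each is about 2 GB of weights.
    weights_gb = params_billion * bytes_per_param
    memory_gb = weights_gb * overhead
    # Round up to whole GPUs; even a tiny model occupies at least one.
    gpus = max(1, math.ceil(memory_gb / gpu_memory_gb))
    return {
        "memory_gb": memory_gb,
        "gpus": gpus,
        "usd_per_day": gpus * gpu_hourly_usd * 24,
    }


if __name__ == "__main__":
    for name, size in [("8B dense", 8), ("70B dense", 70)]:
        print(name, estimate_self_host(size))
```

For mixture-of-experts models, the total parameter count is the one to pass in, since all experts generally need to be resident in memory even though only a subset is active per token. Quantized weights can be approximated by lowering `bytes_per_param`.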

Sorted by input price, cheapest first.

# | Model | Cheapest API (per 1M input tokens)
1 | Gemma 3 12B (free) | $0.00/M
2 | Gemma 3 27B (free) | $0.00/M
3 | Gemma 3 4B (free) | $0.00/M
4 | Gemma 3n 2B (free) | $0.00/M
5 | Gemma 3n 4B (free) | $0.00/M
6 | Gemma 4 26B A4B (free) | $0.00/M
7 | Gemma 4 31B (free) | $0.00/M
8 | GLM 4.5 Air (free) | $0.00/M
9 | gpt-oss-120b (free) | $0.00/M
10 | gpt-oss-20b (free) | $0.00/M
11 | Hermes 3 405B Instruct (free) | $0.00/M
12 | LFM2.5-1.2B-Instruct (free) | $0.00/M
13 | LFM2.5-1.2B-Thinking (free) | $0.00/M
14 | Llama 3.2 3B Instruct (free) | $0.00/M
15 | Llama 3.3 70B Instruct (free) | $0.00/M
16 | MiniMax M2.5 (free) | $0.00/M
17 | Mistral Small 3.1 24B (free) | $0.00/M
18 | Nemotron 3 Nano 30B A3B (free) | $0.00/M
19 | Nemotron 3 Super (free) | $0.00/M
20 | Nemotron Nano 12B 2 VL (free) | $0.00/M
21 | Nemotron Nano 9B V2 (free) | $0.00/M
22 | Qwen3 4B (free) | $0.00/M
23 | Qwen3 Coder 480B A35B (free) | $0.00/M
24 | Qwen3 Next 80B A3B Instruct (free) | $0.00/M
25 | Qwen3.6 Plus Preview (free) | $0.00/M
26 | Step 3.5 Flash (free) | $0.00/M
27 | Trinity Large Preview (free) | $0.00/M
28 | Trinity Mini (free) | $0.00/M
29 | Uncensored (free) | $0.00/M
30 | LFM2-2.6B | $0.01/M
31 | LFM2-8B-A1B | $0.01/M
32 | Granite 4.0 Micro | $0.02/M
33 | Gemma 3n 4B | $0.02/M
34 | Llama 3.1 8B Instruct | $0.02/M
35 | Mistral Nemo | $0.02/M
36 | Llama 3.2 1B Instruct | $0.03/M
37 | Gemma 2 9B | $0.03/M
38 | gpt-oss-20b | $0.03/M
39 | LFM2-24B-A2B | $0.03/M
40 | Llama 3 8B Instruct | $0.03/M
41 | Qwen2.5 Coder 7B Instruct | $0.03/M
42 | Qwen-Turbo | $0.03/M
43 | gpt-oss-120b | $0.04/M
44 | Gemma 3 12B | $0.04/M
45 | Gemma 3 4B | $0.04/M
46 | Llama 3 8B Lunaris | $0.04/M
47 | Nemotron Nano 9B V2 | $0.04/M
48 | Qwen2.5 7B Instruct | $0.04/M
49 | Trinity Mini | $0.04/M
50 | Mistral Small 3 | $0.05/M
51 | Nemotron 3 Nano 30B A3B | $0.05/M
52 | Olmo 2 32B Instruct | $0.05/M
53 | Qwen3 8B | $0.05/M
54 | Qwen3.5-9B | $0.05/M
55 | Llama 3.2 3B Instruct | $0.05/M
56 | GLM 4.7 Flash | $0.06/M
57 | MythoMax 13B | $0.06/M
58 | Qwen3 14B | $0.06/M
59 | Phi 4 | $0.07/M
60 | Qwen3.5-Flash | $0.07/M
61 | ERNIE 4.5 21B A3B | $0.07/M
62 | ERNIE 4.5 21B A3B Thinking | $0.07/M
63 | Qwen3 Coder 30B A3B Instruct | $0.07/M
64 | Qwen3 235B A22B Instruct 2507 | $0.07/M
65 | gpt-oss-safeguard-20b | $0.07/M
66 | Mistral Small 3.2 24B | $0.07/M
67 | Gemma 3 27B | $0.08/M
68 | Gemma 4 26B A4B | $0.08/M
69 | Llama 4 Scout | $0.08/M
70 | Qwen3 30B A3B | $0.08/M
71 | Qwen3 30B A3B Thinking 2507 | $0.08/M
72 | Qwen3 32B | $0.08/M
73 | Qwen3 VL 8B Instruct | $0.08/M
74 | MiMo-V2-Flash | $0.09/M
75 | Qwen3 30B A3B Instruct 2507 | $0.09/M
76 | Qwen3 Next 80B A3B Instruct | $0.09/M
77 | Tongyi DeepResearch 30B A3B | $0.09/M
78 | Qwen3 Next 80B A3B Thinking | $0.10/M
79 | Devstral Small 1.1 | $0.10/M
80 | Llama 3.3 70B Instruct | $0.10/M
81 | Llama 3.3 Nemotron Super 49B V1.5 | $0.10/M
82 | Ministral 3 3B 2512 | $0.10/M
83 | Mistral Small Creative | $0.10/M
84 | Nemotron 3 Super | $0.10/M
85 | Reka Edge | $0.10/M
86 | Reka Flash 3 | $0.10/M
87 | Step 3.5 Flash | $0.10/M
88 | UI-TARS 7B | $0.10/M
89 | Voxtral Small 24B 2507 | $0.10/M
90 | Qwen3 VL 32B Instruct | $0.10/M
91 | Mistral 7B Instruct v0.1 | $0.11/M
92 | Qwen3 VL 8B Thinking | $0.12/M
93 | MiniMax M2.5 | $0.12/M
94 | Qwen2.5 72B Instruct | $0.12/M
95 | Gemma 4 31B | $0.13/M
96 | GLM 4.5 Air | $0.13/M
97 | Hermes 4 70B | $0.13/M
98 | Qwen3 VL 30B A3B Instruct | $0.13/M
99 | Qwen3 VL 30B A3B Thinking | $0.13/M
100 | DeepSeek V3.1 Nex N1 | $0.14/M
101 | Qwen VL Plus | $0.14/M
102 | ERNIE 4.5 VL 28B A3B | $0.14/M
103 | Hermes 2 Pro - Llama-3 8B | $0.14/M
104 | Hunyuan A13B Instruct | $0.14/M
105 | Qwen3 235B A22B Thinking 2507 | $0.15/M
106 | DeepSeek V3.1 | $0.15/M
107 | Llama 4 Maverick | $0.15/M
108 | Ministral 3 8B 2512 | $0.15/M
109 | Mistral Small 4 | $0.15/M
110 | Olmo 3 32B Think | $0.15/M
111 | Olmo 3.1 32B Think | $0.15/M
112 | Qwen3 Coder Next | $0.15/M
113 | QwQ 32B | $0.15/M
114 | Rnj 1 Instruct | $0.15/M
115 | Qwen3.5-35B-A3B | $0.16/M
116 | Rocinante 12B | $0.17/M
117 | Llama Guard 4 12B | $0.18/M
118 | Qwen3 Coder Flash | $0.20/M
119 | Qwen3.5-27B | $0.20/M
120 | DeepSeek V3 0324 | $0.20/M
121 | INTELLECT-3 | $0.20/M
122 | LongCat Flash Chat | $0.20/M
123 | MiniMax-01 | $0.20/M
124 | Ministral 3 14B 2512 | $0.20/M
125 | Nemotron Nano 12B 2 VL | $0.20/M
126 | Olmo 3.1 32B Instruct | $0.20/M
127 | Qwen2.5 VL 32B Instruct | $0.20/M
128 | Qwen3 VL 235B A22B Instruct | $0.20/M
129 | Saba | $0.20/M
130 | DeepSeek V3.1 Terminus | $0.21/M
131 | Qwen3 Coder 480B A35B | $0.22/M
132 | Trinity Large Thinking | $0.22/M
133 | Llama 3.2 11B Vision Instruct | $0.24/M
134 | MiniMax M2 | $0.26/M
135 | DeepSeek V3.2 | $0.26/M
136 | Qwen Plus 0728 | $0.26/M
137 | Qwen Plus 0728 (thinking) | $0.26/M
138 | Qwen-Plus | $0.26/M
139 | Qwen3 VL 235B A22B Thinking | $0.26/M
140 | Qwen3.5 Plus 2026-02-15 | $0.26/M
141 | Qwen3.5-122B-A10B | $0.26/M
142 | DeepSeek V3.2 Exp | $0.27/M
143 | ERNIE 4.5 300B A47B | $0.28/M
144 | MiniMax M2.1 | $0.29/M
145 | R1 Distill Qwen 32B | $0.29/M
146 | Codestral 2508 | $0.30/M
147 | Cydonia 24B V4.1 | $0.30/M
148 | DeepSeek R1T2 Chimera | $0.30/M
149 | GLM 4.6V | $0.30/M
150 | Hermes 3 70B Instruct | $0.30/M
151 | MiniMax M2.7 | $0.30/M
152 | DeepSeek V3 | $0.32/M
153 | Qwen3.6 Plus | $0.33/M
154 | Mistral Small 3.1 24B | $0.35/M
155 | Kimi K2.5 | $0.38/M
156 | GLM 4.6 | $0.39/M
157 | GLM 4.7 | $0.39/M
158 | Qwen3.5 397B A17B | $0.39/M
159 | DeepSeek V3.2 Speciale | $0.40/M
160 | Devstral 2 2512 | $0.40/M
161 | Devstral Medium | $0.40/M
162 | Kimi K2 0905 | $0.40/M
163 | Llama 3.1 70B Instruct | $0.40/M
164 | Mistral Medium 3 | $0.40/M
165 | Mistral Medium 3.1 | $0.40/M
166 | UnslopNemo 12B | $0.40/M
167 | ERNIE 4.5 VL 424B A47B | $0.42/M
168 | ReMM SLERP 13B | $0.45/M
169 | Qwen3 235B A22B | $0.46/M
170 | Llama Guard 3 8B | $0.48/M
171 | Mistral Large 3 2512 | $0.50/M
172 | R1 0528 | $0.50/M
173 | Llama 3 70B Instruct | $0.51/M
174 | Qwen VL Max | $0.52/M
175 | Mixtral 8x7B Instruct | $0.54/M
176 | Skyfall 36B V2 | $0.55/M
177 | Kimi K2 0711 | $0.57/M
178 | GLM 4.5 | $0.60/M
179 | GLM 4.5V | $0.60/M
180 | Kimi K2 Thinking | $0.60/M
181 | Llama 3.1 Nemotron Ultra 253B v1 | $0.60/M
182 | WizardLM-2 8x22B | $0.62/M
183 | Gemma 2 27B | $0.65/M
184 | Llama 3.3 Euryale 70B | $0.65/M
185 | Qwen3 Coder Plus | $0.65/M
186 | Qwen2.5 Coder 32B Instruct | $0.66/M
187 | Aion-1.0-Mini | $0.70/M
188 | R1 | $0.70/M
189 | R1 Distill Llama 70B | $0.70/M
190 | GLM 5 | $0.72/M
191 | Qwen3 Max | $0.78/M
192 | Qwen3 Max Thinking | $0.78/M
193 | CodeLLaMa 7B Instruct Solidity | $0.80/M
194 | Llemma 7b | $0.80/M
195 | Qwen2.5 VL 72B Instruct | $0.80/M
196 | Llama 3.1 Euryale 70B v2.2 | $0.85/M
197 | GLM 5.1 | $0.95/M
198 | Hermes 3 405B Instruct | $1.00/M
199 | Hermes 4 405B | $1.00/M
200 | Qwen-Max | $1.04/M
201 | Llama 3.1 Nemotron 70B Instruct | $1.20/M
202 | Llama 3 Euryale 70B v2.1 | $1.48/M
203 | Jamba Large 1.7 | $2.00/M
204 | Mistral Large | $2.00/M
205 | Mistral Large 2407 | $2.00/M
206 | Mistral Large 2411 | $2.00/M
207 | Mixtral 8x22B Instruct | $2.00/M
208 | Pixtral Large 2411 | $2.00/M
209 | Command A | $2.50/M
210 | Llama 3.1 70B Hanami x1 | $3.00/M
211 | Magnum v4 72B | $3.00/M
212 | Goliath 120B | $3.75/M

Low volume
Under 1M tokens/day

Always pick the API. A single GPU hour wipes out weeks of API spend at this volume.

Mid volume
10M to 100M tokens/day

Depends on model size. Small models (under 30B) are cheaper via API. Large models (200B+) favor dedicated GPUs.

High volume
1B+ tokens/day

Self-hosting wins if utilization stays near 100%. Use reserved instances and pool multiple workloads onto the same GPUs.
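The volume tiers above come down to a break-even comparison between per-token API pricing and a fixed daily GPU bill. The sketch below shows that arithmetic under stated assumptions: the function names are hypothetical, the GPU rate and API price are placeholders, a single blended price stands in for separate input and output rates, and serving throughput is not modeled (it assumes the GPUs can actually handle the volume).

```python
def api_usd_per_day(tokens_per_day: float, usd_per_million: float) -> float:
    """Daily API bill at a flat per-million-token price."""
    return tokens_per_day / 1_000_000 * usd_per_million


def self_host_usd_per_day(gpus: int, gpu_hourly_usd: float) -> float:
    """Daily GPU bill; it is fixed regardless of how many tokens are served."""
    return gpus * gpu_hourly_usd * 24


def break_even_tokens_per_day(
    gpus: int, gpu_hourly_usd: float, usd_per_million: float
) -> float:
    """Daily token volume at which the API bill equals the GPU bill."""
    if usd_per_million <= 0:
        # A free API tier never costs more than paid GPUs.
        return float("inf")
    return self_host_usd_per_day(gpus, gpu_hourly_usd) / usd_per_million * 1_000_000


def cheaper_option(
    tokens_per_day: float, usd_per_million: float, gpus: int, gpu_hourly_usd: float
) -> str:
    """Return "api" or "self-host", whichever has the lower daily cost."""
    api = api_usd_per_day(tokens_per_day, usd_per_million)
    hosted = self_host_usd_per_day(gpus, gpu_hourly_usd)
    return "api" if api <= hosted else "self-host"


if __name__ == "__main__":
    # Placeholder inputs: one GPU at $2.00/hour versus an API at $0.10/M.
    print(break_even_tokens_per_day(1, 2.0, 0.10))
    print(cheaper_option(1_000_000, 0.10, 1, 2.0))
    print(cheaper_option(1_000_000_000, 0.10, 1, 2.0))
```

With these placeholder numbers, 1M tokens/day costs $0.10/day through the API, so one $2.00 GPU hour equals 20 days of API spend, which is in line with the low-volume guidance. Because the GPU bill is fixed, any idle time raises the effective per-token cost of self-hosting, which is why utilization matters at high volume.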

Cheapest
Gemma 3 12B (free)
$0.00 per 1M input tokens

Most expensive
Goliath 120B
$3.75 per 1M input tokens

Why the gap

Within OSS, the price range is driven by parameter count (bigger models cost more to serve) and provider margins. Smaller dense models and heavily quantized deployments sit at the bottom.

The weights are publicly downloadable under a license such as Apache 2.0, MIT, Llama Community, Qwen License, or DeepSeek License. Not every "open" license is actually OSI-compliant, so always read the license before commercial use.