BenchGecko

BenchGecko is the data layer of the AI economy. Thousands of models with cross-provider pricing and daily price history. Hundreds of companies with valuations, funding timelines, and revenue estimates. Benchmark scores, developer adoption signals, agent leaderboards, and a changelog that captures every price drop, every launch, every deprecation as it happens. If it moved in AI today, it's already on BenchGecko.

© 2026 BenchGecko · The AI Economy, Tracked
Models · Leaderboard · Updated 8d ago · 994 models · 267 providers · 128 benchmarks

Every AI Model · Tracked

The most complete list of AI models you can actually use · 994 models, 267 providers, 128 benchmarks · all scored, priced, and ranked in one place.

Benchmark scored · Cross-provider pricing · Context, parameters & licensing · Refreshed daily

Compare side by side · Showing 47 of 994
  • Total Models: 994
  • Providers: 267
  • Top Model: GPT-5.5 Pro (87.8%)
  • Cheapest Capable: gpt-oss-20b ($0.03/M)

Top 10 Overall

Ranked by average benchmark score · min 3 benchmarks

  • #1 (Gold) GPT-5.5 Pro · OpenAI · 87.8% · 3 benchmarks
  • #2 (Silver) GPT-5.5 · OpenAI · 85.0% · 6 benchmarks
  • #3 (Bronze) GPT-5 Chat · OpenAI · 81.9% · 7 benchmarks
  • #4 Claude Mythos Preview · Anthropic · 81.8% · 14 benchmarks
  • #5 Qwen3.5 397B A17B · Alibaba Qwen · 78.4% · 11 benchmarks
  • #6 DeepSeek V3.2 Speciale · DeepSeek · 78.2% · 9 benchmarks
  • #7 Claude Instant · Anthropic · 78.0% · 4 benchmarks
  • #8 Step 3.5 Flash · stepfun · 76.9% · 10 benchmarks
  • #9 DeepSeek-V2 (MoE-236B, May 2024) · DeepSeek · 76.5% · 7 benchmarks
  • #10 MiMo-V2-Flash · xiaomi · 73.3% · 11 benchmarks

What's moving

New releases, coverage, price leaders, and ELO champions

| #  | Model                       | Provider     | Category   | Released     |
|----|-----------------------------|--------------|------------|--------------|
| 1  | Granite 4.1 8B              | ibm-granite  | LLM        | Apr 30, 2026 |
| 2  | Grok 4.3                    | xAI          | Multimodal | Apr 30, 2026 |
| 3  | Nemotron 3 Nano Omni (free) | NVIDIA       | Multimodal | Apr 28, 2026 |
| 4  | Owl Alpha                   | openrouter   | LLM        | Apr 28, 2026 |
| 5  | Qwen3.5 Plus 2026-04-20     | Alibaba Qwen | Multimodal | Apr 27, 2026 |
| 6  | Qwen3.6 27B                 | Alibaba Qwen | Multimodal | Apr 27, 2026 |
| 7  | Qwen3.6 35B A3B             | Alibaba Qwen | Multimodal | Apr 27, 2026 |
| 8  | Qwen3.6 Flash               | Alibaba Qwen | Multimodal | Apr 27, 2026 |
| 9  | Qwen3.6 Max Preview         | Alibaba Qwen | LLM        | Apr 27, 2026 |
| 10 | DeepSeek V4 Flash           | DeepSeek     | LLM        | Apr 24, 2026 |
47 results

All Models

| #  | Model                                 | License | Provider | Category   | Context | In $/M | Out $/M | Avg   | Benchmarks | ELO |
|----|---------------------------------------|---------|----------|------------|---------|--------|---------|-------|------------|-----|
| 1  | Llama 3.3 70B Instruct                | OSS     | Meta     | LLM        | 131K    | $0.10  | $0.32   | 46.9% | 8          | —   |
| 2  | Meta Llama 3 8B Instruct              | OSS     | Meta     | LLM        | —       | TBD    | TBD     | 45.2% | 25         | —   |
| 3  | Meta Llama 3 8B                       | OSS     | Meta     | LLM        | —       | TBD    | TBD     | 44.2% | 11         | —   |
| 4  | Llama 2-13B                           | OSS     | Meta     | LLM        | —       | TBD    | TBD     | 42.5% | 14         | —   |
| 5  | Llama 3.1 405B                        | OSS     | Meta     | LLM        | —       | TBD    | TBD     | 38.0% | 21         | —   |
| 6  | Llama 3.1 70B Instruct                | OSS     | Meta     | LLM        | 131K    | $0.40  | $0.40   | 37.8% | 16         | —   |
| 7  | Llama 3.2 90B                         | OSS     | Meta     | LLM        | —       | TBD    | TBD     | 36.1% | 6          | —   |
| 8  | LLaMA-13B                             | OSS     | Meta     | LLM        | —       | TBD    | TBD     | 34.9% | 20         | —   |
| 9  | Llama 3 70B Instruct                  | OSS     | Meta     | LLM        | 8K      | $0.51  | $0.74   | 32.4% | 9          | —   |
| 10 | Llama 3 8B Instruct                   | OSS     | Meta     | LLM        | 8K      | $0.03  | $0.04   | 30.8% | 16         | —   |
| 11 | Llama 3.3 70B Instruct (free)         | OSS     | Meta     | LLM        | 66K     | Free   | Free    | 29.1% | 8          | —   |
| 12 | Llama 4 Maverick                      | OSS     | Meta     | Multimodal | 1.0M    | $0.15  | $0.60   | 28.0% | 17         | —   |
| 13 | Llama 3.1 8B Instruct                 | OSS     | Meta     | LLM        | 16K     | $0.02  | $0.05   | 27.4% | 16         | —   |
| 14 | Llama 2 7b Chat Hf                    | OSS     | Meta     | LLM        | —       | TBD    | TBD     | 27.3% | 11         | —   |
| 15 | Llama 3.2 3B Instruct                 | OSS     | Meta     | LLM        | 80K     | $0.05  | $0.34   | 24.2% | 7          | —   |
| 16 | Llama 2 7b Hf                         | OSS     | Meta     | LLM        | —       | TBD    | TBD     | 23.6% | 11         | —   |
| 17 | Llama 4 Scout                         | OSS     | Meta     | Multimodal | 328K    | $0.08  | $0.30   | 18.9% | 11         | —   |
| 18 | Llama 3.2 1B Instruct                 | OSS     | Meta     | LLM        | 60K     | $0.03  | $0.20   | 14.5% | 7          | —   |
| 19 | Llama 3.2 3B Instruct (free)          | OSS     | Meta     | LLM        | 131K    | Free   | Free    | 8.7%  | 6          | —   |
| 20 | Hf Seamless M4t Medium                | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 21 | Hubert Large Ls960 Ft                 | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 22 | Llama 3.2 11B Vision Instruct         | OSS     | Meta     | Multimodal | 131K    | $0.24  | $0.24   | —     | 0          | —   |
| 23 | Llama 4 Scout 17B 16E Instruct        | —       | Meta     | LLM        | —       | TBD    | TBD     | —     | 1          | —   |
| 24 | Llama Guard 3 8B                      | OSS     | Meta     | LLM        | 131K    | $0.48  | $0.03   | —     | 0          | —   |
| 25 | Llama Guard 4 12B                     | OSS     | Meta     | Multimodal | 164K    | $0.18  | $0.18   | —     | 0          | —   |
| 26 | Llama Guard 4 12B (free)              | —       | Meta     | Multimodal | 164K    | Free   | Free    | —     | 0          | —   |
| 27 | Mms 1b All                            | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 28 | Mms Tts Eng                           | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 29 | Mms Tts Hat                           | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 30 | Mms Tts Hin                           | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 31 | Mms Tts Kik                           | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 32 | Mms Tts Kin                           | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 33 | Mms Tts Kor                           | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 34 | Mms Tts Orm                           | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 35 | Mms Tts Rus                           | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 36 | Mms Tts Swh                           | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 37 | Nougat Base                           | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 38 | Opt 125m                              | —       | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 39 | Roberta Base                          | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 40 | Roberta Large                         | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 41 | S2t Small Librispeech Asr             | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 42 | Seamless M4t V2 Large                 | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 43 | Wav2vec2 Base 960h                    | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 44 | Wav2vec2 Conformer Rope Large 960h Ft | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 45 | Wav2vec2 Xlsr 53 Espeak Cv Ft         | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 46 | Xlm Roberta Base                      | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |
| 47 | Xlm Roberta Large                     | OSS     | Meta     | LLM        | —       | TBD    | TBD     | —     | 0          | —   |

47 rows · click column headers to sort · pick up to 4 models to compare
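Since the In $/M and Out $/M columns are prices per million tokens, the cost of one request splits into an input term plus an output term. A minimal sketch in Python; the function name and example token counts are illustrative, not from BenchGecko:

```python
def request_cost(in_tokens: int, out_tokens: int,
                 in_per_m: float, out_per_m: float) -> float:
    """Dollar cost of one request, given $/M-token prices."""
    return (in_tokens * in_per_m + out_tokens * out_per_m) / 1_000_000

# Llama 3 8B Instruct at $0.03/M in, $0.04/M out:
# a 4,000-token prompt with a 1,000-token completion
cost = request_cost(4_000, 1_000, 0.03, 0.04)  # 0.00016 dollars
```

The same arithmetic explains the "Cheapest Capable" card: at $0.03/M, a million input tokens through gpt-oss-20b costs three cents.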

Top 5 by category

Specialist leaders across every modality we track

LLMs

See all →
  • #1 GPT-5.5 Pro · OpenAI · 87.8%
  • #2 GPT-5.5 · OpenAI · 85.0%
  • #3 Claude Mythos Preview · Anthropic · 81.8%
  • #4 DeepSeek V3.2 Speciale · DeepSeek · 78.2%
  • #5 Claude Instant · Anthropic · 78.0%

Multimodal

See all →
  • #1 GPT-5 Chat · OpenAI · 81.9%
  • #2 Qwen3.5 397B A17B · Alibaba Qwen · 78.4%
  • #3 Gemini 2.5 Pro Preview 05-06 · Google DeepMind · 76.9%
  • #4 GPT-5.1-Codex-Max · OpenAI · 72.0%
  • #5 o4 Mini High · OpenAI · 72.0%

Decision shortcuts

Jump from the leaderboard into comparison, provider, pricing, and benchmark paths

  • GPT-5.5 Pro vs Qwen3.5 · Frontier comparison
  • Qwen3.5 vs DeepSeek V3.2 · Open model comparison
  • OpenAI model lineup · Provider profile
  • Anthropic Claude lineup · Provider profile
  • Coding model pricing · Cost by use case
  • Reasoning model pricing · Cost by use case
  • SWE-bench Verified results · Coding benchmark
  • GPQA Diamond results · Reasoning benchmark

Frequently asked

Quick answers, sourced from our data

How many AI models does BenchGecko track?

BenchGecko currently tracks 994 AI models across 267 providers, each scored against up to 128 benchmarks. New models are added continuously and the full dataset refreshes daily.

What is the best AI model right now?

"Best" depends on the task. For general reasoning we rank by average score across 3+ benchmarks; for coding we surface ELO and SWE-bench specifically; for cost/performance we expose a "cheapest capable" metric. Use the filter bar and column sort to define your own winner, or pick a category from the mini tables below.

How is the average score calculated?

Average score is the arithmetic mean of a model's normalized benchmark scores, computed only when the model has at least one public benchmark result. Models with fewer than 3 benchmarks are excluded from the podium to avoid single-score outliers.

Where does BenchGecko get this data?

Model metadata and pricing come from OpenRouter's public API. Benchmarks are pulled from Epoch AI (CC-BY) and SWE-bench's public leaderboards. ELO ratings come from LMArena. Everything is re-normalized and cross-linked daily. See the methodology page for full provenance.

Can I use this data in my article or product?

Yes. All BenchGecko data is licensed CC BY 4.0 — attribution required. Use the "Cite this page" button above for ready-made APA, MLA, AP Style, BibTeX, and HTML embed snippets. The free API tier requires a backlink to benchgecko.ai.

How often does the data refresh?

Core model, pricing and benchmark data refreshes every 24 hours. Live status and pricing alerts can fire more frequently when upstream sources change. The "Live" pill on this page lights up when the last refresh was less than an hour ago.

Which models are open source?

Toggle the "Open Source" filter to see only models with openly available weights. We count permissively: weights-available licenses such as Llama, Qwen, DeepSeek, and Mistral Open count as OSS for this filter even when the weights carry usage restrictions.

How do I compare two models side by side?

Check the boxes next to any 2-4 models in the leaderboard above, then click "Compare →" in the pink action bar. You can also navigate directly to /compare/[modelA]-vs-[modelB] for shareable comparisons.
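A shareable path can be built by slugging the two display names; a sketch under the assumption that slugs are lowercased with runs of non-alphanumerics collapsed to hyphens (verify against real comparison URLs on the site):

```python
import re

def compare_url(model_a: str, model_b: str) -> str:
    """Build a /compare/[modelA]-vs-[modelB] path from display names."""
    def slug(name: str) -> str:
        # Lowercase, then collapse anything non-alphanumeric to a hyphen
        # (assumed convention, not confirmed by BenchGecko).
        return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return f"/compare/{slug(model_a)}-vs-{slug(model_b)}"

compare_url("GPT-5.5 Pro", "Qwen3.5 397B A17B")
# "/compare/gpt-5-5-pro-vs-qwen3-5-397b-a17b"
```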

See also

Related model, benchmark, provider, and pricing indexes

  • Compare models · Side-by-side with charts
  • Benchmarks · Every eval we track
  • Providers · Where to run models
  • Pricing · Every model, every provider
  • AI agents · SWE-bench leaderboard
  • MCP servers · Tools for LLMs
  • AI economy · Valuations & funding
  • Methodology · How we score