Models · Leaderboard · Updated 51d ago · 1,057 models · 273 providers · 128 benchmarks

Every AI Model · Tracked

Name: AI Models Directory
Creator: BenchGecko
License: https://creativecommons.org/licenses/by/4.0/

The most complete list of AI models you can actually use · 1,057 models, 273 providers, 128 benchmarks · all scored, priced, and ranked in one place.

Benchmark scored · Cross-provider pricing · Context, parameters, licensing · Refreshed daily


Total Models: 1,057

Providers: 273

Top 10 Overall

Ranked by average benchmark score · min 3 benchmarks

#	Model	Provider	Avg	Benchmarks
2 (Silver)	GPT-5.5	OpenAI	85.0%	6
3 (Bronze)	GPT-5 Chat	OpenAI	81.9%	7
4	Claude Mythos Preview	Anthropic	81.8%	14
5	Qwen3.5 397B A17B	Alibaba Qwen	78.4%	11
6	DeepSeek V3.2 Speciale	(not listed)	76.9%	10
9	DeepSeek-V2 (MoE-236B, May 2024)	DeepSeek	76.5%	7
10	GLM 5.2	z-ai	76.2%	13

What's moving

New releases, coverage, price leaders, and ELO champions

#	Model	Provider	Category	Released
1	Nano Banana 2 (Gemini 3.1 Flash Image)	Google DeepMind	Image Generation	Jun 18, 2026
2	Nano Banana Pro (Gemini 3 Pro Image)	Google DeepMind	Image Generation	Jun 18, 2026
3	North Mini Code (free)	Cohere	LLM	Jun 17, 2026
4	GLM 5.2	z-ai	LLM	Jun 16, 2026
5	Fusion	openrouter	LLM	Jun 13, 2026
6	Kimi K2.7 Code	moonshotai	Multimodal	Jun 12, 2026
7	~ Claude Fable Latest	~anthropic	Multimodal	Jun 9, 2026
8	Claude Fable 5	Anthropic	Multimodal	Jun 9, 2026
9	Nex-N2-Pro (free)	nex-agi	Multimodal	Jun 8, 2026
10	Nex-N2-Pro	nex-agi	Multimodal	Jun 8, 2026


All Models

#	Model	Provider	Category	Context	In $/M	Out $/M	Avg	Benchmarks	ELO
1	HA Qwen2.5 72B Instruct Abliterated	HuiHui AI	LLM	—	TBD	TBD	48.1%	6	—


Top 5 by category

Specialist leaders across every modality we track

LLMs

#	Model	Avg
1	GPT-5.5 Pro	87.8%
2	GPT-5.5	85.0%
3	Claude Mythos Preview	81.8%
4	DeepSeek V3.2 Speciale	78.2%
5	Claude Instant	78.0%

Multimodal

#	Model	Avg
1	GPT-5 Chat	81.9%
2	Qwen3.5 397B A17B	78.4%
3	Gemini 2.5 Pro Preview 05-06	76.9%
4	Claude Fable 5	74.4%
5	GPT-5.1-Codex-Max	72.0%

Decision shortcuts

Jump from the leaderboard into comparison, provider, pricing, and benchmark paths

Shortcut	Type
GPT-5.5 Pro vs Qwen3.5	Frontier comparison
Qwen3.5 vs DeepSeek V3.2	Open model comparison
OpenAI model lineup	Provider profile
Anthropic Claude lineup	Provider profile
Coding model pricing	Cost by use case
Reasoning model pricing	Cost by use case
SWE-bench Verified results	Coding benchmark
GPQA Diamond results	Reasoning benchmark

Frequently asked

Quick answers, sourced from our data

How many AI models does BenchGecko track?

BenchGecko currently tracks 1,057 AI models across 273 providers, each scored against up to 128 benchmarks. New models are added continuously and the full dataset refreshes daily.

What is the best AI model right now?

"Best" depends on the task. For general reasoning we rank by average score across 3+ benchmarks; for coding we surface ELO and SWE-bench specifically; for cost/performance we expose a "cheapest capable" metric. Use the filter bar and column sort to define your own winner, or pick a category from the "Top 5 by category" tables above.

How is the average score calculated?

Average score is the arithmetic mean of a model's normalized benchmark scores, computed only when the model has at least one public benchmark result. Models with fewer than 3 benchmarks are excluded from the podium to avoid single-score outliers.
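
The averaging and podium rule described above can be sketched roughly as follows. This is a minimal illustration, not BenchGecko's actual code: it assumes scores are already normalized to a 0-100 scale (the normalization method is not described here), and the function names and sample model names are hypothetical.

```python
from statistics import mean

# Threshold stated in the FAQ: fewer than 3 benchmarks means no podium spot.
MIN_BENCHMARKS_FOR_PODIUM = 3


def average_score(normalized_scores):
    """Arithmetic mean of normalized scores, or None when there are none."""
    if not normalized_scores:
        return None
    return mean(normalized_scores)


def rank_podium(models):
    """Rank models by average score, skipping those with too few benchmarks.

    `models` maps a model name to its list of normalized scores
    (assumed to be on a 0-100 scale).
    """
    eligible = {
        name: average_score(scores)
        for name, scores in models.items()
        if len(scores) >= MIN_BENCHMARKS_FOR_PODIUM
    }
    return sorted(eligible.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: "model-b" has the highest single score but only one
# benchmark, so it still gets an average but is left off the podium.
sample = {
    "model-a": [80.0, 90.0, 70.0],
    "model-b": [99.0],
    "model-c": [60.0, 70.0, 80.0, 90.0],
}
podium = rank_podium(sample)
```

With this sample, `model-b` has an average score of 99.0 but does not appear in `podium`, which is the single-score outlier case the 3-benchmark minimum is meant to filter out.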

Where does BenchGecko get this data?

Model metadata and pricing come from OpenRouter's public API. Benchmarks are pulled from Epoch AI (CC-BY) and SWE-bench's public leaderboards. ELO ratings come from LMArena. Everything is re-normalized and cross-linked daily. See the methodology page for full provenance.

Can I use this data in my article or product?

Yes. All BenchGecko data is licensed under CC BY 4.0, which requires attribution. Use the "Cite this page" button above for ready-made APA, MLA, AP Style, BibTeX, and HTML embed snippets. The free API tier requires a backlink to benchgecko.ai.

How often does the data refresh?

Core model, pricing and benchmark data refreshes every 24 hours. Live status and pricing alerts can fire more frequently when upstream sources change. The "Live" pill on this page lights up when the last refresh was less than an hour ago.
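
As a rough sketch of the "Live" rule above, the check reduces to comparing the age of the last refresh against a one-hour window. The function name and the use of UTC timestamps are assumptions made for illustration; the page does not document how the pill is implemented.

```python
from datetime import datetime, timedelta, timezone

# "Less than an hour ago", per the FAQ answer above.
LIVE_WINDOW = timedelta(hours=1)


def is_live(last_refresh, now=None):
    """Return True when the last refresh is strictly less than one hour old.

    Both arguments are assumed to be timezone-aware datetimes.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_refresh < LIVE_WINDOW


# Fixed reference time so the example is reproducible.
reference = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = is_live(reference - timedelta(minutes=59), now=reference)
stale = is_live(reference - timedelta(hours=1), now=reference)
```

Here `fresh` is true and `stale` is false, because a refresh exactly one hour old is no longer "less than an hour ago".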

Which models are open source?

Toggle the "Open Source" filter to see only models with openly available weights. We currently count permissively: weights-available licenses like Llama, Qwen, DeepSeek, and Mistral Open count as OSS for this filter even when the weights come with usage restrictions.
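
The difference between this permissive count and a stricter definition can be illustrated with a small sketch. The record fields (`weights_available`, `usage_restrictions`) and the sample entries are hypothetical and do not reflect BenchGecko's actual schema or any real model's license.

```python
from dataclasses import dataclass


@dataclass
class ModelLicense:
    # Hypothetical fields, chosen only to illustrate the filter logic.
    name: str
    weights_available: bool
    usage_restrictions: bool


def passes_open_source_filter(model):
    """Permissive rule from the FAQ: weights available counts as OSS."""
    return model.weights_available


def is_strictly_open(model):
    """A stricter alternative: weights available and no usage restrictions."""
    return model.weights_available and not model.usage_restrictions


catalog = [
    ModelLicense("open-unrestricted", weights_available=True, usage_restrictions=False),
    ModelLicense("open-restricted", weights_available=True, usage_restrictions=True),
    ModelLicense("api-only", weights_available=False, usage_restrictions=True),
]

permissive = [m.name for m in catalog if passes_open_source_filter(m)]
strict = [m.name for m in catalog if is_strictly_open(m)]
```

Under the permissive rule both weights-available entries pass the filter; under the stricter rule only the unrestricted one does.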

How do I compare two models side by side?

Check the boxes next to any 2-4 models in the leaderboard above, then click "Compare →" in the pink action bar. You can also navigate directly to /compare/[modelA]-vs-[modelB] for shareable comparisons.
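
A direct comparison link following the `/compare/[modelA]-vs-[modelB]` pattern could be built along these lines. The slug format (lowercase, with runs of non-alphanumeric characters collapsed to hyphens) is an assumption; the site's real slugs may differ, so treat the helper names and output as illustrative only.

```python
import re


def slugify(name):
    """Hypothetical slug rule: lowercase, non-alphanumerics become hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def compare_path(model_a, model_b):
    """Build a path following the /compare/[modelA]-vs-[modelB] pattern."""
    return f"/compare/{slugify(model_a)}-vs-{slugify(model_b)}"


def validate_selection(models):
    """Enforce the 2-4 model limit described for the compare action."""
    if not 2 <= len(models) <= 4:
        raise ValueError("select between 2 and 4 models to compare")
    return list(models)


path = compare_path("GPT-5.5 Pro", "Qwen3.5 397B A17B")
```

The selection check mirrors the 2-4 model limit on the leaderboard checkboxes, while the direct path form covers the two-model case.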
