Flagship AI Models

The best AI model from each provider, ranked by benchmark score. Compare the flagships from OpenAI, Anthropic, Google, Meta, and more.

Median $/1M in: $0.39

Top 3

Claude Mythos Preview

DeepSeek V3.2 Speciale

🇨🇳 DeepSeek

Open, 164K context, $0.40/M in

Avg 95.2, 9 benchmarks

Full Rankings

#	Model	Provider	Open	Avg	$/1M in	Context
1	Claude Mythos Preview	🇺🇸 Anthropic	-	100.0	N/A	1.0M
2	Qwen3.5 397B A17B	🇨🇳 Alibaba Qwen	Yes	96.3	$0.39	262K
3	DeepSeek V3.2 Speciale	🇨🇳 DeepSeek	Yes	95.2	$0.40	164K
4	GPT-5.4 Pro	🇺🇸 OpenAI	-	93.0	$30.00	1.1M
5	Gemini 3.1 Pro Preview	🇺🇸 Google DeepMind	-	90.0	$2.00	1.0M
6	Step 3.5 Flash	🇨🇳 stepfun	Yes	89.5	$0.10	262K
7	Qwen2.5 72B Instruct Abliterated	HuiHui AI	-	87.5	N/A	0K
8	GLM 5.1	🇨🇳 z-ai	Yes	87.0	$0.95	203K
9	MiMo-V2-Flash	🇨🇳 xiaomi	Yes	81.7	$0.09	262K
10	phi-3-small 7.4B	🇺🇸 Microsoft	Yes	78.8	N/A	0K
11	Muse Spark	Unknown	-	77.0	N/A	0K
12	Llama 3.3 70B Instruct	🇺🇸 Meta	Yes	75.9	$0.10	131K
13	Hermes 3 70B Instruct	🇺🇸 nousresearch	Yes	73.3	$0.30	131K
14	MiniMax M2	🇨🇳 minimax	Yes	72.4	$0.26	197K
15	Grok 3 Beta	🇺🇸 xAI	-	67.9	$3.00	131K
16	Mixtral 8x7B Instruct	🇫🇷 Mistral AI	Yes	66.8	$0.54	33K
17	Falcon 2 11B	TII	Yes	63.3	N/A	0K
18	Kimi K2 Thinking	🇨🇳 moonshotai	Yes	61.0	$0.60	262K
19	Palmyra X5	🇺🇸 writer	-	57.0	$0.60	1.0M
20	Dolphin 2.9.1 Yi 1.5 34b	DPHN	Yes	55.9	N/A	0K
21	LongCat Flash Chat	🇨🇳 meituan	Yes	53.4	$0.20	131K
22	Magnum v4 72B	anthracite-org	Yes	51.2	$3.00	16K
23	ERNIE 4.5 21B A3B Thinking	🇨🇳 baidu	Yes	39.8	$0.07	131K
24	Hunyuan A13B Instruct	🇨🇳 tencent	Yes	29.3	$0.14	131K
25	Vicuna 7b V1.5	LMSYS	Yes	19.1	N/A	0K
26	Nemotron 3 Super	🇺🇸 NVIDIA	Yes	11.6	$0.10	262K
27	Pythia 160m	🇺🇸 eleutherai	Yes	10.3	N/A	0K
28	SmolLM2 135M	Hugging Face TB	Yes	9.4	N/A	0K
29	Distilgpt2	DistilBERT	Yes	8.6	N/A	0K
30	TinyLlama 1.1B Chat V1.0	TinyLlama	Yes	3.8	N/A	0K
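
The "Median $/1M in" figure at the top of this page can be checked against the rankings. The sketch below is one plausible way to do so, not the site's actual method: it assumes that rows priced "N/A" are simply left out of the median. The list `prices_by_rank` is a hand transcription of the "$/1M in" column in rank order, and the variable names are illustrative.

```python
from statistics import median

# "$/1M in" column transcribed in rank order (1..30); None stands for "N/A".
# Assumption: unpriced models are excluded rather than counted as zero.
prices_by_rank = [
    None, 0.39, 0.40, 30.00, 2.00, 0.10, None, 0.95, 0.09, None,
    None, 0.10, 0.30, 0.26, 3.00, 0.54, None, 0.60, 0.60, None,
    0.20, 3.00, 0.07, 0.14, None, 0.10, None, None, None, None,
]

# Keep only the rows that actually list a price.
priced = [p for p in prices_by_rank if p is not None]

median_price = median(priced)
print(f"{len(priced)} priced models, median ${median_price:.2f} per 1M input tokens")
# prints "19 priced models, median $0.39 per 1M input tokens"
```

Under that assumption the result matches the $0.39 shown above. With 19 priced rows the median is the 10th-lowest price, so the $30.00 outlier has no effect on it.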

About this category

The single highest-scoring model from each provider, ranked by average benchmark score. Each provider appears exactly once, so the list compares providers at their best rather than across their full lineups.

Related categories

Best AI Models for Coding

AI models ranked by coding benchmarks. Compare HumanEval+, SWE-bench Verified, Aider Polyglot, and more across all providers.

Best AI Models for Reasoning

AI models ranked by reasoning benchmarks. Compare GPQA Diamond, ARC-AGI, BBH, and other reasoning tests across all providers.

AI Models with 1M+ Context

AI models with 1 million+ token context windows ranked by score. Compare Gemini, Claude, and other long-context models.

Frequently asked questions

Which AI provider has the best flagship model?

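In the rankings above, Anthropic's Claude Mythos Preview leads with an average score of 100.0, followed by Alibaba Qwen's Qwen3.5 397B A17B (96.3) and DeepSeek's DeepSeek V3.2 Speciale (95.2). Flagship rankings shift frequently as providers release updates, so check the leaderboard for the current order.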

How are flagship models selected?

For each provider, we select the single model with the highest average benchmark score. Each provider therefore appears exactly once, which gives a one-to-one comparison across providers.
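
The selection rule above can be sketched in a few lines. This is a minimal illustration, not the site's implementation: the `ModelScore` type, the `pick_flagships` function, and the provider and model names in `catalog` are all hypothetical, and ties are resolved by keeping the first model seen, which the page does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelScore:
    provider: str
    model: str
    avg_score: float  # average benchmark score, higher is better


def pick_flagships(models):
    """Keep the highest-scoring model per provider, then rank the winners."""
    best = {}
    for m in models:
        current = best.get(m.provider)
        # Strict '>' keeps the first model seen on a tie (an assumption).
        if current is None or m.avg_score > current.avg_score:
            best[m.provider] = m
    return sorted(best.values(), key=lambda m: m.avg_score, reverse=True)


# Hypothetical catalog with two models per provider.
catalog = [
    ModelScore("provider-a", "a-large", 91.0),
    ModelScore("provider-a", "a-small", 64.5),
    ModelScore("provider-b", "b-pro", 88.2),
    ModelScore("provider-b", "b-max", 93.4),
]

flagships = pick_flagships(catalog)
for rank, m in enumerate(flagships, start=1):
    print(rank, m.model, m.provider, m.avg_score)
```

With this catalog, `b-max` ranks first and `a-large` second, and the lower-scoring model from each provider is dropped.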
