LIVETracking 994 AI models from 267 providers.

Charts·Build live AI market views Open charts Build your own chart

R1

by DeepSeek · Released Jan 2025

Open Source

46.4

avg score

Rank #145

Better than 47% of all models

Context

164K tokens (~82 books)

Input $/1M

$0.70

Output $/1M

$2.50

Type

text

License

Open Source

Benchmarks

14 tested

Data updated today

About

DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass....

Tested on 14 benchmarks with 45.1% average. Top scores: Chatbot Arena Elo — Overall (1398.1%), MATH level 5 (93.0%), Lech Mazur Writing (83.0%).

Looking for similar performance at lower cost?
Gemini 2.0 Flash Lite scores 45.3 (98% as good) at $0.07/1M input · 89% cheaper

Capabilities

coding

46.7

#97 globally

reasoning

11.4

#150 globally

math

73.2

#35 globally

knowledge

52.0

#106 globally

Benchmark Scores

Tested on 14 benchmarks · Ranked across 5 categories

Score Distribution (all 274 models)

0255075100

▲ You are here

codingCompare coding →

Multi-language code editing from Aider. Tests editing ability across Python, JavaScript, TypeScript, Java, C++, Go, Rust, and more.

56.9—

Unusual and adversarial machine learning challenges. Tests robustness of reasoning about edge cases in ML systems.

36.5—

reasoningCompare reasoning →

Deceptively simple questions that humans find easy but AI models often get wrong. Tests common sense and reasoning gaps.

17.1—

Abstraction and Reasoning Corpus. Tests fluid intelligence through novel visual pattern recognition puzzles. Core measure of general intelligence.

15.8—

ARC-AGI 2, harder sequel to ARC. More complex abstract reasoning patterns that test generalization ability beyond training data.

1.3—

mathCompare math →

Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.

93.0—

OTIS Mock AIME 2024-2025

Mock AIME (American Invitational Mathematics Exam) problems from OTIS. Tests mathematical competition performance.

53.3—

Quick compare:

vs GLM 5V Turbo

vs Claude Sonnet 4.5

vs Gemini 2.0 Flash Lite

Excellent (85+) Good (70-85) Average (50-70) Below (<50)

Similar Models

Claude Sonnet 4.5

Gemini 2.0 Flash Lite

Google DeepMind

Links

Info

DeepSeek Pricing explorer Developers · API

Research

Technical Report

Documentation

API Docs Playground

Community

Source Code

GitHub Hugging Face

BenchGecko API

deepseek-r1

Specifications

Typetext
Context164K tokens (~82 books)
ReleasedJan 2025
LicenseOpen Source
StatusActive
Cost / Message~$0.004

Available On

DeepSeek$0.70

Categories

coding reasoning math knowledge

Learn More

context-window transformer open weights tokens

Share & Export

Related Models

Claude Sonnet 4.5

Gemini 2.0 Flash Lite

Frequently Asked Questions

R1 is an open-source text AI model by DeepSeek, released in January 2025. It has an average benchmark score of 46.4. Context window: 164K tokens.

Related Models

GLM 5V Turbo · z-ai Claude Sonnet 4.5 · Anthropic Gemini 2.0 Flash Lite · Google DeepMind Claude Opus 4 · Anthropic R1 Distill Llama 70B · DeepSeek

Benchmarks

Chatbot Arena Elo — Overall MATH level 5 Lech Mazur Writing Fiction.LiveBench GPQA diamond

Related Pages

DeepSeek · Provider DeepSeek · Economy All Models Compare Models Pricing Developers · API Context Window · Glossary