DeepSeek-V3 is the latest model from the DeepSeek team, building on the instruction-following and coding abilities of previous versions. The model was pre-trained on nearly 15 trillion tokens; the reported evaluations...
Tested on 22 benchmarks with a 59.0% average. Top scores: Chatbot Arena Elo, Overall (1358.2), ARC AI2 (93.7%), HellaSwag (85.2%).
Qwen3 Next 80B A3B Thinking scores 57.5 (about 97% of DeepSeek-V3's 59.0 average) at $0.10/1M input tokens, roughly 70% cheaper; the arithmetic is sketched below.
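A quick back-of-the-envelope check of those two claims, as a minimal Python sketch. The 59.0 and 57.5 averages and the $0.10/1M input price come from the figures above; DeepSeek-V3's own input price is not stated on this page, so the ~$0.33/1M value is an assumption inferred from the "70% cheaper" claim.

```python
# Back-of-the-envelope comparison of benchmark averages and input pricing.
# The averages and Qwen's price are taken from the page above; DeepSeek-V3's
# input price is NOT published here and is inferred from "70% cheaper".

deepseek_avg = 59.0  # DeepSeek-V3 average across 22 benchmarks (%)
qwen_avg = 57.5      # Qwen3 Next 80B A3B Thinking average (%)

relative_quality = qwen_avg / deepseek_avg
print(f"Relative quality: {relative_quality:.1%}")  # -> 97.5%

qwen_price = 0.10                          # $ per 1M input tokens (stated above)
deepseek_price = qwen_price / (1 - 0.70)   # assumption implied by "70% cheaper"
saving = 1 - qwen_price / deepseek_price
print(f"Implied DeepSeek-V3 input price: ${deepseek_price:.2f}/1M tokens")
print(f"Cost saving: {saving:.0%}")        # -> 70% by construction
```

Run as-is, it prints the 97.5% relative quality and the implied ~$0.33/1M DeepSeek-V3 input price used in the sketch after the spec list below.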
The evaluated benchmarks include:

- Multi-language code editing from Aider: tests editing ability across Python, JavaScript, TypeScript, Java, C++, Go, Rust, and more.
- Unusual and adversarial machine learning challenges: tests robustness of reasoning about edge cases in ML systems.
- BIG-Bench Hard: 23 challenging tasks from BIG-Bench on which prior language models fell below average human performance.
- Stanford HELM WildBench evaluation: tests reasoning on challenging real-world tasks.
- Deceptively simple questions that humans find easy but AI models often get wrong: tests common sense and reasoning gaps.
- Competition-level math from AMC, AIME, and olympiad problems; Level 5 is the hardest tier, requiring creative problem-solving.
- Stanford HELM evaluation of mathematical reasoning across diverse problem types.
- Mock AIME (American Invitational Mathematics Exam) problems from OTIS: tests mathematical competition performance.
- Type: text
- Context: 164K tokens (~120K words)
- Released: Dec 2024
- License: Open Source
- Status: Active
- Cost / Message: ~$0.002 (derivation sketched below)
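The ~$0.002 cost-per-message figure can be sanity-checked the same way. Both inputs below are assumptions for illustration: the input price is the ~$0.33/1M value inferred above, and the 6,000-token average message size is a hypothetical chosen to show the arithmetic, not a measured number.

```python
# Estimate cost per message from a per-token price and an average message size.
# Both inputs are assumptions: the price is the ~$0.33/1M figure inferred from
# the "70% cheaper" comparison, and the message size is a hypothetical value
# that roughly reproduces the ~$0.002/message spec above.

price_per_million_tokens = 0.33  # $ per 1M input tokens (assumed)
tokens_per_message = 6_000       # assumed average prompt size (hypothetical)

cost_per_message = price_per_million_tokens * tokens_per_message / 1_000_000
print(f"~${cost_per_message:.4f} per message")  # -> ~$0.0020
```

Under these assumptions the figure checks out; a shorter average prompt or a lower real input price would pull the per-message cost down proportionally.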