How much does GPT-5 cost?

GPT-5 costs $1.25 per million input tokens and $10.00 per million output tokens. For a typical conversation (~2,000 tokens), that's approximately $0.013 per message.
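
The per-message figure depends on how the ~2,000 tokens split between input and output, which the page does not state. A minimal sketch of the arithmetic, assuming a hypothetical split of 800 input and 1,200 output tokens (chosen only because it reproduces the quoted $0.013):

```python
# Prices quoted on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 1.25
OUTPUT_PRICE_PER_M = 10.00


def message_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one message at the prices above."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Assumed split (not given by the source): 800 input + 1,200 output tokens.
print(f"${message_cost(800, 1200):.3f}")   # prints "$0.013"

# Bounds for any 2,000-token message: all input vs. all output.
print(f"${message_cost(2000, 0):.4f}")     # prints "$0.0025"
print(f"${message_cost(0, 2000):.4f}")     # prints "$0.0200"
```

Because output tokens cost eight times as much as input tokens, the same 2,000 tokens can cost anywhere between $0.0025 and $0.02 depending on the split.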

What benchmarks has GPT-5 been tested on?

GPT-5 has been evaluated on 26 benchmarks. Top scores: MATH level 5 (98.1), Fiction.LiveBench (97.2), OTIS Mock AIME 2024-2025 (91.4).

Is GPT-5 open source?

No, GPT-5 is a proprietary model by OpenAI.

How does GPT-5 compare to Qwen2.5 72B Instruct?

GPT-5 has an average score of 65.7 while Qwen2.5 72B Instruct scores 65.8. Qwen2.5 72B Instruct slightly outperforms GPT-5 overall. GPT-5 costs $1.25/1M input vs Qwen2.5 72B Instruct at $0.36/1M input.

GPT-5

Name: GPT-5
Price: 1.25 USD per 1M input tokens
Author: OpenAI

by OpenAI · Released Aug 2025

Type: multimodal
Average score: 65.7
Rank: #52 (better than 78% of all models)
Context: 400K tokens (~200 books)
Input price: $1.25 per 1M tokens
Output price: $10.00 per 1M tokens
License: Proprietary
Benchmarks: 26 tested

About

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy...

Tested on 26 benchmarks with 54.4% average. Top scores: MATH level 5 (98.1%), Fiction.LiveBench (97.2%), OTIS Mock AIME 2024-2025 (91.4%).

Looking for similar performance at lower cost?
Qwen2.5 72B Instruct scores 65.8 (about 100% of GPT-5's 65.7 average) at $0.36/1M input, roughly 71% cheaper than GPT-5's $1.25/1M input.
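
The two percentages in this comparison can be checked from the figures quoted on the page. A short sketch, with the caveat that it uses input prices only, since the page gives no output price for Qwen2.5 72B Instruct:

```python
# Figures quoted on this page.
GPT5_INPUT_PRICE = 1.25   # USD per 1M input tokens
QWEN_INPUT_PRICE = 0.36   # USD per 1M input tokens
GPT5_SCORE = 65.7         # average benchmark score
QWEN_SCORE = 65.8


def percent_cheaper(base_price: float, alt_price: float) -> float:
    """Percentage saved by paying alt_price instead of base_price."""
    return (1 - alt_price / base_price) * 100


def relative_score(base_score: float, alt_score: float) -> float:
    """alt_score expressed as a percentage of base_score."""
    return alt_score / base_score * 100


print(f"{percent_cheaper(GPT5_INPUT_PRICE, QWEN_INPUT_PRICE):.0f}% cheaper")  # prints "71% cheaper"
print(f"{relative_score(GPT5_SCORE, QWEN_SCORE):.0f}% as good")               # prints "100% as good"
```

A full cost comparison would also need output prices and the expected input/output token mix for the workload.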

Capabilities

coding: 57.3 (#48 globally)
reasoning: 41.2 (#64 globally)
math: 58.6 (#51 globally)
knowledge: 57.8 (#61 globally)
agentic: 18.3 (#23 globally)

Benchmark Scores

Tested on 26 benchmarks · Ranked across 5 categories

[Chart: Score Distribution (all 233 models), on a 0 to 100 score axis with a marker at this model's position]

coding

Aider polyglot
Multi-language code editing from Aider. Tests editing ability across Python, JavaScript, TypeScript, Java, C++, Go, Rust, and more.
Score: 88.0

SWE-Bench verified
Real-world software engineering tasks from GitHub issues. Models must diagnose bugs and write patches that pass test suites. Human-verified subset of SWE-bench.
Score: 73.5

SWE-Bench Verified (Bash Only)
SWE-bench Verified solved using only bash commands, no specialized frameworks. Tests raw terminal-based problem solving.
Score: 65.0

reasoning

ARC-AGI
Abstraction and Reasoning Corpus. Tests fluid intelligence through novel visual pattern recognition puzzles. Intended as a core measure of general intelligence.
Score: 65.7

SimpleBench
Deceptively simple questions that humans find easy but AI models often get wrong. Tests common sense and reasoning gaps.
Score: 48.0

ARC-AGI-2
A harder sequel to ARC-AGI. More complex abstract reasoning patterns that test generalization ability beyond training data.
Score: 9.9

math

MATH level 5
Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.
Score: 98.1

OTIS Mock AIME 2024-2025
Mock AIME (American Invitational Mathematics Exam) problems from OTIS. Tests mathematical competition performance.
Score: 91.4

FrontierMath-2025-02-28-Private
Original research-level math problems created by professional mathematicians. Problems are unpublished and cannot be memorized.
Score: 32.4

Score legend: Excellent (85+), Good (70-85), Average (50-70), Below (<50)

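Model Family · OpenAI GPT-5

Changes in parentheses are relative to the preceding entry in the list.

GPT-5 (Aug 2025): score 54.4, $1.25/M input, 400K context, 26 benchmarks
GPT-5 Chat (Aug 2025): score 81.9 (+27.5), $1.25/M input, 128K context (-272K), 7 benchmarks
GPT-5 Codex (Sep 2025): $1.25/M input, 400K context (+272K)
GPT-5 Image (Oct 2025): $10.00/M input (+8.75), 400K context
GPT-5 Image Mini (Oct 2025): $2.50/M input (-7.50), 400K context
GPT-5 Mini (Aug 2025): score 56.0 (+56.0; the preceding entry lists no score), $0.25/M input (-2.25), 400K context, 28 benchmarks
GPT-5 Nano (Aug 2025): score 45.3 (-10.7), $0.05/M input (-0.20), 400K context, 26 benchmarks
GPT-5 Pro (Oct 2025): score 43.3 (-2.0), $15.00/M input (+14.95), 400K context, 8 benchmarks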
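
The signed changes shown next to each family entry appear to be differences from the entry listed immediately before it. This is an inference from the figures, not something the page states. The sketch below reproduces the score changes under that assumption, plus a second assumption that a missing score on the preceding entry is treated as 0 (which would explain the +56.0 shown for GPT-5 Mini):

```python
from typing import Optional

# (name, average score) in page order; None where the page lists no score.
family: list[tuple[str, Optional[float]]] = [
    ("GPT-5", 54.4),
    ("GPT-5 Chat", 81.9),
    ("GPT-5 Codex", None),
    ("GPT-5 Image", None),
    ("GPT-5 Image Mini", None),
    ("GPT-5 Mini", 56.0),
    ("GPT-5 Nano", 45.3),
    ("GPT-5 Pro", 43.3),
]


def score_deltas(rows: list[tuple[str, Optional[float]]]) -> dict[str, float]:
    """Return the score change vs. the preceding entry for each scored entry.

    Assumption (not confirmed by the source): a preceding entry without a
    score counts as 0.
    """
    deltas: dict[str, float] = {}
    for (_, prev_score), (name, score) in zip(rows, rows[1:]):
        if score is None:
            continue  # no score listed, so no change to report
        deltas[name] = round(score - (prev_score or 0.0), 1)
    return deltas


for name, delta in score_deltas(family).items():
    print(f"{name}: {delta:+.1f}")
```

Under these assumptions the computed values match the page (+27.5, +56.0, -10.7, -2.0), so the changes should be read as row-to-row differences rather than as comparisons against the base GPT-5 model.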

Frequently Asked Questions

GPT-5 is a proprietary multimodal AI model by OpenAI, released in August 2025. It has an average benchmark score of 65.7. Context window: 400K tokens.

Benchmarks

MATH level 5, Fiction.LiveBench, OTIS Mock AIME 2024-2025, Aider polyglot, Lech Mazur Writing
