How much does GPT-5.1 cost?

GPT-5.1 costs $1.25 per million input tokens and $10.00 per million output tokens. For a typical conversation (~2,000 tokens), that's approximately $0.013 per message.

Is GPT-5.1 open source?

No, GPT-5.1 is a proprietary model by OpenAI.

How does GPT-5.1 compare to o4 Mini?

GPT-5.1 has an average score of 57.0 while o4 Mini scores 57.0. o4 Mini slightly outperforms GPT-5.1 overall. GPT-5.1 costs $1.25/1M input vs o4 Mini at $1.10/1M input. See full comparison →

Home/Models/GPT-5.1

GPT-5.1

Name: GPT-5.1
Price: 1.25 USD
Author: OpenAI

by OpenAI · Released Nov 2025

Multimodal

57.0

avg score

Rank #83

Compare

Better than 64% of all models

Context

400K tokens (~200 books)

Input $/1M

$1.25

Output $/1M

$10.00

Type

multimodal

License

Proprietary

Benchmarks

24 tested

Data updated today

About

GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style compared to GPT-5. It uses adaptive reasoning...

Tested on 24 benchmarks with 49.6% average. Top scores: Chatbot Arena Elo — Overall (1438.5%), Chatbot Arena Elo — Coding (1338.8%), HELM — IFEval (93.5%).

Looking for similar performance at lower cost?
Qwen2.5 7B Instruct scores 57.4 (101% as good) at $0.04/1M input · 97% cheaper

Capabilities

coding

51.2

#64 globally

reasoning

55.1

#41 globally

math

44.6

#94 globally

knowledge

46.3

#117 globally

agentic

17.5

#24 globally

language

93.5

#4 globally

Benchmark Scores

Compare All

Tested on 24 benchmarks · Ranked across 7 categories

Score Distribution (all 233 models)

0255075100

▲ You are here

codingCompare coding →

SWE-Bench verified

Real-world software engineering tasks from GitHub issues. Models must diagnose bugs and write patches that pass test suites. Human-verified subset of SWE-bench.

68.0—

SWE-Bench Verified (Bash Only)

SWE-bench Verified solved using only bash commands, no specialized frameworks. Tests raw terminal-based problem solving.

66.0—

WeirdML

Unusual and adversarial machine learning challenges. Tests robustness of reasoning about edge cases in ML systems.

60.8—

reasoningCompare reasoning →

HELM — WildBench

Stanford HELM WildBench evaluation. Tests reasoning on challenging real-world tasks.

86.3—

ARC-AGI

Abstraction and Reasoning Corpus. Tests fluid intelligence through novel visual pattern recognition puzzles. Core measure of general intelligence.

72.8—

SimpleBench

Deceptively simple questions that humans find easy but AI models often get wrong. Tests common sense and reasoning gaps.

43.8—

mathCompare math →

OTIS Mock AIME 2024-2025

Mock AIME (American Invitational Mathematics Exam) problems from OTIS. Tests mathematical competition performance.

88.6—

HELM — Omni-MATH

Stanford HELM evaluation of mathematical reasoning across diverse problem types.

46.4—

FrontierMath-2025-02-28-Private

Original research-level math problems created by professional mathematicians. Problems are unpublished and cannot be memorized.

31.0—

Quick compare:

vs o4 Mini

vs Palmyra X5

vs Qwen2.5 Coder 14B Instruct

Excellent (85+) Good (70-85) Average (50-70) Below (<50)

Model Family · OpenAI GPT-5.1

GPT-5.1Nov 2025

49.6

$1.25/M in400Kctx24 benchmarks

GPT-5.1 ChatNov 2025

$1.25/M in128Kctx(-272K)

GPT-5.1-CodexNov 2025

68.6+68.6

$1.25/M in400Kctx(+272K)8 benchmarks

GPT-5.1-Codex-MaxDec 2025

72.0+3.4

$1.25/M in400Kctx8 benchmarks

GPT-5.1-Codex-MiniNov 2025

60.4-11.6

$0.25/M in(-1)400Kctx8 benchmarks

See the full GPT-5.1 family →

Similar Models

Qwen2.5 Coder 14B Instruct

Alibaba

57.0TBD

Links

Info

OpenAI Pricing explorer Developers · API

Research

Technical Report

Documentation

API Docs Playground

Community

@OpenAI

BenchGecko API

gpt-5-1

Specifications

Typemultimodal
Context400K tokens (~200 books)
ReleasedNov 2025
LicenseProprietary
StatusActive
Cost / Message~$0.013

Available On

OpenAI$1.25

Frequently Asked Questions

GPT-5.1 is a proprietary multimodal AI model by OpenAI, released in November 2025. It has an average benchmark score of 57.0. Context window: 400K tokens.

Benchmarks

Chatbot Arena Elo — Overall Chatbot Arena Elo — Coding HELM — IFEval OTIS Mock AIME 2024-2025 HELM — WildBench

OpenAI · Provider OpenAI · Economy All Models Compare Models Pricing Developers · API Context Window · Glossary

GPT-5.1

Frequently Asked Questions

Related Models

Benchmarks

Related Pages