
GPT-5.4

by OpenAI · Released Mar 2026

83.4 avg score · Rank #17 · Better than 93% of all models
Context: 1.1M tokens (~525 books)
Input $/1M: $2.50
Output $/1M: $15.00
Type: multimodal
License: Proprietary
Benchmarks: 16 tested
About

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K output) with support for...

Tested on 16 benchmarks with a 59.0% average. Top scores: Chatbot Arena Elo — Overall (1465.8), OTIS Mock AIME 2024-2025 (95.3%), ARC-AGI (93.7%).

Looking for similar performance at lower cost?
MiMo-V2-Flash scores 81.7 (98% as good) at $0.09/1M input · 96% cheaper
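The comparison figures above are simple ratios of the listed numbers; a quick sketch verifying them (scores and input prices taken from this page):

```python
# Verify the "98% as good" and "96% cheaper" claims from the listed figures.
gpt_score, gpt_input_price = 83.4, 2.50      # GPT-5.4: avg score, $/1M input tokens
mimo_score, mimo_input_price = 81.7, 0.09    # MiMo-V2-Flash: avg score, $/1M input tokens

relative_quality = mimo_score / gpt_score            # ~0.98
savings = 1 - mimo_input_price / gpt_input_price     # ~0.96

print(f"{relative_quality:.0%} as good, {savings:.0%} cheaper")  # → 98% as good, 96% cheaper
```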
Capabilities
  • coding: 67.2 (#22 globally)
  • reasoning: 83.8 (#4 globally)
  • math: 56.7 (#56 globally)
  • knowledge: 50.0 (#101 globally)
  • agentic: 35.9 (#11 globally)
  • speed: 102.1 (#1 globally)
Benchmark Scores
Tested on 16 benchmarks · Ranked across 7 categories
[Score distribution chart (all 233 models, 0–100 scale) with GPT-5.4's position marked]
SWE-bench Verified: 76.9

Real-world software engineering tasks from GitHub issues. Models must diagnose bugs and write patches that pass test suites. Human-verified subset of SWE-bench.

WeirdML: 57.4

Unusual and adversarial machine learning challenges. Tests robustness of reasoning about edge cases in ML systems.

ARC-AGI: 93.7

Abstraction and Reasoning Corpus. Tests fluid intelligence through novel visual pattern-recognition puzzles. Core measure of general intelligence.

ARC-AGI-2: 74.0

Harder sequel to ARC-AGI. More complex abstract reasoning patterns that test generalization ability beyond training data.

OTIS Mock AIME 2024-2025: 95.3

Mock AIME (American Invitational Mathematics Examination) problems from OTIS. Tests mathematical competition performance.

FrontierMath-2025-02-28-Private: 47.6

Original research-level math problems created by professional mathematicians. Problems are unpublished and cannot be memorized.

FrontierMath-Tier-4-2025-07-01-Private: 27.1

Hardest tier of FrontierMath. Problems at the frontier of human mathematical ability, many unsolved by most mathematicians.
Score bands: Excellent (85+) · Good (70–85) · Average (50–70) · Below (<50)
Recent Updates
GPT-5.4 standard tier released by OpenAI
Mar 14, 2026
Links
Documentation
Community
BenchGecko API
gpt-5-4
Specifications
  • Type: multimodal
  • Context: 1.1M tokens (~525 books)
  • Released: Mar 2026
  • License: Proprietary
  • Status: Active
  • Cost / Message: ~$0.020
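The ~$0.020 per-message figure follows from the per-token prices listed above. As a rough sketch, assuming a hypothetical ~2,000 input / ~1,000 output tokens per message (the page does not state the token counts behind its estimate):

```python
# Estimate per-message cost from per-million-token prices (from this page).
INPUT_PRICE_PER_M = 2.50    # $ per 1M input tokens
OUTPUT_PRICE_PER_M = 15.00  # $ per 1M output tokens

def message_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed prices."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# With the assumed ~2,000 input and ~1,000 output tokens per message:
print(f"${message_cost(2000, 1000):.3f}")  # → $0.020
```

Note that output tokens dominate the total at these prices: $0.015 of the $0.020 comes from the 1,000 output tokens.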
Available On
OpenAI · $2.50/1M input