GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads. It supports text and image inputs with strong performance across reasoning, coding,...
Tested on 23 benchmarks, with an average score of 38.3%. Top scores: OTIS Mock AIME 2024-2025 (87.2%), GPQA diamond (78.1%), LiveBench — Coding (74.7%).
GLM 4 32B scores 37.8 (101% as good) at $0.10/1M input · 87% cheaper
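As a rough illustration of the arithmetic behind a "% cheaper" figure, the sketch below works backwards from the two numbers quoted above ($0.10 per 1M input tokens, 87% cheaper) to the input price they would imply for the reference model. The function names are hypothetical. The result is only an estimate: the 87% is presumably rounded, and this section does not state the reference model's actual price.

```python
def fraction_cheaper(base_price: float, alt_price: float) -> float:
    """Fraction by which alt_price undercuts base_price (0.87 means 87% cheaper)."""
    if base_price <= 0:
        raise ValueError("base_price must be positive")
    return 1.0 - alt_price / base_price


def implied_base_price(alt_price: float, saving: float) -> float:
    """Invert fraction_cheaper: the base price implied by a quoted saving."""
    if not 0 <= saving < 1:
        raise ValueError("saving must be in [0, 1)")
    return alt_price / (1.0 - saving)


# Figures quoted in the comparison line; everything derived from them is an estimate.
alt_input_price = 0.10   # USD per 1M input tokens
quoted_saving = 0.87     # "87% cheaper" (assumed to be rounded)

base = implied_base_price(alt_input_price, quoted_saving)
print(f"Implied reference input price: ~${base:.2f} per 1M tokens")
```

Because of rounding, any reference price within roughly a cent or two of this value would produce the same quoted percentage.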
Regularly refreshed coding problems that avoid data contamination. New problems added monthly to prevent memorization.
Unusual and adversarial machine learning challenges. Tests robustness of reasoning about edge cases in ML systems.
LiveBench coding tasks that require multi-step reasoning and tool use. Tests planning and execution of complex coding workflows.
Abstraction and Reasoning Corpus. Tests fluid intelligence through novel visual pattern recognition puzzles. Core measure of general intelligence.
Fresh data analysis tasks testing ability to interpret tables, charts, and statistical data.
Regularly refreshed reasoning problems testing logical deduction, spatial reasoning, and analytical thinking.
Mock AIME (American Invitational Mathematics Exam) problems from OTIS. Tests mathematical competition performance.
Regularly updated math problems that test numerical reasoning, algebra, calculus, and combinatorics.
- Type: multimodal
- Context: 400K tokens (~200 books)
- Released: Mar 2026
- License: Proprietary
- Status: Active
- Cost / Message: ~$0.006