GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and high-volume tasks. It supports text and image inputs and is designed for low-latency...
Tested on 23 benchmarks with 33.6% average. Top scores: OTIS Mock AIME 2024-2025 (87.8%), GPQA diamond (71.3%), LiveBench — Coding (61.9%).
Hunyuan A13B Instruct scores 29.3 (100% as good) at $0.14 per 1M input tokens, 30% cheaper.
Regularly refreshed coding problems that avoid data contamination. New problems added monthly to prevent memorization.
Unusual and adversarial machine learning challenges. Tests robustness of reasoning about edge cases in ML systems.
LiveBench coding tasks that require multi-step reasoning and tool use. Tests planning and execution of complex coding workflows.
Abstraction and Reasoning Corpus. Tests fluid intelligence through novel visual pattern recognition puzzles. Often treated as a core measure of general intelligence.
Fresh data analysis tasks testing ability to interpret tables, charts, and statistical data.
Regularly refreshed reasoning problems testing logical deduction, spatial reasoning, and analytical thinking.
Mock AIME (American Invitational Mathematics Examination) problems from OTIS. Tests mathematical competition performance.
Regularly updated math problems that test numerical reasoning, algebra, calculus, and combinatorics.
- Type: multimodal
- Context: 400K tokens (~200 books)
- Released: Mar 2026
- License: Proprietary
- Status: Active
- Cost / Message: ~$0.002
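
A per-message cost like the one listed above can be related to a per-token price with simple arithmetic. The following is a minimal sketch under stated assumptions, not the page's actual method: the function name `message_cost_usd`, the 2,000-token message size, and the choice to count input tokens only are hypothetical. The only figure taken from the page is the $0.14 per 1M input tokens quoted for Hunyuan A13B Instruct.

```python
def message_cost_usd(input_tokens: int, price_per_million_usd: float) -> float:
    """Return the input-side cost in USD of a single message.

    Output tokens are ignored here; real pricing usually bills them
    separately, so this is a lower bound on the true cost.
    """
    if input_tokens < 0 or price_per_million_usd < 0:
        raise ValueError("token count and price must be non-negative")
    return input_tokens * price_per_million_usd / 1_000_000


# Price quoted on the page for Hunyuan A13B Instruct (USD per 1M input tokens).
HUNYUAN_INPUT_PRICE_USD = 0.14

# Hypothetical message size, chosen only for illustration.
ASSUMED_INPUT_TOKENS = 2_000

cost = message_cost_usd(ASSUMED_INPUT_TOKENS, HUNYUAN_INPUT_PRICE_USD)
print(f"Estimated input cost per message: ${cost:.5f}")
```

Because the estimate scales linearly with the token count, doubling the assumed message size doubles the result. A figure that also covers output tokens would need the output price, which this page does not state.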