GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long context performance over GPT-5 Pro. It is optimized for complex tasks that require step-by-step reasoning,...
Tested on 4 benchmarks with 56.2% average. Top scores: ARC-AGI (90.5%), ARC-AGI-2 (54.2%), SimpleBench (48.9%).
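The summary above names only three of the four scores, so the remaining one can be back-calculated from the reported average. The sketch below is a rough check, assuming the 56.2% figure is a plain unweighted mean of all four benchmarks (the page does not say how it is computed). The function name `implied_remaining_score` is hypothetical, and because the average is rounded to one decimal, the result is only accurate to within a few tenths of a point.

```python
# Scores listed in the summary line (percent).
top_scores = {"ARC-AGI": 90.5, "ARC-AGI-2": 54.2, "SimpleBench": 48.9}
reported_average = 56.2   # percent, as shown on the page
benchmark_count = 4       # "Tested on 4 benchmarks"


def implied_remaining_score(scores, average, count):
    """Back out the one unlisted score, ASSUMING an unweighted mean."""
    total = average * count                 # sum of all scores implied by the mean
    return round(total - sum(scores.values()), 1)


remaining = implied_remaining_score(top_scores, reported_average, benchmark_count)
print(f"Implied fourth benchmark score: {remaining}%")
```

Under that assumption the unlisted score comes out to roughly 31.2%, which is consistent with it not appearing among the top three.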
Llama 3.3 70B Instruct scores 75.9 (100% as good) at $0.10/1M input · 100% cheaper
ARC-AGI (Abstraction and Reasoning Corpus): tests fluid intelligence through novel visual pattern-recognition puzzles. Intended as a core measure of general intelligence.
ARC-AGI-2: a harder sequel to ARC-AGI, with more complex abstract reasoning patterns that test generalization beyond the training data.
SimpleBench: deceptively simple questions that humans find easy but AI models often get wrong. Tests common sense and exposes reasoning gaps.
FrontierMath (hardest tier): problems at the frontier of human mathematical ability, many of which most mathematicians cannot solve.
- Type: multimodal
- Context: 400K tokens (~200 books)
- Released: Dec 2025
- License: Proprietary
- Status: Active
- Cost / Message: ~$0.210