GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of [GPT-4 Turbo](/models/openai/gpt-4-turbo) while being twice as...
Tested on 8 benchmarks, with a 51.1% average score. Top scores: Chatbot Arena Elo, Overall (1345.1); ScienceQA (84.7%); MMLU (78.9%).
gpt-oss-20b (free) scores 61.0 (100% as good) at $0.00/1M input · 100% cheaper
Code editing benchmark from the Aider project. Measures ability to apply targeted code changes while maintaining correctness and style.
Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.
Mock AIME (American Invitational Mathematics Exam) problems from OTIS. Tests mathematical competition performance.
Science questions with multimodal context including diagrams and charts from K-12 curriculum.
Massive Multitask Language Understanding. 57 subjects from STEM, humanities, and social sciences. The most widely-cited knowledge benchmark.
Broad Assessment of Language and Reasoning Over Games. Tests strategic and logical reasoning through game scenarios.
- Type: multimodal
- Context: 128K tokens (~64 books)
- Released: May 2024
- License: Proprietary
- Status: Active
- Cost / Message: ~$0.025