
GPT-4o-mini (2024-07-18)

by OpenAI · Released Jul 2024

Avg score: 33.4 (Rank #168 · better than 28% of all models)
Type: multimodal
Context: 128K tokens (roughly one full-length novel)
Input: $0.15 / 1M tokens
Output: $0.60 / 1M tokens
License: Proprietary
Benchmarks: 20 tested
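At the listed prices ($0.15 per 1M input tokens, $0.60 per 1M output tokens), per-request cost is simple arithmetic. A minimal sketch, with the function name and defaults being illustrative rather than part of any API:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: float = 0.15,
                  output_price: float = 0.60) -> float:
    """Estimate USD cost of one request at per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000
```

A typical short chat turn (say ~1,000 input and ~500 output tokens) works out to about $0.00045, which is on the order of the ~$0.001 per-message figure in the specifications.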
About

GPT-4o mini is OpenAI's newest model, following [GPT-4 Omni](/models/openai/gpt-4o); it supports both text and image inputs with text outputs. As their most advanced small model, it is many times more affordable...

Tested on 20 benchmarks with a 43.2% average. Top scores: Chatbot Arena Elo — Overall (1317.2), GSM8K (91.3%), HELM — WildBench (79.1%).

Looking for similar performance at lower cost?
Llama 3.1 8B Instruct scores 34.3 (103% as good) at $0.02/1M input · 87% cheaper
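The callout's "103% as good" and "87% cheaper" figures follow directly from the two models' listed averages and per-1M input prices; a quick sketch to reproduce the rounding:

```python
# Figures taken from this page: average scores and $/1M-input prices.
gpt4o_mini_score, llama_score = 33.4, 34.3
gpt4o_mini_price, llama_price = 0.15, 0.02

relative_quality = llama_score / gpt4o_mini_score  # ~1.03, i.e. "103% as good"
savings = 1 - llama_price / gpt4o_mini_price       # ~0.87, i.e. "87% cheaper"
```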
Capabilities
  • coding: 7.7 (#139 globally)
  • reasoning: 39.6 (#67 globally)
  • math: 44.7 (#93 globally)
  • knowledge: 46.3 (#118 globally)
  • multimodal: 53.1 (#7 globally)
  • language: 78.2 (#54 globally)
Benchmark Scores
Tested on 20 benchmarks · Ranked across 7 categories
[Score distribution chart across all 233 models · this model's position marked]
WeirdML · 11.8

Unusual and adversarial machine learning challenges. Tests robustness of reasoning about edge cases in ML systems.

Aider polyglot · 3.6

Multi-language code editing from Aider. Tests editing ability across Python, JavaScript, TypeScript, Java, C++, Go, Rust, and more.

HELM — WildBench · 79.1

Stanford HELM WildBench evaluation. Tests reasoning on challenging real-world tasks.

ARC-AGI-2 · 0.1

ARC-AGI 2, harder sequel to ARC. More complex abstract reasoning patterns that test generalization ability beyond training data.

GSM8K · 91.3

Grade school math word problems. 8,500 problems testing multi-step arithmetic reasoning. A foundational math benchmark.

MATH level 5 · 52.6

Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.

HELM — Omni-MATH · 28.0

Stanford HELM evaluation of mathematical reasoning across diverse problem types.
Legend: Excellent (85+) · Good (70-85) · Average (50-70) · Below (<50)
Links
Documentation
Community
BenchGecko API
gpt-4o-mini-2024-07-18
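For anyone calling the model itself, the pinned snapshot id above is what goes in the `model` field of a chat-completion request. A hypothetical sketch that only assembles the payload (field names follow the OpenAI Chat Completions API; `build_request` is an illustrative helper, and no request is actually sent):

```python
MODEL_ID = "gpt-4o-mini-2024-07-18"  # pinned snapshot from this page

def build_request(prompt: str, max_tokens: int = 256) -> dict:
    """Assemble a chat-completion request body for the pinned snapshot."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
```

Pinning the dated snapshot rather than the `gpt-4o-mini` alias keeps results reproducible as the alias is updated.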
Specifications
  • Type: multimodal
  • Context: 128K tokens (roughly one full-length novel)
  • Released: Jul 2024
  • License: Proprietary
  • Status: Active
  • Cost / Message: ~$0.001
Available On
OpenAI · $0.15 / 1M input tokens
GPT-4o-mini (2024-07-18) is a proprietary multimodal AI model by OpenAI, released in July 2024. It has an average benchmark score of 33.4. Context window: 128K tokens.