GPT-4o (2024-11-20)

By OpenAI · Released 2024-11-20

Average score: 37.7
Input price: $2.50 per 1M tokens
Output price: $10.00 per 1M tokens
Context window: 128K tokens (~64 books)
Type: multimodal
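
The listed prices turn into a per-request cost with simple arithmetic. The sketch below assumes billing is linear in token count at the rates shown on this page; the function name `estimate_cost` and the example token counts are hypothetical, and actual billing may differ (for example through caching or batch discounts).

```python
# Prices as listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 10.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Hypothetical helper: assumes cost scales linearly with token counts
    at the listed rates, with no discounts or minimum charges.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 1,000-token reply.
    cost = estimate_cost(10_000, 1_000)
    print(f"${cost:.4f}")  # prints $0.0350
```

Under these assumptions, output tokens cost four times as much as input tokens, so long replies tend to dominate the bill.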

Tested on 28 benchmarks, with an average score of 37.7%. Top scores: ScienceQA (84.7%), HELM — WildBench (82.8%), Lech Mazur Writing (81.8%).

| Benchmark | Category | Score (%) |
| --- | --- | --- |
| ScienceQA | knowledge | 84.7 |
| HELM — WildBench | reasoning | 82.8 |
| Lech Mazur Writing | knowledge | 81.8 |
| HELM — IFEval | language | 81.7 |
| MMLU | knowledge | 79.1 |
| Aider — Code Editing | coding | 71.4 |
| HELM — MMLU-Pro | knowledge | 71.3 |
| GeoBench | knowledge | 71.0 |
| VideoMME | multimodal | 62.5 |
| MATH level 5 | math | 53.3 |
| HELM — GPQA | knowledge | 52.0 |
| Balrog | knowledge | 32.3 |
| GPQA diamond | knowledge | 32.3 |
| SWE-Bench verified | coding | 31.0 |
| HELM — Omni-MATH | math | 29.3 |
| CadEval | coding | 26.0 |
| WeirdML | coding | 25.1 |
| Aider polyglot | coding | 23.1 |
| SWE-Bench Verified (Bash Only) | coding | 21.6 |
| Cybench | coding | 12.5 |
| VPCT | knowledge | 10.0 |
| The Agent Company | agentic | 8.6 |
| OTIS Mock AIME 2024-2025 | math | 6.3 |
| ARC-AGI | reasoning | 4.5 |
| SimpleBench | reasoning | 1.4 |
| FrontierMath-2025-02-28-Private | math | 0.3 |
| GSO-Bench | coding | 0.1 |
| ARC-AGI-2 | reasoning | 0.1 |
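
The page does not say how the 37.7 average is computed. The sketch below assumes it is an unweighted mean of the 28 listed scores, which is consistent with the reported figure, and also groups the scores by category. The per-category breakdown is an illustration and does not appear on the page; the names `ROWS`, `overall`, and `by_category` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (benchmark, category, score in %) copied from the table above.
ROWS = [
    ("ScienceQA", "knowledge", 84.7),
    ("HELM — WildBench", "reasoning", 82.8),
    ("Lech Mazur Writing", "knowledge", 81.8),
    ("HELM — IFEval", "language", 81.7),
    ("MMLU", "knowledge", 79.1),
    ("Aider — Code Editing", "coding", 71.4),
    ("HELM — MMLU-Pro", "knowledge", 71.3),
    ("GeoBench", "knowledge", 71.0),
    ("VideoMME", "multimodal", 62.5),
    ("MATH level 5", "math", 53.3),
    ("HELM — GPQA", "knowledge", 52.0),
    ("Balrog", "knowledge", 32.3),
    ("GPQA diamond", "knowledge", 32.3),
    ("SWE-Bench verified", "coding", 31.0),
    ("HELM — Omni-MATH", "math", 29.3),
    ("CadEval", "coding", 26.0),
    ("WeirdML", "coding", 25.1),
    ("Aider polyglot", "coding", 23.1),
    ("SWE-Bench Verified (Bash Only)", "coding", 21.6),
    ("Cybench", "coding", 12.5),
    ("VPCT", "knowledge", 10.0),
    ("The Agent Company", "agentic", 8.6),
    ("OTIS Mock AIME 2024-2025", "math", 6.3),
    ("ARC-AGI", "reasoning", 4.5),
    ("SimpleBench", "reasoning", 1.4),
    ("FrontierMath-2025-02-28-Private", "math", 0.3),
    ("GSO-Bench", "coding", 0.1),
    ("ARC-AGI-2", "reasoning", 0.1),
]

# Assumption: the headline average is a plain, unweighted mean.
overall = mean(score for _, _, score in ROWS)

# Group scores by category to see where the model is strong or weak.
by_category = defaultdict(list)
for _, category, score in ROWS:
    by_category[category].append(score)

if __name__ == "__main__":
    print(f"{len(ROWS)} benchmarks, mean {overall:.1f}")  # 28 benchmarks, mean 37.7
    for category, scores in sorted(by_category.items()):
        print(f"{category}: n={len(scores)}, mean={mean(scores):.1f}")
```

Because the mean is unweighted, several near-zero results on the hardest benchmarks pull the headline figure well below the scores at the top of the table.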