GPT-4o (2024-11-20)

Developer: OpenAI · Release date: 2024-11-20

Average score: 37.7
Input price: $2.50/1M tokens
Output price: $10.00/1M tokens
Context window: 128K tokens (~64 books)
Type: multimodal
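
The listed prices are per one million tokens, so the cost of a single request can be estimated with simple arithmetic. The following is a minimal sketch, assuming cost scales linearly with token counts and ignoring any discounts the page does not list; the function name `estimate_cost_usd` and the example token counts are hypothetical.

```python
# Prices as listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 10.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Linear estimate: (tokens / 1,000,000) * price per 1M tokens."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 10,000 input tokens and 1,000 output tokens.
print(f"${estimate_cost_usd(10_000, 1_000):.4f}")  # $0.0350
```

Output tokens cost four times as much as input tokens at these rates, so long generations tend to dominate the bill.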

Tested on 28 benchmarks, with an average score of 37.7%. Top scores: ScienceQA (84.7%), HELM — WildBench (82.8%), Lech Mazur Writing (81.8%).

Benchmark                        Category    Score
ScienceQA                        knowledge   84.7
HELM — WildBench                 reasoning   82.8
Lech Mazur Writing               knowledge   81.8
HELM — IFEval                    language    81.7
MMLU                             knowledge   79.1
Aider — Code Editing             coding      71.4
HELM — MMLU-Pro                  knowledge   71.3
GeoBench                         knowledge   71.0
VideoMME                         multimodal  62.5
MATH level 5                     math        53.3
HELM — GPQA                      knowledge   52.0
Balrog                           knowledge   32.3
GPQA diamond                     knowledge   32.3
SWE-Bench verified               coding      31.0
HELM — Omni-MATH                 math        29.3
CadEval                          coding      26.0
WeirdML                          coding      25.1
Aider polyglot                   coding      23.1
SWE-Bench Verified (Bash Only)   coding      21.6
Cybench                          coding      12.5
VPCT                             knowledge   10.0
The Agent Company                agentic     8.6
OTIS Mock AIME 2024-2025         math        6.3
ARC-AGI                          reasoning   4.5
SimpleBench                      reasoning   1.4
FrontierMath-2025-02-28-Private  math        0.3
GSO-Bench                        coding      0.1
ARC-AGI-2                        reasoning   0.1
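
The page does not say how the 37.7 average is computed. It appears to be the plain unweighted mean of the 28 scores listed above, which is an assumption that can be checked with a short sketch using the scores copied from the table:

```python
from statistics import mean

# Scores copied from the benchmark table, in the same order.
scores = [
    84.7, 82.8, 81.8, 81.7, 79.1, 71.4, 71.3, 71.0, 62.5, 53.3,
    52.0, 32.3, 32.3, 31.0, 29.3, 26.0, 25.1, 23.1, 21.6, 12.5,
    10.0, 8.6, 6.3, 4.5, 1.4, 0.3, 0.1, 0.1,
]

# Assumption: the reported average is an unweighted mean over all benchmarks.
average = mean(scores)
print(len(scores), round(average, 1))  # 28 37.7
```

Under that assumption every benchmark counts equally, so the handful of near-zero results pulls the average well below the top scores.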