
o3

Developer: OpenAI · Release date: 2025-04-16

Average score: 55.2%
Input price: $2.00 / 1M tokens
Output price: $8.00 / 1M tokens
Context window: 200K tokens (~100 books)
Type: multimodal
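
As a rough illustration of how the listed prices translate into per-request cost, here is a minimal sketch. The function name `estimate_cost` and the token counts in the usage line are hypothetical; only the two per-million-token prices come from the listing above, and real billing may differ (for example through caching or batch discounts).

```python
# Prices taken from the listing above (USD per 1M tokens).
INPUT_PRICE_PER_M = 2.00
OUTPUT_PRICE_PER_M = 8.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes simple linear pricing with no discounts or minimum charges.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative request: 100,000 input tokens and 10,000 output tokens.
    print(f"${estimate_cost(100_000, 10_000):.2f}")
```

Under these assumptions, output tokens cost four times as much as input tokens, so long generations dominate the bill even when the prompt is large.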

Tested on 33 benchmarks, with an average score of 55.2%. Top scores: MATH level 5 (97.8%), Fiction.LiveBench (88.9%), HELM — IFEval (86.9%).

| Benchmark | Category | Score (%) |
|---|---|---|
| MATH level 5 | math | 97.8 |
| Fiction.LiveBench | knowledge | 88.9 |
| HELM — IFEval | language | 86.9 |
| HELM — WildBench | reasoning | 86.1 |
| HELM — MMLU-Pro | knowledge | 85.9 |
| Lech Mazur Writing | knowledge | 83.9 |
| OTIS Mock AIME 2024-2025 | math | 83.9 |
| Aider polyglot | coding | 81.3 |
| GPQA diamond | knowledge | 75.8 |
| HELM — GPQA | knowledge | 75.3 |
| CadEval | coding | 74.0 |
| GeoBench | knowledge | 74.0 |
| HELM — Omni-MATH | math | 71.4 |
| SWE-Bench verified | coding | 62.3 |
| ARC-AGI | reasoning | 60.8 |
| SWE-Bench Verified (Bash Only) | coding | 58.4 |
| SimpleQA Verified | knowledge | 53.0 |
| WeirdML | coding | 52.4 |
| Professional Reasoning — Legal | knowledge | 48.6 |
| Professional Reasoning — Finance | knowledge | 47.7 |
| DeepResearch Bench | knowledge | 46.6 |
| SimpleBench | reasoning | 43.7 |
| Artificial Analysis — Coding Index | speed | 38.4 |
| Artificial Analysis — Quality Index | speed | 38.4 |
| Artificial Analysis — Agentic Index | speed | 36.1 |
| VPCT | knowledge | 28.0 |
| OSWorld | agentic | 23.0 |
| FrontierMath-2025-02-28-Private | math | 18.7 |
| HLE | knowledge | 16.3 |
| EnigmaEval | knowledge | 13.1 |
| GSO-Bench | coding | 8.8 |
| ARC-AGI-2 | reasoning | 6.5 |
| FrontierMath-Tier-4-2025-07-01-Private | math | 2.1 |