Beta

Grok 4

Provider: xAI · Release date: 2025-07-09

Average score: 54.8%
Input price: $3.00/1M tokens
Output price: $15.00/1M tokens
Context window: 256K tokens (~128 books)
Type: multimodal
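To make the listed prices concrete, here is a minimal Python sketch that estimates the cost of a single request. It assumes flat, linear per-token pricing at the rates shown on this page; the function name `estimate_cost` and the example token counts are hypothetical, and real billing may differ (for example, tiered or cached-token pricing is not modeled here).

```python
# Prices per 1M tokens as listed on this page (USD).
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD, assuming flat linear pricing (an assumption)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: a 200K-token prompt with a 4K-token reply.
cost = estimate_cost(200_000, 4_000)
print(f"${cost:.2f}")  # prints $0.66
```

Because output tokens cost five times as much as input tokens at these rates, long replies affect the bill more than long prompts of the same length.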

Tested on 24 benchmarks, with an average score of 54.8%. Top scores: HELM — IFEval (94.9%), Fiction.LiveBench (94.4%), HELM — MMLU-Pro (85.1%).

| Benchmark | Category | Score (%) |
| --- | --- | --- |
| HELM — IFEval | language | 94.9 |
| Fiction.LiveBench | knowledge | 94.4 |
| HELM — MMLU-Pro | knowledge | 85.1 |
| OTIS Mock AIME 2024-2025 | math | 84.0 |
| GPQA diamond | knowledge | 82.7 |
| Lech Mazur Writing | knowledge | 80.7 |
| HELM — WildBench | reasoning | 79.7 |
| Aider polyglot | coding | 79.6 |
| HELM — GPQA | knowledge | 72.6 |
| ARC-AGI | reasoning | 66.7 |
| HELM — Omni-MATH | math | 60.3 |
| SimpleBench | reasoning | 52.6 |
| DeepResearch Bench | knowledge | 47.9 |
| SimpleQA Verified | knowledge | 47.9 |
| WeirdML | coding | 45.7 |
| GeoBench | knowledge | 45.0 |
| Balrog | knowledge | 43.6 |
| Cybench | coding | 43.0 |
| Chess Puzzles | knowledge | 28.0 |
| Terminal Bench | coding | 27.2 |
| FrontierMath-2025-02-28-Private | math | 19.7 |
| ARC-AGI-2 | reasoning | 16.0 |
| APEX-Agents | agentic | 15.2 |
| FrontierMath-Tier-4-2025-07-01-Private | math | 2.1 |
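The page does not say how the average score is computed. As a sanity check, the Python sketch below takes the unweighted mean of the 24 listed scores, which reproduces the reported 54.8; the per-category means are an extra illustration and are not figures reported on the page. The variable names are hypothetical.

```python
from collections import defaultdict
from statistics import fmean

# (category, score) pairs copied from the benchmark table, in table order.
SCORES = [
    ("language", 94.9), ("knowledge", 94.4), ("knowledge", 85.1),
    ("math", 84.0), ("knowledge", 82.7), ("knowledge", 80.7),
    ("reasoning", 79.7), ("coding", 79.6), ("knowledge", 72.6),
    ("reasoning", 66.7), ("math", 60.3), ("reasoning", 52.6),
    ("knowledge", 47.9), ("knowledge", 47.9), ("coding", 45.7),
    ("knowledge", 45.0), ("knowledge", 43.6), ("coding", 43.0),
    ("knowledge", 28.0), ("coding", 27.2), ("math", 19.7),
    ("reasoning", 16.0), ("agentic", 15.2), ("math", 2.1),
]

# Unweighted mean over all benchmarks (assumed to be the page's method).
overall = fmean(score for _, score in SCORES)
print(round(overall, 1))  # prints 54.8

# Group scores by category and average each group.
grouped = defaultdict(list)
for category, score in SCORES:
    grouped[category].append(score)
by_category = {category: fmean(values) for category, values in grouped.items()}

for category, mean_score in sorted(by_category.items()):
    print(f"{category}: {mean_score:.1f} (n={len(grouped[category])})")
```

Categories with a single benchmark (language, agentic) simply echo that one score, so their means should be read with caution.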