Grok 4

by xAI · Released 2025-07-09

47.8

Average score (%)

$3.00 per 1M tokens

Input price

$15.00 per 1M tokens

Output price

256K tokens (~128 books)

Context window

multimodal

Type
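As a rough illustration of how the input and output prices listed above combine into a per-request cost, here is a minimal sketch. It assumes billing is strictly linear in token count and ignores any caching discounts, tiered rates, or extra fees, none of which this page describes. The function name `estimate_cost` and the example token counts are hypothetical.

```python
# Prices copied from the card above, in USD per 1 million tokens.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, assuming linear per-token pricing."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: a 200K-token prompt with a 4K-token reply.
print(f"${estimate_cost(200_000, 4_000):.2f}")  # prints "$0.66"
```

Because output tokens cost five times as much as input tokens at these rates, long replies can dominate the bill even when the prompt is much larger.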

Tested on 17 benchmarks, with an average score of 47.8%. Top scores: Fiction.LiveBench (94.4%), OTIS Mock AIME 2024-2025 (84.0%), and GPQA diamond (82.7%).

Benchmark scores

Benchmark	Category	Score (%)
Fiction.LiveBench	knowledge	94.4
OTIS Mock AIME 2024-2025	math	84.0
GPQA diamond	knowledge	82.7
Lech Mazur Writing	knowledge	80.7
Aider polyglot	coding	79.6
SimpleBench	reasoning	52.6
SimpleQA Verified	knowledge	47.9
DeepResearch Bench	knowledge	47.9
WeirdML	coding	45.7
GeoBench	knowledge	45.0
Balrog	knowledge	43.6
Chess Puzzles	knowledge	28.0
Terminal Bench	coding	27.2
FrontierMath-2025-02-28-Private	math	19.7
ARC-AGI-2	reasoning	16.0
APEX-Agents	agentic	15.2
FrontierMath-Tier-4-2025-07-01-Private	math	2.1
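The 47.8 average and the three top scores quoted earlier can be reproduced from the table. The sketch below assumes the average is an unweighted arithmetic mean over all 17 rows; the page does not state how it is computed, but the numbers are consistent with that assumption. The variable names are illustrative.

```python
# Scores copied from the table above (percent).
scores = {
    "Fiction.LiveBench": 94.4,
    "OTIS Mock AIME 2024-2025": 84.0,
    "GPQA diamond": 82.7,
    "Lech Mazur Writing": 80.7,
    "Aider polyglot": 79.6,
    "SimpleBench": 52.6,
    "SimpleQA Verified": 47.9,
    "DeepResearch Bench": 47.9,
    "WeirdML": 45.7,
    "GeoBench": 45.0,
    "Balrog": 43.6,
    "Chess Puzzles": 28.0,
    "Terminal Bench": 27.2,
    "FrontierMath-2025-02-28-Private": 19.7,
    "ARC-AGI-2": 16.0,
    "APEX-Agents": 15.2,
    "FrontierMath-Tier-4-2025-07-01-Private": 2.1,
}

# Unweighted mean across every benchmark (assumed aggregation method).
average = sum(scores.values()) / len(scores)

# Three highest-scoring benchmarks, best first.
top3 = sorted(scores.items(), key=lambda item: item[1], reverse=True)[:3]

print(len(scores), round(average, 1))
for name, score in top3:
    print(f"{name}: {score}")
```

An unweighted mean treats a near-saturated benchmark and a frontier one equally, so the single figure hides the wide spread between 94.4 and 2.1.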