Gemini 1.5 Pro (Feb 2024)

by Google DeepMind · Released 2024-01-01

Average score: 41.3
Input price: N/A
Output price: N/A
Context window: N/A
Type: text

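Tested on 20 benchmarks, with a reported average score of 41.3. Top scores: Chatbot Arena Elo - Overall (1322.5, an Elo rating rather than a percentage), HELM - IFEval (83.7%), HELM - WildBench (81.3%). The 41.3 figure equals the mean of the 19 percentage-scored benchmarks, so the Elo rating appears to be excluded from the average.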

Benchmark                     Category     Score
Chatbot Arena Elo - Overall   arena        1322.5
HELM - IFEval                 language     83.7
HELM - WildBench              reasoning    81.3
BBH                           reasoning    78.7
MMLU                          knowledge    76.9
HELM - MMLU-Pro               knowledge    73.7
VideoMME                      multimodal   66.7
Aider - Code Editing          coding       57.1
HELM - GPQA                   knowledge    53.4
MATH level 5                  math         40.8
HELM - Omni-MATH              math         36.4
CadEval                       coding       34.0
GPQA diamond                  knowledge    27.8
WeirdML                       coding       22.2
Balrog                        knowledge    21.0
SimpleBench                   reasoning    12.5
Cybench                       coding       7.5
OTIS Mock AIME 2024-2025      math         6.7
The Agent Company             agentic      3.4
ARC-AGI-2                     reasoning    0.8
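
The reported 41.3 average can be checked against the rows above. The following is a minimal sketch, assuming the scores are transcribed exactly as listed. The helper name `percentage_average` and the choice to exclude the `arena` category are illustrative assumptions, not the site's documented method. They are made here because the Elo rating is on a different scale from the percentage scores.

```python
from statistics import mean

# (benchmark, category, score) rows transcribed from the table above.
ROWS = [
    ("Chatbot Arena Elo - Overall", "arena", 1322.5),
    ("HELM - IFEval", "language", 83.7),
    ("HELM - WildBench", "reasoning", 81.3),
    ("BBH", "reasoning", 78.7),
    ("MMLU", "knowledge", 76.9),
    ("HELM - MMLU-Pro", "knowledge", 73.7),
    ("VideoMME", "multimodal", 66.7),
    ("Aider - Code Editing", "coding", 57.1),
    ("HELM - GPQA", "knowledge", 53.4),
    ("MATH level 5", "math", 40.8),
    ("HELM - Omni-MATH", "math", 36.4),
    ("CadEval", "coding", 34.0),
    ("GPQA diamond", "knowledge", 27.8),
    ("WeirdML", "coding", 22.2),
    ("Balrog", "knowledge", 21.0),
    ("SimpleBench", "reasoning", 12.5),
    ("Cybench", "coding", 7.5),
    ("OTIS Mock AIME 2024-2025", "math", 6.7),
    ("The Agent Company", "agentic", 3.4),
    ("ARC-AGI-2", "reasoning", 0.8),
]


def percentage_average(rows, excluded_categories=("arena",)):
    """Mean score over rows whose category is not excluded.

    Hypothetical helper: excluding "arena" is an assumption made
    because Elo ratings are not percentages.
    """
    scores = [score for _, category, score in rows
              if category not in excluded_categories]
    return mean(scores)


avg_pct = percentage_average(ROWS)             # 19 percentage-scored rows
avg_all = mean(score for _, _, score in ROWS)  # all 20 rows, Elo included

print(f"Average without Elo: {avg_pct:.1f}")   # prints 41.3
print(f"Average with Elo:    {avg_all:.1f}")   # prints 105.4
```

Only the first figure matches the value shown on the page, which is consistent with the Elo rating being left out of the average.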