
Gemini 1.5 Pro (Feb 2024)

Provider: Google DeepMind · Release date: 2024-01-01

Average score: 41.3
Input price: N/A
Output price: N/A
Context window: N/A
Type: text

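Tested on 20 benchmarks. The 41.3% average matches the mean of the 19 percentage-scored benchmarks; the Chatbot Arena score is an Elo rating on a different scale, not a percentage. Top scores: Chatbot Arena Elo - Overall (1322.5 Elo), HELM - IFEval (83.7%), HELM - WildBench (81.3%).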

Benchmark                     Category     Score
Chatbot Arena Elo - Overall   arena        1322.5 (Elo rating)
HELM - IFEval                 language     83.7
HELM - WildBench              reasoning    81.3
BBH                           reasoning    78.7
MMLU                          knowledge    76.9
HELM - MMLU-Pro               knowledge    73.7
VideoMME                      multimodal   66.7
Aider - Code Editing          coding       57.1
HELM - GPQA                   knowledge    53.4
MATH level 5                  math         40.8
HELM - Omni-MATH              math         36.4
CadEval                       coding       34.0
GPQA diamond                  knowledge    27.8
WeirdML                       coding       22.2
Balrog                        knowledge    21.0
SimpleBench                   reasoning    12.5
Cybench                       coding       7.5
OTIS Mock AIME 2024-2025      math         6.7
The Agent Company             agentic      3.4
ARC-AGI-2                     reasoning    0.8
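
The reported 41.3 average can be checked against the table. The sketch below is only a sanity check, not the leaderboard's own method: it assumes the average is a plain unweighted mean and that the Chatbot Arena Elo rating is left out because it is not on a 0-100 scale. The function name `average_percentage_score` and the `ELO_BENCHMARKS` set are hypothetical names introduced for illustration.

```python
from statistics import mean

# Scores copied from the table above (categories omitted for brevity).
scores = {
    "Chatbot Arena Elo - Overall": 1322.5,
    "HELM - IFEval": 83.7,
    "HELM - WildBench": 81.3,
    "BBH": 78.7,
    "MMLU": 76.9,
    "HELM - MMLU-Pro": 73.7,
    "VideoMME": 66.7,
    "Aider - Code Editing": 57.1,
    "HELM - GPQA": 53.4,
    "MATH level 5": 40.8,
    "HELM - Omni-MATH": 36.4,
    "CadEval": 34.0,
    "GPQA diamond": 27.8,
    "WeirdML": 22.2,
    "Balrog": 21.0,
    "SimpleBench": 12.5,
    "Cybench": 7.5,
    "OTIS Mock AIME 2024-2025": 6.7,
    "The Agent Company": 3.4,
    "ARC-AGI-2": 0.8,
}

# Assumption: Elo ratings are on a different scale and are excluded from the average.
ELO_BENCHMARKS = {"Chatbot Arena Elo - Overall"}


def average_percentage_score(scores, excluded=ELO_BENCHMARKS):
    """Unweighted mean of all scores except the excluded benchmarks, rounded to 0.1."""
    values = [value for name, value in scores.items() if name not in excluded]
    return round(mean(values), 1)


print(average_percentage_score(scores))  # prints 41.3
```

Under these assumptions the 19 percentage-scored benchmarks reproduce the 41.3 figure, which suggests the Elo rating does not enter the headline average.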