Gemini 1.5 Flash (May 2024)

Name: Gemini 1.5 Flash (May 2024)
Author: Google DeepMind

Published: 2024-01-01

Average score: 47.4
Input price: N/A
Output price: N/A
Context window: N/A
Type: text

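Tested on 17 benchmarks. The 47.4% average matches the mean of the 16 percentage-scored benchmarks (757.6 / 16 ≈ 47.35), so the Chatbot Arena Elo rating appears to be excluded from it. Top scores: Chatbot Arena Elo — Overall (1285.1, an Elo rating rather than a percentage), HELM — IFEval (83.1%), GSM8K (82.4%).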

Benchmark Results

Benchmark	Category	Score (%)
Chatbot Arena Elo — Overall	arena	1285.1 (Elo rating, not %)
HELM — IFEval	language	83.1
GSM8K	math	82.4
HELM — WildBench	reasoning	79.2
GeoBench	knowledge	76.0
PIQA	knowledge	75.0
MMLU	knowledge	70.5
HELM — MMLU-Pro	knowledge	67.8
VideoMME	multimodal	60.4
HELM — GPQA	knowledge	43.7
HELM — Omni-MATH	math	30.5
MATH level 5	math	25.1
WeirdML	coding	24.9
GPQA diamond	knowledge	20.5
Balrog	knowledge	14.6
OTIS Mock AIME 2024-2025	math	3.8
FrontierMath-2025-02-28-Private	math	0.1
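
The reported average can be checked against the table. The following is a minimal sketch, not the leaderboard's own method: it assumes the average is a plain mean over every row except the `arena` (Elo) row, which is the only reading that matches the published 47.4. The names `ROWS` and `percent_average` are illustrative and do not come from the source page.

```python
from statistics import fmean

# (benchmark, category, score) rows copied from the table above.
# Benchmark names use an ASCII hyphen here in place of the table's dash.
ROWS = [
    ("Chatbot Arena Elo - Overall", "arena", 1285.1),
    ("HELM - IFEval", "language", 83.1),
    ("GSM8K", "math", 82.4),
    ("HELM - WildBench", "reasoning", 79.2),
    ("GeoBench", "knowledge", 76.0),
    ("PIQA", "knowledge", 75.0),
    ("MMLU", "knowledge", 70.5),
    ("HELM - MMLU-Pro", "knowledge", 67.8),
    ("VideoMME", "multimodal", 60.4),
    ("HELM - GPQA", "knowledge", 43.7),
    ("HELM - Omni-MATH", "math", 30.5),
    ("MATH level 5", "math", 25.1),
    ("WeirdML", "coding", 24.9),
    ("GPQA diamond", "knowledge", 20.5),
    ("Balrog", "knowledge", 14.6),
    ("OTIS Mock AIME 2024-2025", "math", 3.8),
    ("FrontierMath-2025-02-28-Private", "math", 0.1),
]


def percent_average(rows):
    """Mean score over all rows except arena (Elo) rows.

    Assumption: the arena row is the only one not on a 0-100 scale.
    """
    percent_scores = [score for _, category, score in rows if category != "arena"]
    return fmean(percent_scores)


average = percent_average(ROWS)
print(f"{len(ROWS)} benchmarks; mean of percentage-scored rows: {average:.2f}")
```

Under this assumption the mean comes out at about 47.35, which the page shows as 47.4. Including the Elo rating would instead pull the mean above 100, so it is unlikely to be part of the published figure.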