Live: tracking 976 AI models from 268 providers.

BenchGecko Beta


o1-mini

By OpenAI · Released 2024-01-01

Average score: 34.9

Input price: N/A

Output price: N/A

Context window: N/A

Type: text

Tested on 13 benchmarks, with a 34.9% average. Top scores: 1336.6 on Chatbot Arena Elo (Overall), which is an Elo rating rather than a percentage; 89.2% on MATH level 5; 70.7% on Aider (Code Editing).

Benchmark scores

Benchmark	Category	Score
Chatbot Arena Elo — Overall	arena	1336.6
MATH level 5	math	89.2
Aider — Code Editing	coding	70.7
Lech Mazur Writing	knowledge	64.9
GPQA diamond	knowledge	49.8
OTIS Mock AIME 2024-2025	math	46.9
WeirdML	coding	36.3
Aider polyglot	coding	32.9
ARC-AGI	reasoning	14.0
Cybench	coding	10.0
SimpleBench	reasoning	1.7
FrontierMath-2025-02-28-Private	math	1.7
ARC-AGI-2	reasoning	0.8
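
The 34.9 average matches the mean of the 12 percentage-scored benchmarks with the Arena Elo rating left out. That is an inference from the numbers in the table, not something the page states. A minimal Python sketch that checks it, using a hypothetical helper `percentage_average` and abbreviated benchmark names:

```python
from statistics import mean

# (benchmark, category, score) rows copied from the table above.
# Names are abbreviated for readability; the scores are unchanged.
SCORES = [
    ("Chatbot Arena Elo", "arena", 1336.6),
    ("MATH level 5", "math", 89.2),
    ("Aider Code Editing", "coding", 70.7),
    ("Lech Mazur Writing", "knowledge", 64.9),
    ("GPQA diamond", "knowledge", 49.8),
    ("OTIS Mock AIME 2024-2025", "math", 46.9),
    ("WeirdML", "coding", 36.3),
    ("Aider polyglot", "coding", 32.9),
    ("ARC-AGI", "reasoning", 14.0),
    ("Cybench", "coding", 10.0),
    ("SimpleBench", "reasoning", 1.7),
    ("FrontierMath-2025-02-28-Private", "math", 1.7),
    ("ARC-AGI-2", "reasoning", 0.8),
]


def percentage_average(rows):
    """Mean of all scores except Elo-style arena ratings.

    Assumption: the site excludes the "arena" category from the average,
    since an Elo rating is not on a 0-100 scale.
    """
    pct = [score for _, category, score in rows if category != "arena"]
    return mean(pct)


avg = percentage_average(SCORES)
print(f"{len(SCORES)} benchmarks, average {avg:.1f}")
# prints "13 benchmarks, average 34.9"
```

Averaging all 13 rows, Elo included, gives roughly 135.0 instead, which is why the exclusion assumption looks like the better fit.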

Similar models

WizardLM-2 8x22B

Qwen2.5 7B Instruct

OpenAI o1 timeline

$15.00/M in · 200K ctx · 14 benchmarks

o1-mini · Jan 2024

N/A · N/A ctx · 13 benchmarks

o1-preview · Jan 2024

N/A · N/A ctx · 9 benchmarks

$150.00/M in · 200K ctx