Claude 3.5 Sonnet vs gpt-oss-120b

Side by side. Every metric. Every benchmark.

Anthropic: 42.3 average score, 6/13 benchmarks
OpenAI: 46.9 average score, 7/13 benchmarks
Type | Claude 3.5 Sonnet | gpt-oss-120b
Provider | Anthropic | OpenAI
Average score | 42.3 | 46.9
Input price | - | $0.04
Output price | - | $0.19
Context window | - | 131K tokens (~66 books)
Release date | 2024-01-01 | 2025-08-05
Open source | Proprietary | Open Source

13 benchmarks · Claude 3.5 Sonnet: 6, gpt-oss-120b: 7

Benchmark | Category | Claude 3.5 Sonnet | gpt-oss-120b
Aider polyglot | coding | 51.6 | 41.8
Chatbot Arena Elo — Overall | arena | 1371.4 | 1353.8
Fortress | safety | 13.0 | 8.2
GPQA diamond | knowledge | 38.7 | 67.7
HELM — GPQA | knowledge | 56.5 | 68.4
HELM — IFEval | language | 85.6 | 83.6
HELM — MMLU-Pro | knowledge | 77.7 | 79.5
HELM — Omni-MATH | math | 27.6 | 68.8
HELM — WildBench | reasoning | 79.2 | 84.5
Lech Mazur Writing | knowledge | 80.3 | 77.3
OTIS Mock AIME 2024-2025 | math | 6.4 | 88.9
SimpleBench | reasoning | 13.0 | 6.5
WeirdML | coding | 31.0 | 48.2
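
The 6 to 7 split reported above can be reproduced from the benchmark rows. The following is a minimal sketch in Python, assuming that a higher score wins on every benchmark (the page does not state the scoring direction, so this is an assumption). The names `SCORES` and `tally_wins` are hypothetical and exist only for illustration.

```python
# Scores copied from the benchmark rows: (Claude 3.5 Sonnet, gpt-oss-120b).
SCORES = {
    "Aider polyglot": (51.6, 41.8),
    "Chatbot Arena Elo — Overall": (1371.4, 1353.8),
    "Fortress": (13.0, 8.2),
    "GPQA diamond": (38.7, 67.7),
    "HELM — GPQA": (56.5, 68.4),
    "HELM — IFEval": (85.6, 83.6),
    "HELM — MMLU-Pro": (77.7, 79.5),
    "HELM — Omni-MATH": (27.6, 68.8),
    "HELM — WildBench": (79.2, 84.5),
    "Lech Mazur Writing": (80.3, 77.3),
    "OTIS Mock AIME 2024-2025": (6.4, 88.9),
    "SimpleBench": (13.0, 6.5),
    "WeirdML": (31.0, 48.2),
}


def tally_wins(scores):
    """Count benchmarks won by each model, assuming higher is better (assumption)."""
    first = sum(1 for a, b in scores.values() if a > b)
    second = sum(1 for a, b in scores.values() if b > a)
    return first, second


claude_wins, gpt_oss_wins = tally_wins(SCORES)
print(f"Claude 3.5 Sonnet: {claude_wins}, gpt-oss-120b: {gpt_oss_wins}")
```

Under that assumption the tally matches the page's own count of 6 and 7. The average scores (42.3 and 46.9) are not reproduced here because the page does not say how they are computed.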