Claude 3.5 Sonnet vs gpt-oss-120b
Side by side. Every metric. Every benchmark.
| Attribute | Claude 3.5 Sonnet | gpt-oss-120b |
|---|---|---|
| Provider | - | - |
| Average score | 42.3 | 46.9 |
| Input price | - | $0.04 |
| Output price | - | $0.19 |
| Context window | - | 131K tokens (~66 books) |
| Released | 2024-01-01 | 2025-08-05 |
| Licensing | Proprietary | Open Source |
Benchmark scores
13 benchmarks · wins: Claude 3.5 Sonnet 6, gpt-oss-120b 7
| Benchmark | Category | Claude 3.5 Sonnet | gpt-oss-120b |
|---|---|---|---|
| Aider polyglot | coding | 51.6 | 41.8 |
| Chatbot Arena Elo — Overall | arena | 1371.4 | 1353.8 |
| Fortress | safety | 13.0 | 8.2 |
| GPQA diamond | knowledge | 38.7 | 67.7 |
| HELM — GPQA | knowledge | 56.5 | 68.4 |
| HELM — IFEval | language | 85.6 | 83.6 |
| HELM — MMLU-Pro | knowledge | 77.7 | 79.5 |
| HELM — Omni-MATH | math | 27.6 | 68.8 |
| HELM — WildBench | reasoning | 79.2 | 84.5 |
| Lech Mazur Writing | knowledge | 80.3 | 77.3 |
| OTIS Mock AIME 2024-2025 | math | 6.4 | 88.9 |
| SimpleBench | reasoning | 13.0 | 6.5 |
| WeirdML | coding | 31.0 | 48.2 |
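The 6-to-7 split in the summary line can be reproduced from the table by counting, for each benchmark, which model has the higher score. The sketch below does that in Python. It assumes that a higher number is better on every benchmark, which the page does not state explicitly, and the names `SCORES` and `tally_wins` are illustrative, not part of the source.

```python
# Scores copied from the benchmark table above,
# stored as (Claude 3.5 Sonnet, gpt-oss-120b).
SCORES = {
    "Aider polyglot": (51.6, 41.8),
    "Chatbot Arena Elo — Overall": (1371.4, 1353.8),
    "Fortress": (13.0, 8.2),
    "GPQA diamond": (38.7, 67.7),
    "HELM — GPQA": (56.5, 68.4),
    "HELM — IFEval": (85.6, 83.6),
    "HELM — MMLU-Pro": (77.7, 79.5),
    "HELM — Omni-MATH": (27.6, 68.8),
    "HELM — WildBench": (79.2, 84.5),
    "Lech Mazur Writing": (80.3, 77.3),
    "OTIS Mock AIME 2024-2025": (6.4, 88.9),
    "SimpleBench": (13.0, 6.5),
    "WeirdML": (31.0, 48.2),
}


def tally_wins(scores):
    """Count the benchmarks on which each model has the strictly higher score.

    Assumption: higher is better for every benchmark. Ties count for neither.
    """
    claude = sum(1 for a, b in scores.values() if a > b)
    gpt_oss = sum(1 for a, b in scores.values() if b > a)
    return claude, gpt_oss


claude_wins, gpt_oss_wins = tally_wins(SCORES)
print(f"{len(SCORES)} benchmarks · Claude 3.5 Sonnet: {claude_wins}, "
      f"gpt-oss-120b: {gpt_oss_wins}")
```

Under that assumption the count matches the summary line. If any benchmark is scored so that lower is better, its comparison would need to be flipped and the tally could change.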