Qwen3 4B Thinking 2507
Open source by Alibaba · Published 2025-08-05
Average score: 60.6
Input price: N/A
Output price: N/A
Context window: N/A
Type: text-generation
Tested on 6 benchmarks, with an average score of 60.6%. Top scores: OpenCompass — IFEval (88.5%), OpenCompass — AIME2025 (80.0%), OpenCompass — MMLU-Pro (72.8%).
Benchmark scores
| Benchmark | Category | Score (%) |
|---|---|---|
| OpenCompass — IFEval | language | 88.5 |
| OpenCompass — AIME2025 | math | 80.0 |
| OpenCompass — MMLU-Pro | knowledge | 72.8 |
| OpenCompass — GPQA-Diamond | knowledge | 64.7 |
| OpenCompass — LiveCodeBenchV6 | coding | 51.6 |
| OpenCompass — HLE | knowledge | 6.0 |
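The reported 60.6 average can be checked against the table. The sketch below assumes the average is a plain unweighted mean of the six scores, which the page does not state explicitly but which matches the published figure. The dictionary keys are shortened benchmark names chosen here for illustration.

```python
from statistics import mean

# Scores copied from the benchmark table (percent).
# Keys are shortened names; the "OpenCompass" prefix is dropped for brevity.
scores = {
    "IFEval": 88.5,
    "AIME2025": 80.0,
    "MMLU-Pro": 72.8,
    "GPQA-Diamond": 64.7,
    "LiveCodeBenchV6": 51.6,
    "HLE": 6.0,
}

# Assumption: the page's average is an unweighted arithmetic mean.
average = round(mean(list(scores.values())), 1)

# The three highest-scoring benchmarks, best first.
top_three = sorted(scores, key=scores.get, reverse=True)[:3]

print(average)    # 60.6
print(top_three)  # ['IFEval', 'AIME2025', 'MMLU-Pro']
```

Note that the single low HLE score (6.0) pulls the mean down considerably: the mean of the other five scores is about 71.5.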
Similar models
Google DeepMind: 60.6
Alibaba Qwen: 60.7
Google DeepMind: 60.5
OpenAI: 60.4