Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for...
Tested on 8 benchmarks with 58.2% average. Top scores: Chatbot Arena Elo — Overall (1347.0%), OpenCompass — IFEval (86.0%), OpenCompass — MMLU-Pro (78.0%).
OpenCompass Live Code Bench v6. Fresh competitive programming problems to evaluate code generation without memorization.
Multi-language code editing from Aider. Tests editing ability across Python, JavaScript, TypeScript, Java, C++, Go, Rust, and more.
OpenCompass evaluation on AIME 2025 problems. Tests mathematical reasoning on fresh competition problems.
OpenCompass MMLU-Pro evaluation. Harder knowledge test with more answer choices.
OpenCompass evaluation of GPQA Diamond. PhD-level science questions from the hardest subset.
OpenCompass evaluation of Humanitys Last Exam. Expert-level cross-discipline knowledge test.
- Typetext
- Context41K tokens (~20 books)
- ReleasedApr 2025
- LicenseOpen Source
- StatusActive
- Cost / Message~$0.000