Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following,...
Tested on 20 benchmarks, with an average score of 48.5%. Top scores: Chatbot Arena Elo (Overall): 1422.6 (an Elo rating, not a percentage); OpenCompass IFEval: 88.3%; OpenCompass MMLU-Pro: 79.2%.
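The page does not say how the 48.5% average is computed. Because the Chatbot Arena figure is an Elo rating rather than a percentage, one reasonable approach is to average scores only within the same unit. The sketch below is illustrative, not the site's method: it uses only the three top scores quoted above, and the `unit` field and the `mean_by_unit` helper are assumptions introduced for this example.

```python
from statistics import mean

# Only the three top scores quoted above; the other 17 results are not listed here.
# The "unit" labels are an assumption made for this sketch.
scores = [
    {"benchmark": "Chatbot Arena Elo (Overall)", "value": 1422.6, "unit": "elo"},
    {"benchmark": "OpenCompass IFEval", "value": 88.3, "unit": "percent"},
    {"benchmark": "OpenCompass MMLU-Pro", "value": 79.2, "unit": "percent"},
]


def mean_by_unit(rows):
    """Group scores by unit and average each group separately."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["unit"], []).append(row["value"])
    return {unit: mean(values) for unit, values in grouped.items()}


averages = mean_by_unit(scores)
for unit, value in averages.items():
    print(f"{unit}: {value:.2f}")
```

Keeping ratings and percentages in separate groups avoids a single headline number that is dominated by whichever scale has the largest values.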
Regularly refreshed coding problems that avoid data contamination. New problems added monthly to prevent memorization.
Multi-language code editing from Aider. Tests editing ability across Python, JavaScript, TypeScript, Java, C++, Go, Rust, and more.
OpenCompass evaluation on LiveCodeBench v6. Fresh competitive programming problems that evaluate code generation without relying on memorization.
Regularly refreshed reasoning problems testing logical deduction, spatial reasoning, and analytical thinking.
Fresh data analysis tasks testing ability to interpret tables, charts, and statistical data.
Abstraction and Reasoning Corpus (ARC). Tests fluid intelligence through novel visual pattern-recognition puzzles. Designed as a measure of general intelligence.
OpenCompass evaluation on AIME 2025 problems. Tests mathematical reasoning on fresh competition problems.
Regularly updated math problems that test numerical reasoning, algebra, calculus, and combinatorics.
- Type: text
- Context: 262K tokens (~131 books)
- Released: Jul 2025
- License: Open Source
- Status: Active
- Cost / Message: ~$0.000