Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for complex reasoning tasks. It activates 22B of its 235B parameters per forward pass and natively supports up to 262,144...
Tested on 24 benchmarks with 55.9% average. Top scores: Chatbot Arena Elo — Overall (1399.8%), OpenCompass — AIME2025 (90.9%), OpenCompass — IFEval (87.8%).
OpenCompass Live Code Bench v6. Fresh competitive programming problems to evaluate code generation without memorization.
Regularly refreshed coding problems that avoid data contamination. New problems added monthly to prevent memorization.
Unusual and adversarial machine learning challenges. Tests robustness of reasoning about edge cases in ML systems.
Regularly refreshed reasoning problems testing logical deduction, spatial reasoning, and analytical thinking.
Fresh data analysis tasks testing ability to interpret tables, charts, and statistical data.
OpenCompass evaluation on AIME 2025 problems. Tests mathematical reasoning on fresh competition problems.
Mock AIME (American Invitational Mathematics Exam) problems from OTIS. Tests mathematical competition performance.
Regularly updated math problems that test numerical reasoning, algebra, calculus, and combinatorics.
- Typetext
- Context131K tokens (~66 books)
- ReleasedJul 2025
- LicenseOpen Source
- StatusActive
- Cost / Message~$0.002