
Qwen3 235B A22B Thinking 2507

Open source

By Alibaba Qwen · Released 2025-07-25

Average score: 55.9
Input price: $0.13 per 1M tokens
Output price: $0.60 per 1M tokens
Context window: 262K tokens (~131 books)
Type: text
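
The listed prices can be turned into a rough per-request cost. The sketch below is a minimal illustration that assumes billing is strictly linear in token count, with no tiering, caching discounts, or minimum charges, and that any reasoning tokens the model emits are billed as output tokens; none of this is confirmed by the page. The helper name `estimate_cost_usd` is hypothetical.

```python
# Prices as listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.13
OUTPUT_PRICE_PER_M = 0.60


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: linear cost estimate from the listed prices.

    Assumes no tiered pricing or discounts, which the page does not specify.
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 200K-token prompt with a 10K-token response.
    print(f"${estimate_cost_usd(200_000, 10_000):.4f}")
```

Under these assumptions, output tokens cost roughly 4.6 times as much as input tokens, so long responses dominate the bill even when the prompt is large.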

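Tested on 24 benchmarks, with an average score of 55.9%. Top scores: Chatbot Arena Elo — Overall (1399.8, an Elo rating rather than a percentage), OpenCompass — AIME2025 (90.9%), OpenCompass — IFEval (87.8%). The 55.9% average matches the mean of the 23 percentage-scored benchmarks, so the Elo rating, which is on a different scale, appears to be excluded from it.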

| Benchmark | Category | Score |
|---|---|---|
| Chatbot Arena Elo — Overall | arena | 1399.8 (Elo) |
| OpenCompass — AIME2025 | math | 90.9 |
| OpenCompass — IFEval | language | 87.8 |
| OTIS Mock AIME 2024-2025 | math | 86.7 |
| Lech Mazur Writing | knowledge | 85.0 |
| OpenCompass — MMLU-Pro | knowledge | 83.5 |
| OpenCompass — GPQA-Diamond | knowledge | 79.8 |
| Fiction.LiveBench | knowledge | 75.0 |
| GPQA diamond | knowledge | 73.4 |
| LiveBench — Mathematics | math | 73.4 |
| OpenCompass — LiveCodeBenchV6 | coding | 70.6 |
| LiveBench — Language | language | 69.5 |
| LiveBench — Coding | coding | 69.0 |
| LiveBench — Reasoning | reasoning | 59.4 |
| LiveBench — Overall | knowledge | 53.0 |
| LiveBench — Data Analysis | reasoning | 52.2 |
| SimpleQA Verified | knowledge | 50.1 |
| WeirdML | coding | 41.0 |
| LiveBench — If | language | 40.6 |
| OpenCompass — HLE | knowledge | 18.5 |
| Chess Puzzles | knowledge | 12.0 |
| FrontierMath-2025-02-28-Private | math | 8.5 |
| LiveBench — Agentic Coding | coding | 6.7 |
| FrontierMath-Tier-4-2025-07-01-Private | math | 0.1 |
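
The headline average can be checked against the scores listed above. The sketch below is one way to do that, assuming the average is a plain unweighted mean; the page does not state how it is computed. It calculates the mean both with and without the Chatbot Arena Elo entry, since an Elo rating is not on a 0 to 100 scale.

```python
from statistics import mean

# Scores copied from the benchmark list on this page.
scores = {
    "Chatbot Arena Elo — Overall": 1399.8,  # Elo rating, not a percentage
    "OpenCompass — AIME2025": 90.9,
    "OpenCompass — IFEval": 87.8,
    "OTIS Mock AIME 2024-2025": 86.7,
    "Lech Mazur Writing": 85.0,
    "OpenCompass — MMLU-Pro": 83.5,
    "OpenCompass — GPQA-Diamond": 79.8,
    "Fiction.LiveBench": 75.0,
    "GPQA diamond": 73.4,
    "LiveBench — Mathematics": 73.4,
    "OpenCompass — LiveCodeBenchV6": 70.6,
    "LiveBench — Language": 69.5,
    "LiveBench — Coding": 69.0,
    "LiveBench — Reasoning": 59.4,
    "LiveBench — Overall": 53.0,
    "LiveBench — Data Analysis": 52.2,
    "SimpleQA Verified": 50.1,
    "WeirdML": 41.0,
    "LiveBench — If": 40.6,
    "OpenCompass — HLE": 18.5,
    "Chess Puzzles": 12.0,
    "FrontierMath-2025-02-28-Private": 8.5,
    "LiveBench — Agentic Coding": 6.7,
    "FrontierMath-Tier-4-2025-07-01-Private": 0.1,
}

ELO_KEY = "Chatbot Arena Elo — Overall"

# Unweighted mean over every listed benchmark, Elo included.
avg_with_elo = round(mean(list(scores.values())), 1)

# Unweighted mean over the 23 benchmarks that are on a 0-100 scale.
avg_without_elo = round(
    mean([value for name, value in scores.items() if name != ELO_KEY]), 1
)

if __name__ == "__main__":
    print(f"with Elo:    {avg_with_elo}")
    print(f"without Elo: {avg_without_elo}")
```

Only the mean without the Elo entry reproduces the reported 55.9, which suggests the arena rating is counted among the 24 benchmarks but left out of the average.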