Qwen3 235B A22B Thinking 2507
Open source · Developer: Alibaba Qwen · Release date: 2025-07-25
Average score: 55.9
Input price: $0.13/1M tokens
Output price: $0.60/1M tokens
Context window: 262K tokens (~131 books)
Type: text
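The listed prices convert to a per-request cost by simple proportion. The following is a minimal sketch, assuming billing is strictly linear per token with no caching discounts, minimum charges, or provider-specific surcharges. The function name `request_cost_usd` and the sample token counts are illustrative and do not come from the source.

```python
# Listed prices in USD per one million tokens.
INPUT_USD_PER_MILLION = 0.13
OUTPUT_USD_PER_MILLION = 0.60


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming linear per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Hypothetical request: 100,000 prompt tokens and 20,000 generated tokens.
example_cost = request_cost_usd(100_000, 20_000)
print(f"${example_cost:.3f}")
```

Because this is a thinking model, the generated reasoning tokens are likely to count toward output, so the output term may dominate in practice; that depends on the provider and is not stated in the source.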
Tested on 24 benchmarks, with an average score of 55.9. Top scores: Chatbot Arena Elo — Overall (1399.8, an Elo rating rather than a percentage), OpenCompass — AIME2025 (90.9%), OpenCompass — IFEval (87.8%).
Benchmark scores
| Benchmark | Category | Score |
|---|---|---|
| Chatbot Arena Elo — Overall | arena | 1399.8 (Elo rating) |
| OpenCompass — AIME2025 | math | 90.9 |
| OpenCompass — IFEval | language | 87.8 |
| OTIS Mock AIME 2024-2025 | math | 86.7 |
| Lech Mazur Writing | knowledge | 85.0 |
| OpenCompass — MMLU-Pro | knowledge | 83.5 |
| OpenCompass — GPQA-Diamond | knowledge | 79.8 |
| Fiction.LiveBench | knowledge | 75.0 |
| GPQA diamond | knowledge | 73.4 |
| LiveBench — Mathematics | math | 73.4 |
| OpenCompass — LiveCodeBenchV6 | coding | 70.6 |
| LiveBench — Language | language | 69.5 |
| LiveBench — Coding | coding | 69.0 |
| LiveBench — Reasoning | reasoning | 59.4 |
| LiveBench — Overall | knowledge | 53.0 |
| LiveBench — Data Analysis | reasoning | 52.2 |
| SimpleQA Verified | knowledge | 50.1 |
| WeirdML | coding | 41.0 |
| LiveBench — IF | language | 40.6 |
| OpenCompass — HLE | knowledge | 18.5 |
| Chess Puzzles | knowledge | 12.0 |
| FrontierMath-2025-02-28-Private | math | 8.5 |
| LiveBench — Agentic Coding | coding | 6.7 |
| FrontierMath-Tier-4-2025-07-01-Private | math | 0.1 |
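The reported 55.9 average can be checked against the table. A minimal sketch, with scores copied from the rows in table order: the figure appears to match the mean of the 23 percentage-scale scores, with the Chatbot Arena Elo rating left out. This is an observation from the numbers, not a documented aggregation method.

```python
from statistics import mean

# Chatbot Arena Elo is a rating, not a percentage, so it is kept separate.
ELO_SCORE = 1399.8

# The remaining 23 scores, copied from the table in row order.
PERCENT_SCORES = [
    90.9, 87.8, 86.7, 85.0, 83.5, 79.8, 75.0, 73.4, 73.4, 70.6, 69.5, 69.0,
    59.4, 53.0, 52.2, 50.1, 41.0, 40.6, 18.5, 12.0, 8.5, 6.7, 0.1,
]

avg_without_elo = mean(PERCENT_SCORES)
avg_with_elo = mean(PERCENT_SCORES + [ELO_SCORE])

print(round(avg_without_elo, 1))  # 55.9, matching the reported average
print(round(avg_with_elo, 1))     # 111.9, which does not match
```

Including the Elo rating roughly doubles the mean, so mixing the two scales in one average would not be meaningful.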
Similar models
Mistral AI: 55.8
OpenAI: 56.0
DeepSeek: 56.0
OpenAI: 56.2