Qwen3 235B A22B Instruct 2507
Open source · Developer: Alibaba Qwen · Release date: 2025-07-21
Average score: 48.5
Input price: $0.07/1M tokens
Output price: $0.10/1M tokens
Context window: 262K tokens (~131 books)
Type: text
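
As a rough illustration of what the per-million-token prices listed above mean for a single request, the sketch below multiplies token counts by the listed rates. The helper name `estimate_cost_usd` and the example token counts are hypothetical, and actual billing depends on the provider serving the model.

```python
INPUT_PRICE_PER_M = 0.07   # USD per 1M input tokens (from the listing)
OUTPUT_PRICE_PER_M = 0.10  # USD per 1M output tokens (from the listing)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed rates (hypothetical helper)."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# Example (assumed sizes): a 100,000-token prompt with a 2,000-token reply.
cost = estimate_cost_usd(100_000, 2_000)
print(f"${cost:.4f}")
```

At these rates the input side dominates the cost for long prompts, since a prompt that fills much of the 262K-token window is far larger than a typical reply.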
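Tested on 20 benchmarks. The 48.5 average matches the mean of the 19 percentage-scored benchmarks, with the Chatbot Arena Elo rating left out. Top scores: Chatbot Arena Elo — Overall (1422.6, an Elo rating rather than a percentage), OpenCompass — IFEval (88.3%), OpenCompass — MMLU-Pro (79.2%).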
Benchmark scores
| Benchmark | Category | Score |
|---|---|---|
| Chatbot Arena Elo — Overall | arena | 1422.6 |
| OpenCompass — IFEval | language | 88.3 |
| OpenCompass — MMLU-Pro | knowledge | 79.2 |
| OpenCompass — GPQA-Diamond | knowledge | 75.5 |
| LiveBench — Coding | coding | 69.6 |
| OpenCompass — AIME2025 | math | 69.5 |
| LiveBench — Mathematics | math | 68.0 |
| LiveBench — Language | language | 66.1 |
| Aider polyglot | coding | 59.6 |
| LiveBench — Reasoning | reasoning | 58.4 |
| Fiction.LiveBench | knowledge | 52.9 |
| LiveBench — Overall | knowledge | 48.8 |
| LiveBench — Data Analysis | reasoning | 44.7 |
| OpenCompass — LiveCodeBenchV6 | coding | 43.0 |
| WeirdML | coding | 38.7 |
| LiveBench — If | language | 21.7 |
| LiveBench — Agentic Coding | coding | 13.3 |
| OpenCompass — HLE | knowledge | 12.3 |
| ARC-AGI | reasoning | 11.0 |
| ARC-AGI-2 | reasoning | 1.3 |
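
The 48.5 average can be checked against the table. The sketch below takes the mean of the 19 percentage-scored rows and leaves out the Chatbot Arena Elo rating, which is on a different scale. That the site computes its average this way is an assumption; the arithmetic only shows that the listed numbers are consistent with it.

```python
from statistics import mean

# Scores copied from the table in order, excluding Chatbot Arena Elo (1422.6).
percent_scores = [
    88.3, 79.2, 75.5, 69.6, 69.5, 68.0, 66.1, 59.6, 58.4, 52.9,
    48.8, 44.7, 43.0, 38.7, 21.7, 13.3, 12.3, 11.0, 1.3,
]

average = mean(percent_scores)
print(round(average, 1))  # 48.5
```

Including the Elo value in the mean would give a figure above 100, which is why mixing the two scales in one average would not be meaningful.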
Similar models
Qwen2.5 72B Instruct Abliterated · HuiHui AI · 48.1
[model name missing] · Google DeepMind · 48.0
[model name missing] · Google DeepMind · 49.1
Stable Beluga 2 · Unknown · 47.8