How much does Qwen 3.5 Plus (hosted 397B-A17B) cost?

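Hosted pricing for Qwen 3.5 Plus (hosted 397B-A17B) is not yet listed: input and output prices per 1M tokens are both TBD. The model is open source and can be self-hosted.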

What benchmarks has Qwen 3.5 Plus (hosted 397B-A17B) been tested on?

Qwen 3.5 Plus (hosted 397B-A17B) has been evaluated on 7 benchmarks. Top scores: OTIS Mock AIME 2024-2025: 85.0, GPQA diamond: 78.9, SimpleQA Verified: 26.0.

Is Qwen 3.5 Plus (hosted 397B-A17B) open source?

Yes, Qwen 3.5 Plus (hosted 397B-A17B) is open source.

How does Qwen 3.5 Plus (hosted 397B-A17B) compare to R1 Distill Qwen 32B?

Qwen 3.5 Plus (hosted 397B-A17B) has an average score of 42.0, while R1 Distill Qwen 32B scores 42.2. R1 Distill Qwen 32B slightly outperforms Qwen 3.5 Plus (hosted 397B-A17B) overall.

Qwen 3.5 Plus (hosted 397B-A17B)

Name: Qwen 3.5 Plus (hosted 397B-A17B)
Author: Alibaba Qwen
Released: Jan 2024

Average score: 42.0
Rank: #165 (better than 40% of all models)
Context: N/A
Input $/1M: TBD
Output $/1M: TBD
Type: text
License: Open Source
Benchmarks: 7 tested

Data updated today

About

Tested on 7 benchmarks, with a simple (unweighted) mean of 34.8% across them. Top scores: OTIS Mock AIME 2024-2025 (85.0%), GPQA diamond (78.9%), SimpleQA Verified (26.0%).

Capabilities

math: 36.0 (#139 globally)
knowledge: 40.6 (#179 globally)
agentic: 13.6 (#33 globally)

Benchmark Scores

Tested on 7 benchmarks · Ranked across 3 categories

[Chart: score distribution across all 274 models on a 0 to 100 axis, with this model's position marked]

math

OTIS Mock AIME 2024-2025

Mock AIME (American Invitational Mathematics Examination) problems from OTIS. Tests mathematical competition performance.

85.0

FrontierMath-2025-02-28-Private

Original research-level math problems created by professional mathematicians. Problems are unpublished and cannot be memorized.

21.0

FrontierMath-Tier-4-2025-07-01-Private

Hardest tier of FrontierMath. Problems at the frontier of human mathematical ability, many unsolved by most mathematicians.

2.1

knowledge

GPQA diamond

Graduate-level science questions written by domain experts. The Diamond subset contains the questions that expert validators answered correctly and most non-expert validators answered incorrectly, testing deep understanding.

78.9

SimpleQA Verified

Simple factual questions with verified correct answers. Tests accuracy of basic knowledge retrieval. Low scores can indicate a tendency to hallucinate.

26.0

Chess Puzzles

Tactical chess puzzles testing pattern recognition and multi-move calculation. Measures strategic reasoning ability.

17.0

agentic

APEX-Agents

Agent performance evaluation testing multi-step tool use, planning, and execution in realistic environments.

13.6
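The aggregate figures on this page can be checked against the per-benchmark scores listed above. The sketch below assumes each category score is a plain unweighted mean of its benchmarks, which the page does not state explicitly; under that assumption it reproduces the category scores (36.0, 40.6, 13.6) and the 34.8% average from the About section. It does not reproduce the 42.0 headline average, whose calculation is not described on this page.

```python
# Per-benchmark scores copied from this page, grouped by category.
scores = {
    "math": {
        "OTIS Mock AIME 2024-2025": 85.0,
        "FrontierMath-2025-02-28-Private": 21.0,
        "FrontierMath-Tier-4-2025-07-01-Private": 2.1,
    },
    "knowledge": {
        "GPQA diamond": 78.9,
        "SimpleQA Verified": 26.0,
        "Chess Puzzles": 17.0,
    },
    "agentic": {
        "APEX-Agents": 13.6,
    },
}


def mean(values):
    """Unweighted arithmetic mean (an assumption about the site's method)."""
    values = list(values)
    return sum(values) / len(values)


# Mean per category, rounded to one decimal as displayed on the page.
category_avg = {cat: round(mean(b.values()), 1) for cat, b in scores.items()}

# Mean over all 7 benchmarks, ignoring category boundaries.
overall_avg = round(mean(v for b in scores.values() for v in b.values()), 1)

print(category_avg)
print(overall_avg)
```

Because the simple mean lands on 34.8 rather than 42.0, the headline average presumably uses some other weighting or normalization.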

Score legend: Excellent (85+), Good (70-85), Average (50-70), Below (<50)
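As a rough illustration, the legend can be applied to this model's scores as follows. The legend's bands share their boundary values (for example, 85 appears in both "Good" and "Excellent"), so this sketch assumes each lower bound is inclusive; the `band` helper is hypothetical and not part of any site API.

```python
# Lower bounds taken from the legend; treating them as inclusive is an assumption.
BANDS = [(85.0, "Excellent"), (70.0, "Good"), (50.0, "Average")]


def band(score: float) -> str:
    """Return the legend label for a 0-100 benchmark score."""
    for lower, label in BANDS:
        if score >= lower:
            return label
    return "Below"


# Scores copied from the benchmark list on this page.
scores = {
    "OTIS Mock AIME 2024-2025": 85.0,
    "GPQA diamond": 78.9,
    "SimpleQA Verified": 26.0,
    "FrontierMath-2025-02-28-Private": 21.0,
    "Chess Puzzles": 17.0,
    "APEX-Agents": 13.6,
    "FrontierMath-Tier-4-2025-07-01-Private": 2.1,
}

for name, score in scores.items():
    print(f"{name}: {score} -> {band(score)}")
```

Under this reading, only the OTIS Mock AIME score reaches "Excellent" and GPQA diamond is "Good"; the remaining five benchmarks fall in "Below".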

Frequently Asked Questions

Qwen 3.5 Plus (hosted 397B-A17B) is an open-source text AI model by Alibaba Qwen, released in January 2024. It has an average benchmark score of 42.0.
