DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism...
Tested on 29 benchmarks, with an average score of 53.0%. Top scores: Chatbot Arena Elo, Overall (1424.4); Chatbot Arena Elo, Coding (1326.9); OpenCompass AIME2025 (93.0%).
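Elo ratings are points on a relative scale rather than percentages, so they are easiest to read as a gap between two models. The sketch below assumes the conventional logistic Elo formula with a 400-point scale; the leaderboard's exact fitting procedure may differ. The opponent rating is hypothetical and chosen only for illustration.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B (a win counts 1, a draw 0.5)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Overall rating quoted above vs. a hypothetical opponent rated 100 points lower.
model_rating = 1424.4
hypothetical_opponent = model_rating - 100.0  # assumption, not a real model

print(f"{elo_expected_score(model_rating, hypothetical_opponent):.3f}")  # prints 0.640
```

Under this formula a 100-point lead corresponds to an expected score of about 0.64 in head-to-head comparisons, and equal ratings give 0.5.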
Regularly refreshed coding problems that avoid data contamination. New problems added monthly to prevent memorization.
OpenCompass LiveCodeBench v6. Fresh competitive programming problems to evaluate code generation without memorization.
Multi-language code editing from Aider. Tests editing ability across Python, JavaScript, TypeScript, Java, C++, Go, Rust, and more.
Abstraction and Reasoning Corpus. Tests fluid intelligence through novel visual pattern recognition puzzles. Intended as a measure of general fluid intelligence.
Fresh data analysis tasks testing ability to interpret tables, charts, and statistical data.
Regularly refreshed reasoning problems testing logical deduction, spatial reasoning, and analytical thinking.
OpenCompass evaluation on AIME 2025 problems. Tests mathematical reasoning on fresh competition problems.
Mock AIME (American Invitational Mathematics Examination) problems from OTIS. Tests mathematical competition performance.
Regularly updated math problems that test numerical reasoning, algebra, calculus, and combinatorics.
- Type: text
- Context: 131K tokens (~66 books)
- Released: Dec 2025
- License: Open Source
- Status: Active
- Cost / Message: ~$0.001