How much does Kimi K2.5 cost?

Kimi K2.5 costs $0.44 per million input tokens and $2.00 per million output tokens. For a typical conversation (~2,000 tokens), that's approximately $0.003 per message.

What benchmarks has Kimi K2.5 been tested on?

Kimi K2.5 has been evaluated on 27 benchmarks. Top scores: OpenCompass — IFEval: 93.9, OTIS Mock AIME 2024-2025: 92.2, OpenCompass — AIME2025: 91.9.

How does Kimi K2.5 compare to Qwen3 235B A22B Thinking 2507?

Kimi K2.5 has an average score of 59.6 while Qwen3 235B A22B Thinking 2507 scores 59.4. Kimi K2.5 outperforms Qwen3 235B A22B Thinking 2507 overall. Kimi K2.5 costs $0.44/1M input vs Qwen3 235B A22B Thinking 2507 at $0.15/1M input. See full comparison →

Home/Models/Kimi K2.5

Kimi K2.5

Name: Kimi K2.5
Price: 0.44 USD
Author: moonshotai

by moonshotai · Released Jan 2026

Open SourceMultimodal

59.6

avg score

Rank #68

Compare

Better than 71% of all models

Context

262K tokens (~131 books)

Input $/1M

$0.44

Output $/1M

$2.00

Type

multimodal

License

Open Source

Benchmarks

27 tested

Data updated today

About

Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kimi K2 with continued pretraining over approximately 15T mixed...

Tested on 27 benchmarks with 52.0% average. Top scores: OpenCompass — IFEval (93.9%), OTIS Mock AIME 2024-2025 (92.2%), OpenCompass — AIME2025 (91.9%).

Looking for similar performance at lower cost?
Qwen3 235B A22B Thinking 2507 scores 59.4 (100% as good) at $0.15/1M input · 66% cheaper

Capabilities

coding

60.8

#36 globally

reasoning

37.8

#70 globally

math

54.0

#65 globally

knowledge

50.4

#98 globally

agentic

38.9

#9 globally

speed

80.7

#13 globally

language

93.9

#3 globally

Benchmark Scores

Compare All

Tested on 27 benchmarks · Ranked across 7 categories

Score Distribution (all 233 models)

0255075100

▲ You are here

codingCompare coding →

OpenCompass — LiveCodeBenchV6

OpenCompass Live Code Bench v6. Fresh competitive programming problems to evaluate code generation without memorization.

80.6—

SWE-Bench verified

Real-world software engineering tasks from GitHub issues. Models must diagnose bugs and write patches that pass test suites. Human-verified subset of SWE-bench.

73.8—

WeirdML

Unusual and adversarial machine learning challenges. Tests robustness of reasoning about edge cases in ML systems.

45.6—

reasoningCompare reasoning →

ARC-AGI

Abstraction and Reasoning Corpus. Tests fluid intelligence through novel visual pattern recognition puzzles. Core measure of general intelligence.

65.3—

SimpleBench

Deceptively simple questions that humans find easy but AI models often get wrong. Tests common sense and reasoning gaps.

36.2—

ARC-AGI-2

ARC-AGI 2, harder sequel to ARC. More complex abstract reasoning patterns that test generalization ability beyond training data.

11.8—

mathCompare math →

OTIS Mock AIME 2024-2025

Mock AIME (American Invitational Mathematics Exam) problems from OTIS. Tests mathematical competition performance.

92.2—

OpenCompass — AIME2025

OpenCompass evaluation on AIME 2025 problems. Tests mathematical reasoning on fresh competition problems.

91.9—

FrontierMath-2025-02-28-Private

Original research-level math problems created by professional mathematicians. Problems are unpublished and cannot be memorized.

27.9—

Quick compare:

vs Qwen3 235B A22B Thinking 2507

vs o1

vs Gemini 3 Flash Preview

Excellent (85+) Good (70-85) Average (50-70) Below (<50)

Similar Models

Qwen3 235B A22B Thinking 2507

Gemini 3 Flash Preview

Google DeepMind

59.1$0.50/1M

Links

Info

moonshotai Pricing explorer Developers · API

Research

Documentation

Community

Source Code

BenchGecko API

kimi-k2-5

Specifications

Typemultimodal
Context262K tokens (~131 books)
ReleasedJan 2026
LicenseOpen Source
StatusActive
Cost / Message~$0.003

Available On

moonshotai$0.44

Frequently Asked Questions

Kimi K2.5 is an open-source multimodal AI model by moonshotai, released in January 2026. It has an average benchmark score of 59.6. Context window: 262K tokens.

Benchmarks

OpenCompass — IFEval OTIS Mock AIME 2024-2025 OpenCompass — AIME2025 OpenCompass — GPQA-Diamond OpenCompass — MMLU-Pro

moonshotai · Provider moonshotai · Economy All Models Compare Models Pricing Developers · API Context Window · Glossary

Kimi K2.5

Frequently Asked Questions

Related Models

Benchmarks

Related Pages