
Kimi K2.5 vs GLM 4.7

Side by side · benchmarks, pricing, and signals you can act on.

Winner summary

Kimi K2.5 wins 13 of 16 shared benchmarks. Leads in agentic · knowledge · math.

Category leads
agentic · Kimi K2.5
knowledge · Kimi K2.5
math · Kimi K2.5
language · Kimi K2.5
coding · GLM 4.7
reasoning · GLM 4.7
Hype vs Reality
Kimi K2.5 · #85 by perf · no signal · QUIET
GLM 4.7 · #91 by perf · no signal · QUIET
Best value
~1.05x better value than GLM 4.7 (49.5 vs 47.2 pts/$)
Kimi K2.5
49.5 pts/$
$1.05/M
GLM 4.7
47.2 pts/$
$1.07/M
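The blended $/M figures above appear to be a simple average of the input and output prices listed in the pricing table below, i.e. an assumed 50/50 input/output token mix (the page does not state the mix; this sketch just reproduces the displayed numbers under that assumption):

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_share: float = 0.5) -> float:
    """Blended $/M tokens under an assumed input/output token mix.

    The 50/50 default is an inference from the page's figures,
    not a documented methodology.
    """
    return input_per_m * input_share + output_per_m * (1 - input_share)

kimi_blended = blended_price(0.38, 1.72)  # 1.05, matching $1.05/M above
glm_blended = blended_price(0.39, 1.75)   # 1.07, matching $1.07/M above
```

Given a blended price, pts/$ is presumably an aggregate benchmark score divided by that price; e.g. 49.5 pts/$ at $1.05/M would imply an aggregate score near 52.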
Vendor risk
moonshotai · private · undisclosed · Unknown
z-ai · private · undisclosed · Unknown
Head to head
APEX-Agents
Kimi K2.5 leads by +11.3
APEX-Agents · evaluates AI agents on complex, multi-step tasks requiring planning, tool use, and autonomous decision-making in realistic environments.
Kimi K2.5
14.4
GLM 4.7
3.1
Chess Puzzles
Kimi K2.5 leads by +6.0
Chess Puzzles · tests strategic and tactical reasoning by having models solve chess puzzle positions, evaluating lookahead and pattern recognition abilities.
Kimi K2.5
12.0
GLM 4.7
6.0
FrontierMath-2025-02-28-Private
Kimi K2.5 leads by +25.5
FrontierMath (Feb 2025) · original research-level math problems created by mathematicians, testing capabilities at the boundary of current AI mathematical reasoning.
Kimi K2.5
27.9
GLM 4.7
2.4
FrontierMath-Tier-4-2025-07-01-Private
Kimi K2.5 leads by +4.1
FrontierMath Tier 4 (Jul 2025) · the most challenging tier of frontier mathematics, containing problems that push the absolute limits of AI mathematical reasoning.
Kimi K2.5
4.2
GLM 4.7
0.1
GPQA diamond
Kimi K2.5 leads by +5.7
Graduate-Level Google-Proof QA (Diamond set) · expert-crafted questions in physics, biology, and chemistry that are difficult even for domain PhDs.
Kimi K2.5
83.5
GLM 4.7
77.8
OpenCompass · AIME2025
GLM 4.7 leads by +3.5
Kimi K2.5
91.9
GLM 4.7
95.4
OpenCompass · GPQA-Diamond
Kimi K2.5 leads by +1.2
Kimi K2.5
88.1
GLM 4.7
86.9
OpenCompass · HLE
Kimi K2.5 leads by +3.2
Kimi K2.5
28.6
GLM 4.7
25.4
OpenCompass · IFEval
Kimi K2.5 leads by +3.7
Kimi K2.5
93.9
GLM 4.7
90.2
OpenCompass · LiveCodeBenchV6
GLM 4.7 leads by +3.2
Kimi K2.5
80.6
GLM 4.7
83.8
OpenCompass · MMLU-Pro
Kimi K2.5 leads by +2.2
Kimi K2.5
86.2
GLM 4.7
84.0
OTIS Mock AIME 2024-2025
Kimi K2.5 leads by +8.9
OTIS Mock AIME 2024–2025 · simulated American Invitational Mathematics Examination problems testing advanced problem-solving skills.
Kimi K2.5
92.2
GLM 4.7
83.3
PostTrainBench
Kimi K2.5 leads by +2.8
Kimi K2.5
10.3
GLM 4.7
7.5
SimpleBench
GLM 4.7 leads by +1.0
SimpleBench · tests fundamental reasoning capabilities with straightforward problems designed to expose gaps in basic logical and spatial thinking.
Kimi K2.5
36.2
GLM 4.7
37.2
SimpleQA Verified
Kimi K2.5 leads by +2.4
SimpleQA Verified · short factual questions with verified answers, measuring factual accuracy and the tendency to hallucinate or provide incorrect information.
Kimi K2.5
33.9
GLM 4.7
31.5
Terminal Bench
Kimi K2.5 leads by +9.8
Terminal Bench · tests the ability to accomplish real-world tasks using terminal commands, evaluating shell scripting and CLI tool proficiency.
Kimi K2.5
43.2
GLM 4.7
33.4
Full benchmark table
Benchmark | Kimi K2.5 | GLM 4.7
APEX-Agents | 14.4 | 3.1
Chess Puzzles | 12.0 | 6.0
FrontierMath-2025-02-28-Private | 27.9 | 2.4
FrontierMath-Tier-4-2025-07-01-Private | 4.2 | 0.1
GPQA diamond | 83.5 | 77.8
OpenCompass · AIME2025 | 91.9 | 95.4
OpenCompass · GPQA-Diamond | 88.1 | 86.9
OpenCompass · HLE | 28.6 | 25.4
OpenCompass · IFEval | 93.9 | 90.2
OpenCompass · LiveCodeBenchV6 | 80.6 | 83.8
OpenCompass · MMLU-Pro | 86.2 | 84.0
OTIS Mock AIME 2024-2025 | 92.2 | 83.3
PostTrainBench | 10.3 | 7.5
SimpleBench | 36.2 | 37.2
SimpleQA Verified | 33.9 | 31.5
Terminal Bench | 43.2 | 33.4
Pricing · per 1M tokens · projected $/mo at 10M tokens
Model | Input | Output | Context | Projected $/mo
Kimi K2.5 | $0.38 | $1.72 | 262K tokens (~131 books) | $7.17
GLM 4.7 | $0.39 | $1.75 | 203K tokens (~101 books) | $7.30
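The projected $/mo column is consistent with pricing 10M tokens at a roughly 3:1 input:output split. That split is an assumption, not stated on the page: it reproduces GLM 4.7's $7.30 exactly, while Kimi K2.5 comes out to $7.15 vs the listed $7.17 (plausibly rounding in the displayed per-token prices). A minimal sketch under that assumption:

```python
def projected_monthly(input_per_m: float, output_per_m: float,
                      total_m_tokens: float = 10.0,
                      input_share: float = 0.75) -> float:
    """Monthly cost for total_m_tokens (in millions of tokens).

    The 3:1 input:output split is inferred from the table's figures,
    not a documented methodology.
    """
    input_cost = input_per_m * total_m_tokens * input_share
    output_cost = output_per_m * total_m_tokens * (1 - input_share)
    return input_cost + output_cost

glm_monthly = projected_monthly(0.39, 1.75)   # 7.30, matching the table
kimi_monthly = projected_monthly(0.38, 1.72)  # 7.15 vs the listed $7.17
```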