Kimi K2 Thinking

Open source

By moonshotai · Released 2025-11-06

Average score: 53.3
Input price: $0.60/1M tokens
Output price: $2.50/1M tokens
Context window: 262K tokens (~131 books)
Type: text
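
The listed per-token prices can be turned into a rough per-request cost estimate. The sketch below is only illustrative: the helper name `estimate_cost_usd` is hypothetical, and it assumes simple linear billing with no caching discounts, tiers, or minimum charges, none of which the listing specifies.

```python
# Prices copied from the listing (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.60
OUTPUT_PRICE_PER_M = 2.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: linear per-token cost.

    Assumes no caching discounts or pricing tiers (not stated in the listing).
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example request: 100K input tokens and 4K output tokens.
    print(f"${estimate_cost_usd(100_000, 4_000):.4f}")
```

Under these assumptions, output tokens cost a little over four times as much as input tokens, so long generations can dominate the bill even when the prompt is large.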

Tested on 25 benchmarks, with an average score of 53.3%. Top scores: OpenCompass — AIME2025 (94.1%), OpenCompass — IFEval (92.4%), OpenCompass — MMLU-Pro (84.3%).

| Benchmark | Category | Score (%) |
| --- | --- | --- |
| OpenCompass — AIME2025 | math | 94.1 |
| OpenCompass — IFEval | language | 92.4 |
| OpenCompass — MMLU-Pro | knowledge | 84.3 |
| OTIS Mock AIME 2024-2025 | math | 83.0 |
| OpenCompass — GPQA-Diamond | knowledge | 82.7 |
| LiveBench — Mathematics | math | 81.1 |
| GPQA diamond | knowledge | 79.0 |
| OpenCompass — LiveCodeBenchV6 | coding | 77.1 |
| LiveBench — Coding | coding | 67.4 |
| LiveBench — Language | language | 66.5 |
| LiveBench — Reasoning | reasoning | 63.5 |
| SWE-Bench Verified (Bash Only) | coding | 63.4 |
| LiveBench — If | language | 62.0 |
| LiveBench — Overall | knowledge | 61.6 |
| LiveBench — Data Analysis | reasoning | 52.3 |
| WeirdML | coding | 42.8 |
| LiveBench — Agentic Coding | coding | 38.3 |
| Terminal Bench | coding | 35.7 |
| SimpleQA Verified | knowledge | 31.6 |
| FrontierMath-2025-02-28-Private | math | 21.4 |
| OpenCompass — HLE | knowledge | 21.3 |
| Chess Puzzles | knowledge | 20.0 |
| PostTrainBench | knowledge | 7.3 |
| APEX-Agents | agentic | 4.0 |
| FrontierMath-Tier-4-2025-07-01-Private | math | 0.1 |
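
The page does not say how the 53.3 average is computed. A quick check suggests it matches an unweighted mean of the 25 listed scores; the sketch below assumes exactly that (no per-category weighting), with the scores copied from the benchmark list in order.

```python
# Scores copied from the benchmark list, top to bottom.
scores = [
    94.1, 92.4, 84.3, 83.0, 82.7,
    81.1, 79.0, 77.1, 67.4, 66.5,
    63.5, 63.4, 62.0, 61.6, 52.3,
    42.8, 38.3, 35.7, 31.6, 21.4,
    21.3, 20.0, 7.3, 4.0, 0.1,
]

# Assumption: the reported average is a plain (unweighted) arithmetic mean.
mean_score = sum(scores) / len(scores)

if __name__ == "__main__":
    print(f"{len(scores)} benchmarks, mean = {mean_score:.1f}")
```

Because the mean is unweighted, the five OpenCompass entries and the eight LiveBench entries each count individually, so suites with many sub-benchmarks have more influence on the headline number.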