LongCat-Flash-Chat is a large-scale Mixture-of-Experts (MoE) model with 560B total parameters, of which 18.6B–31.3B (≈27B on average) are dynamically activated per input. It introduces a shortcut-connected MoE design to reduce...
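To put the activation figures above in proportion, here is a minimal sketch that computes the share of the 560B total parameters active per input. It uses only the numbers quoted in the description; the function and variable names are illustrative assumptions and not part of any LongCat tooling.

```python
# Share of total parameters that are active, using the figures quoted above.
# All names here are hypothetical helpers for illustration only.

TOTAL_PARAMS_B = 560.0  # total parameters, in billions


def active_share(active_params_b: float, total_params_b: float = TOTAL_PARAMS_B) -> float:
    """Return the percentage of parameters activated, rounded to one decimal."""
    return round(100.0 * active_params_b / total_params_b, 1)


low = active_share(18.6)   # lower bound of the quoted activation range
high = active_share(31.3)  # upper bound of the quoted activation range
mean = active_share(27.0)  # quoted approximate average

print(f"active share: {low}% to {high}% (about {mean}% on average)")
```

Under these figures, only roughly one parameter in twenty is used for any given input, which is the usual motivation for a sparse MoE design over a dense model of the same total size.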
Tested on 7 benchmarks, with an average score of 59.8%. Top scores: Chatbot Arena Elo (Overall): 1401.1; OpenCompass IFEval: 90.2%; OpenCompass MMLU-Pro: 81.0%.
For comparison, Phi 4 scores 54.2 (101% as good) at $0.07 per 1M input tokens, which is 68% cheaper.
OpenCompass evaluation on LiveCodeBench v6. Uses fresh competitive programming problems to evaluate code generation without memorization.
OpenCompass evaluation on AIME 2025 problems. Tests mathematical reasoning on fresh competition problems.
OpenCompass MMLU-Pro evaluation. Harder knowledge test with more answer choices.
OpenCompass evaluation of GPQA Diamond. PhD-level science questions from the hardest subset.
OpenCompass evaluation of Humanity's Last Exam. Expert-level cross-discipline knowledge test.
- Type: text
- Context: 131K tokens (~66 books)
- Released: Sep 2025
- License: Open Source
- Status: Active
- Cost / Message: ~$0.001