Compared with GLM-4.5, this generation brings several key improvements. Longer context window: expanded from 128K to 200K tokens, enabling the model to handle more complex...
Tested on 20 benchmarks with a 50.8% average score. Top scores: Chatbot Arena Elo — Overall (1425.8), Chatbot Arena Elo — Coding (1353.7), OpenCompass — AIME2025 (90.3%).
Phi 4 scores 54.2 (100% as good) at $0.07/1M input tokens · 83% cheaper
OpenCompass Live Code Bench v6. Fresh competitive programming problems to evaluate code generation without memorization.
Regularly refreshed coding problems that avoid data contamination. New problems added monthly to prevent memorization.
LiveBench coding tasks that require multi-step reasoning and tool use. Tests planning and execution of complex coding workflows.
Regularly refreshed reasoning problems testing logical deduction, spatial reasoning, and analytical thinking.
Fresh data analysis tasks testing ability to interpret tables, charts, and statistical data.
OpenCompass evaluation on AIME 2025 problems. Tests mathematical reasoning on fresh competition problems.
Regularly updated math problems that test numerical reasoning, algebra, calculus, and combinatorics.
Original research-level math problems created by professional mathematicians. Problems are unpublished and cannot be memorized.
- Type: text
- Context: 205K tokens (~102 books)
- Released: Sep 2025
- License: Open Source
- Status: Active
- Cost / Message: ~$0.003
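A per-message figure like the one above is normally derived from per-million-token pricing and an assumed average message size. The sketch below shows that arithmetic; the token counts and the dollar rates in it are illustrative assumptions, not published prices for this model:

```python
def cost_per_message(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the cost of a single chat turn from per-1M-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical values chosen only to illustrate the calculation:
# a 1,000-token prompt, a 1,000-token reply, $0.60/1M input, $2.20/1M output.
estimate = cost_per_message(input_tokens=1_000, output_tokens=1_000,
                            input_price_per_m=0.60, output_price_per_m=2.20)
print(f"${estimate:.4f}")  # prints "$0.0028", i.e. roughly ~$0.003 per message
```

Real per-message cost varies with prompt length, reply length, and any cached-context discounts, so listings like this one typically quote an order-of-magnitude average.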