GLM-5

Overview

GLM-5 is Zhipu AI's new-generation flagship foundation model, designed specifically for coding and agent scenarios. It achieves state-of-the-art (SOTA) performance among open-source models on complex system engineering and long-horizon tasks, with a real-world coding experience approaching that of Claude Opus.

Built on a 744B-parameter foundation model and combined with asynchronous reinforcement learning and sparse attention mechanisms, GLM-5 marks a shift from "writing code" to "building systems".

Key Features

Parameter Scale and Data Volume: The base model has grown to 744B parameters (40B activated per token), and the pre-training data has grown to 28.5T tokens, broadening and deepening the model's knowledge.
Ultra-Long Context and Output: Supports a context window of up to 200K tokens and a maximum output length of 128K tokens, which suits large code repositories and multi-step tasks.
Exceptional Coding & Agent Capabilities: Programming capabilities have been systematically strengthened; the model excels at code generation, with low hallucination rates and efficient token utilization.
Multiple Thinking Modes: Offers various thinking modes to support more flexible and in-depth problem-solving.

Best Use Cases

Complex System Engineering: Construction and management of complex software systems, assisting in system design and optimization.
Long-Horizon Agent Tasks: Agent tasks requiring multi-step planning, execution, and feedback (e.g., automated workflows).
High-Precision Code Debugging: Provides human-level coding assistance to improve development efficiency.
Large-Scale Document Analysis: Deep information extraction and summarization for massive document sets.

Capabilities and Limitations

Capability	Detailed Description
Reasoning Ability	Extremely Strong. Excels in complex logical reasoning and multi-step planning.
Creative Ability	Extremely Strong. Particularly adept at code generation and system design.
Multimodal Ability	Primarily focused on text and code; can be integrated with visual tools on the Zhipu platform.
Response Speed	30-50 tokens/s. Balances high-quality output with efficient speed.
Context Window	200K Tokens
Max Output	128K Tokens
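
The limits and speed figures in the table above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of any official SDK: the function names are hypothetical, "200K" and "128K" are read as 200,000 and 128,000 tokens, and it assumes that prompt and output tokens share the context window, which this page does not state explicitly.

```python
# Figures taken from the capabilities table; "K" is assumed to mean 1,000.
CONTEXT_WINDOW = 200_000      # tokens
MAX_OUTPUT = 128_000          # tokens
SPEED_RANGE = (30, 50)        # tokens per second (slow, fast)


def fits_limits(prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if a request stays within the documented limits.

    Assumption: prompt and output tokens both count toward the
    context window.
    """
    if max_output_tokens > MAX_OUTPUT:
        return False
    return prompt_tokens + max_output_tokens <= CONTEXT_WINDOW


def generation_time_seconds(output_tokens: int) -> tuple[float, float]:
    """Return (best_case, worst_case) generation time in seconds,
    based only on the 30-50 tokens/s range in the table."""
    slow, fast = SPEED_RANGE
    return output_tokens / fast, output_tokens / slow


if __name__ == "__main__":
    print(fits_limits(150_000, 40_000))    # within both limits
    print(fits_limits(150_000, 60_000))    # exceeds the context window
    print(generation_time_seconds(3_000))  # rough latency range
```

Under these assumptions, a 3,000-token response takes roughly 60 to 100 seconds to generate, so long outputs are better suited to streaming or background jobs than to interactive requests.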

Credits Usage

Model	Input (Credits/Token)	Cache Write (Credits/Token)	Cache Read (Credits/Token)	Output (Credits/Token)	Web Search (Credits/Use)	Billing Notes
GLM-5	`1.00`	`1.00`	`0.20`	`3.20`	`-`	-
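
As a rough illustration of how the rates above combine, the sketch below estimates the Credits for a single request. The function and dictionary names are hypothetical, and the code assumes billing is linear per token and that each token falls into exactly one category (input, cache write, cache read, or output); how the platform actually assigns tokens to categories is determined by the final billing records.

```python
# Credits per token, copied from the table above.
GLM5_RATES = {
    "input": 1.00,
    "cache_write": 1.00,
    "cache_read": 0.20,
    "output": 3.20,
}


def estimate_credits(
    input_tokens: int = 0,
    cache_write_tokens: int = 0,
    cache_read_tokens: int = 0,
    output_tokens: int = 0,
) -> float:
    """Estimate reference-price Credits for one request.

    Assumption: each token is billed once, at the rate of its category.
    """
    total = (
        input_tokens * GLM5_RATES["input"]
        + cache_write_tokens * GLM5_RATES["cache_write"]
        + cache_read_tokens * GLM5_RATES["cache_read"]
        + output_tokens * GLM5_RATES["output"]
    )
    # Round to avoid floating-point noise in the displayed value.
    return round(total, 2)


if __name__ == "__main__":
    # 10,000 uncached input tokens and 2,000 output tokens.
    print(estimate_credits(input_tokens=10_000, output_tokens=2_000))
    # Same prompt, with 8,000 of the input tokens served from cache.
    print(estimate_credits(input_tokens=2_000, cache_read_tokens=8_000,
                           output_tokens=2_000))
```

For the first call the arithmetic is 10,000 × 1.00 + 2,000 × 3.20 = 16,400 Credits; with 8,000 prompt tokens read from cache it is 2,000 × 1.00 + 8,000 × 0.20 + 2,000 × 3.20 = 10,000 Credits.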

Pricing Note

Prices shown in the documentation are B.AI standard reference prices for base billing purposes. B.AI may provide lower actual usage costs through top-up bonuses and account benefits. Specific prices, bonus Credits, and account benefits are subject to the platform display and final billing records.
