GLM-5: Zhipu AI's 744B Open-Source Model Trained Entirely on Chinese Chips
Zhipu AI releases GLM-5 under MIT license, a 744B MoE model trained on Huawei Ascend chips that rivals Claude Opus 4.5 on coding benchmarks.
A Milestone in Hardware-Independent AI Development
On February 12, 2026, Chinese AI lab Zhipu AI released GLM-5, its fifth-generation large language model featuring approximately 744 billion total parameters in a Mixture of Experts (MoE) architecture. What makes this release particularly significant is not just the model's competitive benchmark performance, but the fact that it was trained entirely on Huawei Ascend chips using the MindSpore framework, achieving full independence from US-manufactured semiconductor hardware.
The model weights are available under the MIT license, one of the most permissive open-source licenses available, signaling Zhipu AI's commitment to open research and broad adoption.
Architecture and Training Details
Mixture of Experts at Scale
GLM-5 employs a MoE architecture with 744 billion total parameters, of which approximately 40 billion are active during inference. This is more than double the size of its predecessor GLM-4.5, which had 355 billion total parameters. The MoE approach allows the model to maintain high performance while keeping inference costs manageable, as only a fraction of the total parameters are activated for any given query.
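The core idea of this sparsity can be shown with a toy sketch. The routing below is a generic top-k MoE gate, not GLM-5's actual router (whose details Zhipu has not fully published); expert count, dimensions, and k=2 are illustrative assumptions.

```python
import numpy as np

def moe_forward(x, expert_weights, gate_weights, k=2):
    """Route one token through the top-k of n experts.

    x: (d,) token vector; expert_weights: (n, d, d), one linear
    layer per expert; gate_weights: (d, n), the router. Only the
    k selected experts' parameters are touched for this token.
    """
    logits = x @ gate_weights                       # (n,) router scores
    top_k = np.argsort(logits)[-k:]                 # indices of chosen experts
    probs = np.exp(logits[top_k] - logits[top_k].max())
    probs /= probs.sum()                            # softmax over chosen experts only
    out = np.zeros_like(x)
    for p, i in zip(probs, top_k):
        out += p * (expert_weights[i] @ x)          # weighted sum of expert outputs
    return out, top_k

rng = np.random.default_rng(0)
n_experts, d = 8, 16
x = rng.standard_normal(d)
experts = rng.standard_normal((n_experts, d, d))
gate = rng.standard_normal((d, n_experts))
y, used = moe_forward(x, experts, gate, k=2)
print(f"experts used per token: {len(used)} of {n_experts}")
```

With 2 of 8 experts active per token, only a quarter of the expert parameters do work on any given input; GLM-5's reported 40B-active-of-744B ratio (roughly 5 percent) is the same principle at scale.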
DeepSeek Sparse Attention
The model incorporates DeepSeek Sparse Attention (DSA), a technique that reduces deployment costs without sacrificing long-context performance. This architectural choice reflects a broader trend in the industry toward efficiency-focused design, where raw parameter count matters less than how those parameters are utilized.
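The efficiency argument behind sparse attention can be sketched as follows. This is a generic top-k attention toy, not DSA's actual selection mechanism, which Zhipu and DeepSeek implement differently; the point is only that per-query cost scales with the number of kept keys rather than the full sequence length.

```python
import numpy as np

def topk_sparse_attention(q, K, V, k=4):
    """Attend to only the top-k keys per query instead of all n.

    Score every key cheaply, keep the k highest-scoring ones,
    and run softmax attention over that subset. Cost of the
    value aggregation is O(k) per query rather than O(n).
    """
    scores = K @ q / np.sqrt(q.shape[0])   # (n,) similarity of q to each key
    keep = np.argsort(scores)[-k:]         # indices of the k best keys
    s = scores[keep]
    w = np.exp(s - s.max())
    w /= w.sum()                           # softmax over the kept subset only
    return w @ V[keep]                     # (d,) weighted sum of selected values

rng = np.random.default_rng(1)
n, d = 64, 8                               # 64 keys, 8-dim heads (illustrative)
q = rng.standard_normal(d)
K = rng.standard_normal((n, d))
V = rng.standard_normal((n, d))
out = topk_sparse_attention(q, K, V, k=4)
print(out.shape)
```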
Training Data and Hardware
GLM-5 was trained on 28.5 trillion tokens, up from 23 trillion in GLM-4.5. The entire training process was conducted on Huawei Ascend chips, with additional support for Moore Threads and Cambricon hardware. This represents a significant achievement in demonstrating that world-class AI models can be developed without relying on NVIDIA GPUs, a point of strategic importance given ongoing US export restrictions on advanced AI chips to China.
Benchmark Performance
GLM-5 demonstrates competitive results across several industry-standard benchmarks, though the picture is nuanced.
Coding and Agent Tasks
On SWE-bench Verified, the standard benchmark for real-world software engineering tasks, GLM-5 scores 77.8 percent. This beats DeepSeek-V3.2 and Kimi K2.5 but still trails Claude Opus 4.5 at 80.9 percent. The roughly three-point gap is notable, but narrow enough to put GLM-5 within striking distance of the proprietary leader.
In Vending Bench 2, which simulates a year of business operations, GLM-5 ended with a balance of $4,432, versus $4,967 for Claude Opus 4.5. On BrowseComp, Zhipu claims GLM-5 surpasses all tested proprietary models for agent-based web search tasks.
Reasoning and General Intelligence
On Humanity's Last Exam (with tools), GLM-5 scores 50.4 points. Terminal-Bench 2.0 results show 56.2 percent on the standard version and 60.7 percent on the verified version. These results place GLM-5 among the top tier of open-source models, though direct comparisons with proprietary models like GPT-5.2 remain contested.
Key Features Beyond Raw Performance
Document Generation
GLM-5 includes built-in document generation capabilities, allowing users to create Word (.docx), PDF, and Excel (.xlsx) files directly from text prompts. This practical feature transforms the model from a pure language processor into a productivity tool.
Agent Mode
The model supports an agent mode with built-in skills for document creation and web browsing tasks. It is compatible with popular coding agent frameworks including Claude Code, OpenCode, and Roo Code through the OpenClaw framework.
Inference Framework Support
GLM-5 supports both vLLM and SGLang inference frameworks, making deployment straightforward for teams already using these popular serving solutions. Model weights are available on Hugging Face, and API access is provided through Zhipu's Z.ai platform.
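Both vLLM and SGLang expose an OpenAI-compatible chat completions endpoint when serving a model, so a client request looks the same against either. The sketch below only constructs the request payload; the model id `zai-org/GLM-5`, the local endpoint URL, and the sampling parameters are assumptions to adjust for your deployment.

```python
import json

# Hypothetical model id and local endpoint; adjust to your deployment.
MODEL_ID = "zai-org/GLM-5"
ENDPOINT = "http://localhost:8000/v1/chat/completions"

# Standard OpenAI-style chat completions payload, as accepted by
# vLLM's and SGLang's OpenAI-compatible servers.
payload = {
    "model": MODEL_ID,
    "messages": [
        {"role": "user", "content": "Write a Python function that reverses a string."}
    ],
    "max_tokens": 512,
    "temperature": 0.7,
}

# You would POST this JSON to ENDPOINT, e.g.:
#   requests.post(ENDPOINT, json=payload).json()
body = json.dumps(payload)
print(json.loads(body)["model"])
```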
Strategic Significance
Hardware Independence
The most consequential aspect of GLM-5 is not its benchmark scores but its proof that competitive large language models can be trained without access to NVIDIA hardware. As US export controls continue to restrict the sale of advanced AI chips to China, the ability to train on domestic hardware like Huawei Ascend becomes a strategic imperative. GLM-5 demonstrates that this is technically feasible at a world-class level.
Open Source Under MIT License
By releasing under the MIT license rather than a more restrictive custom license, Zhipu AI enables unrestricted commercial use, modification, and redistribution. This positions GLM-5 as one of the most permissively licensed large-scale MoE models available, potentially accelerating adoption in both research and commercial applications.
Market Impact
Zhipu AI's stock surged approximately 26 percent following the GLM-5 announcement, reflecting market confidence in the model's competitive positioning. The release intensifies competition in the open-source LLM space, where Meta's Llama and Mistral AI have been dominant players.
Limitations and Considerations
While GLM-5's benchmarks are impressive, several caveats deserve attention. Many benchmark comparisons are self-reported by Zhipu AI and await independent verification. The model's performance on non-English tasks, particularly outside Chinese and English, has not been extensively documented. Additionally, the 40B active parameter count, while efficient, may limit performance on certain tasks compared to dense models of similar total size.
Who Should Pay Attention
GLM-5 is particularly relevant for organizations exploring open-source alternatives to proprietary models, developers building agent-based applications who need strong coding performance, researchers interested in MoE architectures and efficient training approaches, and companies in regions where hardware supply chain independence is strategically important.
The model represents a meaningful step forward in demonstrating that the open-source AI ecosystem can produce models competitive with the best proprietary offerings, even when trained on non-standard hardware.
Pros
- MIT license enables unrestricted commercial use and modification
- Competitive coding benchmarks rivaling top proprietary models
- Hardware independence from NVIDIA GPUs is strategically significant
- Built-in document generation and agent mode add practical value
- Strong open-source ecosystem support with Hugging Face and popular frameworks
Cons
- Benchmark claims are largely self-reported and await independent verification
- Still trails Claude Opus 4.5 on key benchmarks like SWE-bench Verified
- Limited documentation on non-English and non-Chinese language performance
- 40B active parameters may limit performance on certain dense-model-favored tasks
Key Features
Zhipu AI released GLM-5 on February 12, 2026, a 744B parameter MoE model with 40B active parameters, trained entirely on Huawei Ascend chips. It scores 77.8% on SWE-bench Verified (close to Claude Opus 4.5's 80.9%), uses DeepSeek Sparse Attention for efficiency, was trained on 28.5 trillion tokens, and is released under the permissive MIT license with weights on Hugging Face.
Key Insights
- 744B total parameters with only 40B active at inference, achieving strong efficiency through MoE architecture
- Trained entirely on Huawei Ascend chips, proving world-class AI development is possible without NVIDIA hardware
- SWE-bench Verified score of 77.8% places it within 3 percentage points of Claude Opus 4.5
- MIT license makes it one of the most permissively licensed large-scale MoE models available
- DeepSeek Sparse Attention reduces deployment costs while maintaining long-context performance
- 28.5 trillion training tokens, up from 23 trillion in the predecessor GLM-4.5
- Stock surged 26% post-announcement, reflecting strong market confidence
- Compatible with major inference frameworks (vLLM, SGLang) and coding agents (Claude Code, OpenCode)
