GLM-5: Zhipu AI's 744B Open-Source Model Trained Entirely on Chinese Chips
Zhipu AI releases GLM-5 under MIT license, a 744B MoE model trained on Huawei Ascend chips that rivals Claude Opus 4.5 on coding benchmarks.
Zhipu AI releases GLM-5 under MIT license, a 744B MoE model trained on Huawei Ascend chips that rivals Claude Opus 4.5 on coding benchmarks.
A Milestone in Hardware-Independent AI Development
On February 12, 2026, Chinese AI lab Zhipu AI released GLM-5, its fifth-generation large language model featuring approximately 744 billion total parameters in a Mixture of Experts (MoE) architecture. What makes this release particularly significant is not just the model's competitive benchmark performance, but the fact that it was trained entirely on Huawei Ascend chips using the MindSpore framework, achieving full independence from US-manufactured semiconductor hardware.
The model weights are available under the MIT license, one of the most permissive open-source licenses available, signaling Zhipu AI's commitment to open research and broad adoption.
Architecture and Training Details
Mixture of Experts at Scale
GLM-5 employs a MoE architecture with 744 billion total parameters, of which approximately 40 billion are active during inference. This is nearly double the size of its predecessor GLM-4.5, which had 355 billion total parameters. The MoE approach allows the model to maintain high performance while keeping inference costs manageable, as only a fraction of the total parameters are activated for any given query.
DeepSeek Sparse Attention
The model incorporates DeepSeek Sparse Attention (DSA), a technique that reduces deployment costs without sacrificing long-context performance. This architectural choice reflects a broader trend in the industry toward efficiency-focused design, where raw parameter count matters less than how those parameters are utilized.
Training Data and Hardware
GLM-5 was trained on 28.5 trillion tokens, up from 23 trillion in GLM-4.5. The entire training process was conducted on Huawei Ascend chips, with additional support for Moore Threads and Cambricon hardware. This represents a significant achievement in demonstrating that world-class AI models can be developed without relying on NVIDIA GPUs, a point of strategic importance given ongoing US export restrictions on advanced AI chips to China.
Benchmark Performance
GLM-5 demonstrates competitive results across several industry-standard benchmarks, though the picture is nuanced.
Coding and Agent Tasks
On SWE-bench Verified, the standard benchmark for real-world software engineering tasks, GLM-5 scores 77.8 percent. This beats DeepSeek-V3.2 and Kimi K2.5 but still trails Claude Opus 4.5 at 80.9 percent. The gap of roughly 3 percentage points is notable but narrow enough to be within striking distance.
In Vending Bench 2, which simulates a year of business operations, GLM-5 achieved a balance of $4,432 compared to Claude Opus 4.5's $4,967. On BrowseComp, Zhipu claims GLM-5 surpasses all tested proprietary models for agent-based web search tasks.
Reasoning and General Intelligence
On Humanity's Last Exam (with tools), GLM-5 scores 50.4 points. Terminal-Bench 2.0 results show 56.2 percent on the standard version and 60.7 percent on the verified version. These results place GLM-5 among the top tier of open-source models, though direct comparisons with proprietary models like GPT-5.2 remain contested.
Key Features Beyond Raw Performance
Document Generation
GLM-5 includes built-in document generation capabilities, allowing users to create Word (.docx), PDF, and Excel (.xlsx) files directly from text prompts. This practical feature transforms the model from a pure language processor into a productivity tool.
Agent Mode
The model supports an agent mode with built-in skills for document creation and web browsing tasks. It is compatible with popular coding agent frameworks including Claude Code, OpenCode, and Roo Code through the OpenClaw framework.
Inference Framework Support
GLM-5 supports both vLLM and SGLang inference frameworks, making deployment straightforward for teams already using these popular serving solutions. Model weights are available on Hugging Face, and API access is provided through Zhipu's Z.ai platform.
Strategic Significance
Hardware Independence
The most consequential aspect of GLM-5 is not its benchmark scores but its proof that competitive large language models can be trained without access to NVIDIA hardware. As US export controls continue to restrict the sale of advanced AI chips to China, the ability to train on domestic hardware like Huawei Ascend becomes a strategic imperative. GLM-5 demonstrates that this is technically feasible at a world-class level.
Open Source Under MIT License
By releasing under the MIT license rather than a more restrictive custom license, Zhipu AI enables unrestricted commercial use, modification, and redistribution. This positions GLM-5 as one of the most permissively licensed large-scale MoE models available, potentially accelerating adoption in both research and commercial applications.
Market Impact
Zhipu AI's stock surged approximately 26 percent following the GLM-5 announcement, reflecting market confidence in the model's competitive positioning. The release intensifies competition in the open-source LLM space, where Meta's Llama and Mistral AI have been dominant players.
Limitations and Considerations
While GLM-5's benchmarks are impressive, several caveats deserve attention. Many benchmark comparisons are self-reported by Zhipu AI and await independent verification. The model's performance on non-English tasks, particularly outside Chinese and English, has not been extensively documented. Additionally, the 40B active parameter count, while efficient, may limit performance on certain tasks compared to dense models of similar total size.
Who Should Pay Attention
GLM-5 is particularly relevant for organizations exploring open-source alternatives to proprietary models, developers building agent-based applications who need strong coding performance, researchers interested in MoE architectures and efficient training approaches, and companies in regions where hardware supply chain independence is strategically important.
The model represents a meaningful step forward in demonstrating that the open-source AI ecosystem can produce models competitive with the best proprietary offerings, even when trained on non-standard hardware.
Editor's Verdict
GLM-5: Zhipu AI's 744B Open-Source Model Trained Entirely on Chinese Chips earns a solid recommendation within the other llm space.
The strongest case for paying attention is MIT license enables unrestricted commercial use and modification, which raises the bar for what readers should now expect from peers in this space. Reinforcing that, competitive coding benchmarks rivaling top proprietary models adds practical value rather than just headline appeal. The broader signal worth registering is straightforward: 744B total parameters with only 40B active at inference, achieving strong efficiency through MoE architecture. On the other side of the ledger, benchmark claims are largely self-reported and await independent verification is a real constraint, not a marketing footnote, and it should factor into any serious decision. Layered on top of that, still trails Claude Opus 4.5 on key benchmarks like SWE-bench Verified narrows the set of teams for whom this is an obvious yes.
For multi-model deployment teams, cost-conscious operators, and developers willing to evaluate beyond the major labs, this is a serious evaluation candidate, not just a curiosity to bookmark. For everyone else, the safer posture is to monitor coverage and revisit once the use cases that matter to your team are demonstrated in the wild.
Pros
- MIT license enables unrestricted commercial use and modification
- Competitive coding benchmarks rivaling top proprietary models
- Hardware independence from NVIDIA GPUs is strategically significant
- Built-in document generation and agent mode add practical value
- Strong open-source ecosystem support with Hugging Face and popular frameworks
Cons
- Benchmark claims are largely self-reported and await independent verification
- Still trails Claude Opus 4.5 on key benchmarks like SWE-bench Verified
- Limited documentation on non-English and non-Chinese language performance
- 40B active parameters may limit performance on certain dense-model-favored tasks
References
Comments0
Key Features
Zhipu AI released GLM-5 on February 12, 2026, a 744B parameter MoE model with 40B active parameters, trained entirely on Huawei Ascend chips. It scores 77.8% on SWE-bench Verified (close to Claude Opus 4.5's 80.9%), uses DeepSeek Sparse Attention for efficiency, was trained on 28.5 trillion tokens, and is released under the permissive MIT license with weights on Hugging Face.
Key Insights
- 744B total parameters with only 40B active at inference, achieving strong efficiency through MoE architecture
- Trained entirely on Huawei Ascend chips, proving world-class AI development is possible without NVIDIA hardware
- SWE-bench Verified score of 77.8% places it within 3 percentage points of Claude Opus 4.5
- MIT license makes it one of the most permissively licensed large-scale MoE models available
- DeepSeek Sparse Attention reduces deployment costs while maintaining long-context performance
- 28.5 trillion training tokens, up from 23 trillion in the predecessor GLM-4.5
- Stock surged 26% post-announcement, reflecting strong market confidence
- Compatible with major inference frameworks (vLLM, SGLang) and coding agents (Claude Code, OpenCode)
Was this review helpful?
Share
Related AI Reviews
xAI Grok Build 0.1: Terminal-Native Coding Agent Enters Public Beta with Parallel Subagents
xAI released Grok Build 0.1 to public beta on May 28, 2026, a terminal-native coding model with 256K context, parallel subagents, plan mode, and $1/M token pricing to compete with Claude Code.
DeepSeek Makes V4-Pro Price Cut Permanent: 75% Off, Reshaping Frontier AI Economics
DeepSeek officially made its 75% price reduction on V4-Pro permanent on May 22, 2026, pricing output at $0.87/MTok versus rivals charging 30-34x more for comparable performance.
SubQ Launches: The First Subquadratic LLM With a 12 Million Token Context Window
Subquadratic debuted SubQ on May 5, 2026 with $29M seed funding, claiming a 12M-token context window and up to 1,000x lower compute cost than frontier transformer models.
Alibaba Qwen3.7-Max Review: 35-Hour Autonomous Agent, 80.4% SWE Score
Alibaba's Qwen3.7-Max redefines the frontier of agentic AI with a 1M-token context, 80.4% SWE-Verified coding score, and a verified 35-hour continuous autonomous coding run firing 1,158 tool calls.
