Feb 15, 2026

GLM-5: Zhipu AI's 744B Open-Source Model Trained Entirely on Chinese Chips

Zhipu AI releases GLM-5 under MIT license, a 744B MoE model trained on Huawei Ascend chips that rivals Claude Opus 4.5 on coding benchmarks.

Tags: Zhipu AI, GLM-5, Open Source, MoE, Huawei Ascend

A Milestone in Hardware-Independent AI Development

On February 12, 2026, Chinese AI lab Zhipu AI released GLM-5, its fifth-generation large language model featuring approximately 744 billion total parameters in a Mixture of Experts (MoE) architecture. What makes this release particularly significant is not just the model's competitive benchmark performance, but the fact that it was trained entirely on Huawei Ascend chips using the MindSpore framework, achieving full independence from US-manufactured semiconductor hardware.

The model weights are available under the MIT license, one of the most permissive open-source licenses available, signaling Zhipu AI's commitment to open research and broad adoption.

Architecture and Training Details

Mixture of Experts at Scale

GLM-5 employs a MoE architecture with 744 billion total parameters, of which approximately 40 billion are active during inference. This is nearly double the size of its predecessor GLM-4.5, which had 355 billion total parameters. The MoE approach allows the model to maintain high performance while keeping inference costs manageable, as only a fraction of the total parameters are activated for any given query.
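To make the "active fraction" concrete: in a MoE layer, a small gating network scores all experts per token, and only the top-k experts actually run. For GLM-5 that fraction is roughly 40/744, about 5 percent of total parameters per token. The following is a minimal illustrative sketch of top-k routing in NumPy, not GLM-5's actual router (expert count, k, and the gating function here are arbitrary toy choices):

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Toy top-k MoE layer: route each token to its k highest-scoring
    experts and mix their outputs by softmax-normalized gate scores.
    Illustrative only -- not GLM-5's actual routing scheme."""
    logits = x @ gate_w                          # (tokens, n_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]   # indices of k best experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        scores = logits[t, topk[t]]
        weights = np.exp(scores) / np.exp(scores).sum()  # softmax over top-k
        for w, e in zip(weights, topk[t]):
            out[t] += w * experts[e](x[t])       # only k experts execute
    return out

rng = np.random.default_rng(0)
d, n_experts, k = 8, 16, 2
x = rng.standard_normal((4, d))
gate_w = rng.standard_normal((d, n_experts))
# Each "expert" is just a fixed linear map in this toy example.
mats = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(n_experts)]
experts = [lambda v, m=m: v @ m for m in mats]

y = moe_forward(x, gate_w, experts, k)
print(y.shape)  # (4, 8)
print(f"active experts per token: {k}/{n_experts} = {k / n_experts:.1%}")
```

Compute cost per token scales with the k experts that run, not with the full expert bank, which is why a 744B-parameter model can serve queries at roughly the cost of a 40B dense model.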

DeepSeek Sparse Attention

The model incorporates DeepSeek Sparse Attention (DSA), a technique that reduces deployment costs without sacrificing long-context performance. This architectural choice reflects a broader trend in the industry toward efficiency-focused design, where raw parameter count matters less than how those parameters are utilized.
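The core idea behind top-k sparse attention schemes like DSA is that each query attends to only a small, selected subset of keys instead of the full sequence, cutting the cost of long-context inference. The sketch below shows that general mechanism with a simple score-based top-k mask; it is not DSA's actual selection mechanism (DeepSeek's published design uses a separate lightweight indexer), just the shape of the idea:

```python
import numpy as np

def sparse_attention(q, k_mat, v, top_k=8):
    """Toy sparse attention: each query attends only to its top_k
    highest-scoring keys; all other attention weights are exactly zero.
    Illustrates the general top-k idea, not DSA's exact mechanism."""
    scores = q @ k_mat.T / np.sqrt(q.shape[-1])       # (n_q, n_k)
    # Mask out everything except each query's top_k keys.
    drop = np.argsort(scores, axis=-1)[:, :-top_k]    # lowest-scoring keys
    masked = scores.copy()
    np.put_along_axis(masked, drop, -np.inf, axis=-1)
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(1)
seq, d = 64, 16
q = rng.standard_normal((seq, d))
k_mat = rng.standard_normal((seq, d))
v = rng.standard_normal((seq, d))

out, w = sparse_attention(q, k_mat, v, top_k=8)
print(out.shape)                   # (64, 16)
print((w > 0).sum(axis=-1).max())  # at most 8 nonzero weights per query
```

In a full implementation the selection itself must be cheap (otherwise computing all scores defeats the purpose), which is where the efficiency engineering of techniques like DSA lives.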

Training Data and Hardware

GLM-5 was trained on 28.5 trillion tokens, up from 23 trillion in GLM-4.5. The entire training process was conducted on Huawei Ascend chips, with additional support for Moore Threads and Cambricon hardware. This represents a significant achievement in demonstrating that world-class AI models can be developed without relying on NVIDIA GPUs, a point of strategic importance given ongoing US export restrictions on advanced AI chips to China.

Benchmark Performance

GLM-5 demonstrates competitive results across several industry-standard benchmarks, though the picture is nuanced.

Coding and Agent Tasks

On SWE-bench Verified, the standard benchmark for real-world software engineering tasks, GLM-5 scores 77.8 percent. This beats DeepSeek-V3.2 and Kimi K2.5 but still trails Claude Opus 4.5 at 80.9 percent, a gap of roughly three percentage points: notable, but within striking distance.

In Vending Bench 2, which simulates a year of business operations, GLM-5 achieved a balance of $4,432 compared to Claude Opus 4.5's $4,967. On BrowseComp, Zhipu claims GLM-5 surpasses all tested proprietary models for agent-based web search tasks.

Reasoning and General Intelligence

On Humanity's Last Exam (with tools), GLM-5 scores 50.4 points. Terminal-Bench 2.0 results show 56.2 percent on the standard version and 60.7 percent on the verified version. These results place GLM-5 among the top tier of open-source models, though direct comparisons with proprietary models like GPT-5.2 remain contested.

Key Features Beyond Raw Performance

Document Generation

GLM-5 includes built-in document generation capabilities, allowing users to create Word (.docx), PDF, and Excel (.xlsx) files directly from text prompts. This practical feature transforms the model from a pure language processor into a productivity tool.

Agent Mode

The model supports an agent mode with built-in skills for document creation and web browsing tasks. It is compatible with popular coding agent frameworks including Claude Code, OpenCode, and Roo Code through the OpenClaw framework.

Inference Framework Support

GLM-5 supports both vLLM and SGLang inference frameworks, making deployment straightforward for teams already using these popular serving solutions. Model weights are available on Hugging Face, and API access is provided through Zhipu's Z.ai platform.
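For teams already running vLLM, deployment would follow the usual OpenAI-compatible serving pattern. The fragment below is a hypothetical sketch: the Hugging Face model id (`zai-org/GLM-5`), parallelism degree, and context length are assumptions, not values confirmed by the release; check the model card before use.

```shell
# Hypothetical deployment sketch -- model id and flag values are
# assumptions, not confirmed by the GLM-5 release notes.
pip install vllm

# Serve an OpenAI-compatible endpoint. A 744B-total-parameter MoE model
# requires a multi-GPU node, hence the tensor-parallel setting.
vllm serve zai-org/GLM-5 \
    --tensor-parallel-size 8 \
    --max-model-len 131072
```

Once the server is up, any OpenAI-compatible client can point at it, which is the main practical appeal of vLLM and SGLang support.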

Strategic Significance

Hardware Independence

The most consequential aspect of GLM-5 is not its benchmark scores but its proof that competitive large language models can be trained without access to NVIDIA hardware. As US export controls continue to restrict the sale of advanced AI chips to China, the ability to train on domestic hardware like Huawei Ascend becomes a strategic imperative. GLM-5 demonstrates that this is technically feasible at a world-class level.

Open Source Under MIT License

By releasing under the MIT license rather than a more restrictive custom license, Zhipu AI enables unrestricted commercial use, modification, and redistribution. This positions GLM-5 as one of the most permissively licensed large-scale MoE models available, potentially accelerating adoption in both research and commercial applications.

Market Impact

Zhipu AI's stock surged approximately 26 percent following the GLM-5 announcement, reflecting market confidence in the model's competitive positioning. The release intensifies competition in the open-source LLM space, where Meta's Llama and Mistral AI have been dominant players.

Limitations and Considerations

While GLM-5's benchmarks are impressive, several caveats deserve attention. Many benchmark comparisons are self-reported by Zhipu AI and await independent verification. The model's performance on non-English tasks, particularly outside Chinese and English, has not been extensively documented. Additionally, the 40B active parameter count, while efficient, may limit performance on certain tasks compared to dense models of similar total size.

Who Should Pay Attention

GLM-5 is particularly relevant for organizations exploring open-source alternatives to proprietary models, developers building agent-based applications who need strong coding performance, researchers interested in MoE architectures and efficient training approaches, and companies in regions where hardware supply chain independence is strategically important.

The model represents a meaningful step forward in demonstrating that the open-source AI ecosystem can produce models competitive with the best proprietary offerings, even when trained on non-standard hardware.

Pros

  • MIT license enables unrestricted commercial use and modification
  • Competitive coding benchmarks rivaling top proprietary models
  • Hardware independence from NVIDIA GPUs is strategically significant
  • Built-in document generation and agent mode add practical value
  • Strong open-source ecosystem support with Hugging Face and popular frameworks

Cons

  • Benchmark claims are largely self-reported and await independent verification
  • Still trails Claude Opus 4.5 on key benchmarks like SWE-bench Verified
  • Limited documentation on non-English and non-Chinese language performance
  • With only 40B active parameters, it may trail dense models of similar total size on some tasks


Key Features

Zhipu AI released GLM-5 on February 12, 2026, a 744B parameter MoE model with 40B active parameters, trained entirely on Huawei Ascend chips. It scores 77.8% on SWE-bench Verified (close to Claude Opus 4.5's 80.9%), uses DeepSeek Sparse Attention for efficiency, was trained on 28.5 trillion tokens, and is released under the permissive MIT license with weights on Hugging Face.

Key Insights

  • 744B total parameters with only 40B active at inference, achieving strong efficiency through MoE architecture
  • Trained entirely on Huawei Ascend chips, proving world-class AI development is possible without NVIDIA hardware
  • SWE-bench Verified score of 77.8% places it within 3 percentage points of Claude Opus 4.5
  • MIT license makes it one of the most permissively licensed large-scale MoE models available
  • DeepSeek Sparse Attention reduces deployment costs while maintaining long-context performance
  • 28.5 trillion training tokens, up from 23 trillion in the predecessor GLM-4.5
  • Stock surged 26% post-announcement, reflecting strong market confidence
  • Compatible with major inference frameworks (vLLM, SGLang) and coding agents (Claude Code, OpenCode)
