Google Launches Gemma 4: Four Open Models With Agentic Skills Under Apache 2.0
Google DeepMind releases Gemma 4, a family of four open-weight models from 2B to 31B parameters, under Apache 2.0, designed for advanced reasoning and edge deployment.
The Most Capable Open Models Google Has Ever Released
Google DeepMind has launched Gemma 4, a family of four open-weight models that the company calls its most intelligent open models to date. Released on April 2, 2026, under the permissive Apache 2.0 license, Gemma 4 is purpose-built for advanced reasoning and agentic workflows, delivering what Google describes as an unprecedented level of intelligence-per-parameter.
The release comes at a critical moment in the open-source AI landscape. Chinese labs like DeepSeek and Qwen have been gaining ground with competitive open models, and Meta's Llama series remains the most widely deployed open-weight family. By releasing Gemma 4 under Apache 2.0, a significant upgrade from the restrictive Gemma Use Policy that governed earlier versions, Google is making its strongest bid yet to win over the developer community.
Four Models for Every Scale
Gemma 4 arrives in four distinct sizes, organized into two tiers designed for different deployment scenarios.
The edge tier includes two compact models. Gemma 4 E2B has 2.3 billion effective parameters and is designed for smartphones, embedded devices, and IoT hardware. Gemma 4 E4B has 4.5 billion effective parameters and targets laptops and mid-range devices. Both edge models support text, image, and native audio input with 128K-token context windows, making them capable of speech recognition and visual understanding directly on-device.
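To see why these parameter counts suit on-device deployment, a back-of-envelope weight-memory estimate helps. The parameter counts below come from the article; the bytes-per-parameter figures are generic quantization assumptions, not official Gemma 4 deployment numbers, and real memory use adds activations and KV cache on top.

```python
# Rough weight-only memory estimate for the edge models at common
# quantization levels (assumption: uniform bits per parameter).

def model_memory_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes), weights only."""
    return params_billions * 1e9 * (bits_per_param / 8) / 1e9

for name, params in [("E2B", 2.3), ("E4B", 4.5)]:
    for bits in (16, 8, 4):
        print(f"{name} @ {bits}-bit: ~{model_memory_gb(params, bits):.2f} GB")
```

At 4-bit quantization the E2B weights fit in roughly 1.2 GB, which is why a model of this size is plausible on a modern smartphone.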
The workstation tier includes the larger models. Gemma 4 26B is a Mixture-of-Experts (MoE) architecture with 26 billion total parameters but only 3.8 billion active parameters at any given time, providing strong performance with efficient resource use. Gemma 4 31B is a dense model with 31 billion parameters, the most capable in the family. Both workstation models support text and image input with 256K-token context windows.
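The "26B total, 3.8B active" distinction comes from top-k expert routing: each token is sent to only a few experts, so most weights sit idle on any given forward pass. The sketch below illustrates the general technique in plain Python; the expert count and k are arbitrary choices for illustration, not Gemma 4's actual configuration.

```python
import math
import random

random.seed(0)
N_EXPERTS, TOP_K = 8, 2  # illustrative: route each token to 2 of 8 experts

def route(gate_logits, top_k=TOP_K):
    """Pick the top-k experts by gate score and softmax-normalize them."""
    top = sorted(range(len(gate_logits)), key=gate_logits.__getitem__)[-top_k:]
    exps = [math.exp(gate_logits[i]) for i in top]
    total = sum(exps)
    # Only these k experts' weight matrices are touched for this token.
    return [(i, e / total) for i, e in zip(top, exps)]

logits = [random.gauss(0, 1) for _ in range(N_EXPERTS)]
print(route(logits))                               # 2 (expert, weight) pairs
print(f"active fraction: {TOP_K / N_EXPERTS:.0%}") # 25% of experts per token
```

The same logic at scale is how a 26B-parameter model can run with the per-token compute of a model several times smaller.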
Benchmark Performance That Challenges Closed Models
Gemma 4's benchmark results are remarkable for open-weight models. The 31B dense model scores 85.2% on MMLU Pro, 89.2% on AIME 2026, and 80.0% on LiveCodeBench v6, with a Codeforces Elo rating of 2,150. These numbers place it in direct competition with much larger closed-source models.
The 26B MoE model is equally impressive relative to its active parameter count. It ranks 6th on Arena AI while using only 3.8 billion active parameters, a fraction of what competing models require. This efficiency makes it particularly attractive for organizations that need strong performance without massive GPU clusters.
Across both tiers, Gemma 4 demonstrates significant improvements in mathematical reasoning, instruction-following, and multi-step planning compared to its predecessors.
Built for Agentic Workflows
Gemma 4's most forward-looking feature is its native support for agentic tasks. The models can perform multi-step planning, autonomous action execution, and tool use without specialized fine-tuning. This means developers can build AI agents that reason through complex tasks, execute code, search the web, and interact with APIs using Gemma 4 as the base model.
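The plan-act-observe loop described above can be sketched in a few lines. Everything here is hypothetical scaffolding for illustration: the tool registry, message format, and the `fake_model` stub are inventions of this sketch, and a real deployment would replace `fake_model` with actual Gemma 4 inference behind whatever serving stack is in use.

```python
import json

# Hypothetical tool registry the agent is allowed to call.
TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

def fake_model(messages):
    """Stand-in for the model: emits one tool call, then a final answer."""
    tool_msgs = [m for m in messages if m["role"] == "tool"]
    if not tool_msgs:
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"answer": f"The result is {json.loads(tool_msgs[-1]['content'])}"}

def run_agent(user_prompt, model=fake_model, max_steps=5):
    """Loop: the model proposes tool calls until it returns a final answer."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_steps):
        step = model(messages)
        if "answer" in step:
            return step["answer"]
        result = TOOLS[step["tool"]](**step["args"])  # execute the tool
        messages.append({"role": "tool", "content": json.dumps(result)})
    raise RuntimeError("agent did not produce an answer")

print(run_agent("What is 2 + 3?"))  # The result is 5
```

What "agentic without fine-tuning" means in practice is that the base model can fill the `fake_model` role directly, emitting well-formed tool calls from its training rather than from a task-specific adapter.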
The edge models add another dimension with native audio input processing. A smartphone running Gemma 4 E2B can listen to spoken commands, analyze images from the camera, and execute multi-step actions entirely offline, with no network round-trip latency. This opens up use cases in industrial automation, field operations, and accessibility tools where cloud connectivity is unreliable.
All models are natively trained on over 140 languages, making Gemma 4 one of the most linguistically diverse open model families available.
Ecosystem and Framework Support
Gemma 4 launches with day-one support across a comprehensive list of frameworks and platforms. Developers can use the models through Hugging Face (Transformers, TRL, Transformers.js, Candle), LiteRT-LM, vLLM, llama.cpp, MLX, Ollama, NVIDIA NIM and NeMo, LM Studio, Unsloth, SGLang, Cactus, Baseten, Docker, MaxText, Tunix, and Keras.
This broad compatibility is intentional. Since the first Gemma generation, developers have downloaded Gemma models over 400 million times and created more than 100,000 variants in what Google calls the Gemmaverse. The Apache 2.0 license removes previous restrictions on commercial use, derivative works, and redistribution, which should further accelerate community adoption.
Google has also launched the Kaggle Gemma 4 Good Hackathon, inviting developers to apply the models to challenges in health and sciences, global resilience, education, and digital equity.
Competitive Landscape
Gemma 4 enters a crowded field. Meta's Llama 4 Scout, released in 2025, introduced MoE architecture to the Llama family. DeepSeek V3 and Qwen 3.5 have established strong positions among developers building with open models. Mistral's models remain popular for European deployments.
Gemma 4's advantages are clear. The Apache 2.0 license is more permissive than Meta's community license, the edge models fill a niche that few competitors address with multimodal capabilities, and the MoE architecture provides excellent efficiency. The benchmark performance at the 31B scale is competitive with models several times its size.
The main question is whether Google can convert benchmark performance into developer adoption. Llama's ecosystem benefits and DeepSeek's cost advantages are difficult to displace, even with superior technical specifications.
Conclusion
Gemma 4 represents Google's most serious open-source AI offering to date. The combination of four model sizes, Apache 2.0 licensing, multimodal edge capabilities, and strong benchmark performance makes it a compelling choice for developers building everything from smartphone applications to enterprise agentic systems. Whether it can challenge Llama's ecosystem dominance remains to be seen, but Gemma 4 has closed the gap significantly. The models are available now on Hugging Face, Kaggle, and Google AI Studio.
Pros
- Apache 2.0 license provides broad commercial freedom, with only attribution and notice requirements
- Four model sizes cover every deployment scenario from smartphones to GPU workstations
- MoE architecture delivers near-frontier performance with a fraction of the compute cost
- Native multimodal capabilities (text, image, audio) on edge models without cloud dependency
- Broad framework support enables immediate integration with existing developer toolchains
Cons
- Largest model is 31B parameters, which may lag behind 70B+ models on the most demanding reasoning tasks
- Edge models accept audio as input but do not generate audio output, limiting speech-to-speech applications
- MoE architecture requires specialized inference infrastructure that not all deployment environments support
- Competing with Llama's massive ecosystem and community momentum remains an uphill challenge
Key Features
1. Four model sizes (E2B 2.3B, E4B 4.5B, 26B MoE, 31B dense) spanning edge devices to workstations with 128K-256K context windows
2. Apache 2.0 license replaces the restrictive Gemma Use Policy, enabling unrestricted commercial use and redistribution
3. MoE architecture in the 26B model uses only 3.8B active parameters while ranking 6th on Arena AI
4. Native multimodal capabilities including text, image, and audio input on edge models for offline deployment
5. Agentic workflow support with multi-step planning, tool use, and autonomous action without fine-tuning
Key Insights
- The shift to Apache 2.0 removes the biggest adoption barrier that limited earlier Gemma versions in commercial settings
- The 26B MoE model achieving Arena AI rank 6 with only 3.8B active parameters demonstrates exceptional efficiency-per-parameter
- Native audio input on edge models enables entirely new offline AI applications in industrial, medical, and accessibility domains
- 400 million cumulative Gemma downloads and 100,000 community variants indicate a mature developer ecosystem ready for the upgrade
- Day-one support across 17 frameworks including llama.cpp and Ollama ensures immediate compatibility with existing local AI workflows
- The Codeforces Elo rating of 2,150 on the 31B model puts its coding ability in range of models with 10x the parameter count
- Multi-language training on 140+ languages positions Gemma 4 as the most linguistically diverse open model family available