Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
GLM-V is an open-source multimodal vision-language model family from Z.ai (the team behind the GLM series) covering GLM-4.6V, GLM-4.5V, and GLM-4.1V-Thinking. These models achieve versatile multimodal reasoning via scalable reinforcement learning, supporting text, images, video, documents, GUI agents, and 3D spatial tasks. GLM-4.6V features a 128K context window and native multimodal tool use, accepting images directly as tool parameters.
hacksider
Real-time AI face swap and one-click video deepfakes using only a single source image.
harry0703
AI-powered short video generator that automates scripting, footage sourcing, subtitles, and composition — supporting 10+ LLM providers and batch production.
microsoft
Microsoft's official 1-bit LLM inference framework achieving 1.37x-6.17x speedup and up to 82% energy reduction, enabling 100B parameter models to run on consumer CPUs.
bytedance
ByteDance's open-source multimodal AI agent stack for GUI automation with vision-language model integration.