LLaVA-OneVision-1.5 - Open Source | Evermx

LLaVA-OneVision-1.5

EvolvingLMMs-Lab | Apache-2.0

Multimodal | 776 Stars | 62 Forks | 133 views

LLaVA-OneVision-1.5 is a fully open-source framework from EvolvingLMMs Lab that aims to make multimodal model training widely accessible. It operates on native-resolution images and uses a three-stage training pipeline with an 85M concept-balanced pretraining dataset and a 22M instruction dataset. The complete framework trains state-of-the-art 4B and 8B multimodal models within a $16,000 compute budget, and the resulting models surpass Qwen2.5-VL on multiple benchmarks. A reinforcement learning extension was also released in late 2025.

Key Features

Native-resolution image processing without downsampling
Three-stage training pipeline with fully open datasets and configs
State-of-the-art performance within a $16K compute budget
Reinforcement learning extension for enhanced visual reasoning
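
To make the native-resolution feature above more concrete, here is a minimal sketch of how the number of visual patches can grow with image size when an image is not downsampled to a fixed square. It is an illustration only, not the project's actual preprocessing: the function names, the patch size of 14, and the 336-pixel fixed side are all assumptions chosen for the example.

```python
import math


def native_patch_grid(width, height, patch_size=14):
    """Return (cols, rows, patches) for an image kept at its own resolution.

    patch_size=14 is an assumed value for illustration, not taken from
    the LLaVA-OneVision-1.5 configuration.
    """
    if width <= 0 or height <= 0 or patch_size <= 0:
        raise ValueError("width, height and patch_size must be positive")
    # Round up so that partial patches at the right/bottom edge are counted.
    cols = math.ceil(width / patch_size)
    rows = math.ceil(height / patch_size)
    return cols, rows, cols * rows


def fixed_patch_grid(side=336, patch_size=14):
    """Return (cols, rows, patches) when every image is resized to side x side.

    side=336 is likewise an assumed value used only for comparison.
    """
    cols = side // patch_size
    return cols, cols, cols * cols


if __name__ == "__main__":
    print(native_patch_grid(1280, 720))  # patch count depends on the image
    print(fixed_patch_grid())            # patch count is the same for every image
```

Under these assumptions, a larger or wider image yields more patches in the native-resolution case, while the fixed-size case produces the same count regardless of the input.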

Related Projects

Deep-Live-Cam

hacksider

Trending | Multimodal | 80.3K Stars | 11.7K Forks

Real-time AI face swap and one-click video deepfake with only a single image

MoneyPrinterTurbo

harry0703

AI-powered short video generator that automates scripting, footage sourcing, subtitles, and composition — supporting 10+ LLM providers and batch production.

BitNet

microsoft

Microsoft's official 1-bit LLM inference framework achieving 1.37x-6.17x speedup and up to 82% energy reduction, enabling 100B parameter models to run on consumer CPUs.

MIT | 245 views