Open Generative AI

Anil-matcha | MIT

Vision | 15.2K Stars | 2.6K Forks | 103 views

Open Generative AI is an open-source desktop and web studio for AI image, video, and lip-sync generation across more than 200 models from a single interface. Maintained by Anil-matcha, the project has crossed 15,200 GitHub stars and is released under the MIT license, which makes it free to self-host and modify. It targets users who want an unbranded, unfiltered, configurable alternative to commercial generative video and image platforms.

## What It Provides

The studio is organized into five tools. Image Studio handles text-to-image generation across 50+ models and image-to-image edits across 55+ models, with support for up to 14 reference images at once. Video Studio covers text-to-video (40+ models) and image-to-video (60+ models). Lip Sync Studio drives portrait animation and video lip-sync from audio across nine models. Cinema Studio exposes professional camera controls (lens, focal length, aperture) for generation runs that need consistent cinematographic framing. Workflow Studio is a node-based pipeline builder that lets users chain multi-step operations, for example generating an image, animating it to a clip, then attaching lip-sync audio.

Supported model families include Flux, Nano Banana 2, Seedream, Ideogram, Kontext, GPT-4o Edit, Kling, Sora, Veo, Wan, Seedance, Hailuo, Runway, Hunyuan, Infinite Talk, LTX Lipsync, LatentSync, Sync, and others. The model list updates as the underlying provider catalog expands.

## Architecture

The codebase is a Next.js 14 monorepo built around React 18 and Tailwind CSS, with a shared `packages/studio` component library that backs both the desktop and web versions. Generation requests use a two-step pattern: the client submits a job to a model endpoint, then polls for the result. Reference image uploads go through a standard multipart/form-data endpoint. For cloud-hosted models the app uses Muapi.ai as its API gateway.
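
The two-step submit-then-poll pattern can be sketched roughly as follows. This is a minimal illustration, not the project's actual client: the function name `run_job`, the status strings `"processing"`, `"completed"`, and `"failed"`, the response fields, and the `FakeGateway` stand-in are all assumptions made for the example. A real integration would replace the two callables with HTTP calls to whatever endpoints the gateway documents.

```python
import time
from typing import Any, Callable, Dict


def run_job(
    submit: Callable[[Dict[str, Any]], str],
    fetch: Callable[[str], Dict[str, Any]],
    payload: Dict[str, Any],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Submit a job, then poll until it completes, fails, or times out."""
    job_id = submit(payload)  # step 1: submit the job and keep its id
    deadline = time.monotonic() + timeout
    while True:  # step 2: poll for the result
        result = fetch(job_id)
        status = result.get("status")  # hypothetical field name
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(f"job {job_id} failed: {result.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout}s")
        sleep(interval)


class FakeGateway:
    """In-memory stand-in for a real endpoint; completes on the third poll."""

    def __init__(self) -> None:
        self.polls = 0

    def submit(self, payload: Dict[str, Any]) -> str:
        return "job-1"

    def fetch(self, job_id: str) -> Dict[str, Any]:
        self.polls += 1
        if self.polls < 3:
            return {"status": "processing"}
        return {"status": "completed", "output": "result.png"}


if __name__ == "__main__":
    gateway = FakeGateway()
    result = run_job(
        gateway.submit,
        gateway.fetch,
        {"prompt": "a red fox"},
        sleep=lambda seconds: None,  # skip real waiting in the demo
    )
    print(result["output"], gateway.polls)  # prints: result.png 3
```

Passing `submit` and `fetch` in as callables keeps the polling loop independent of any particular HTTP library, which also makes it easy to exercise without network access.
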
Users supply their own Muapi API key, which the application stores in browser localStorage and transmits only to Muapi. This keeps the gateway integration simple while preserving the project's open-source positioning, because the Muapi key is the only piece of vendor coupling and users can swap in a different key at any time.

## Local Inference

The desktop build adds two local inference engines for users who want to run models without a cloud API.

sd.cpp is bundled with the desktop app and runs Stable Diffusion-class models locally. It supports Metal GPU acceleration on Apple Silicon and CUDA, Vulkan, and ROCm on Windows and Linux, and it falls back to CPU when no GPU is available. The bundled model list includes Z-Image Turbo/Base, Dreamshaper 8, Realistic Vision, Anything v5, and SDXL. Roughly 16 GB of RAM is recommended for Z-Image; older SD 1.5 models run comfortably on 8 GB.

Wan2GP runs as a user-hosted Gradio server and handles the larger, GPU-only models such as Flux.1 Dev, Qwen Image, Wan 2.2, Hunyuan Video, and LTX Video. The split design is useful for macOS users, who can keep the Open Generative AI UI on their Mac and point it at a Linux or Windows GPU box running Wan2GP for the heavy inference.

## Deployment Options

There are three deployment paths. The hosted version at muapi.ai/open-generative-ai requires no installation. The desktop app ships one-click installers for Intel and Apple Silicon Macs, Windows, and Linux, and is the recommended path for users who want local inference. For developers who want to extend the studio there is a source install path: clone with `git clone --recurse-submodules`, run `npm run setup`, and launch with `npm run electron:dev` or `npm run dev`. Node.js 18+ is the only prerequisite for the source path.

## Notable Design Choices

The project explicitly markets itself as having "no content filters or prompt rejections" and being "uncensored."
This is a real differentiator from hosted commercial platforms, and it also places a real responsibility on the user to apply their own policy. The MIT license and full source availability mean that policy decisions belong to the operator, not the upstream project.

The Workflow Studio is one of the more interesting components. By chaining image generation, image-to-video animation, and lip-sync into a single named workflow, creators can produce repeatable pipelines that compress what would normally be three or four separate tool sessions into one. Combined with the Muapi gateway, that workflow can pull from whichever models give the best result for each stage.

## Use Cases

Content creators use the studio for social media assets, music videos, and short-form promotional content where they want to compare outputs across multiple model families before settling on one. Developers integrate the Generative-Media-Skills companion library to automate media pipelines from their own code. Video editors use the lip-sync tools to localize talking-head content. Teams with GPU infrastructure benefit from the local inference engines, which keep costs predictable and data on-premises. Researchers use the project as a unified harness for evaluating open-source generative model releases as they come out.

## Limitations

Local inference is meaningfully constrained by VRAM. Flux and Qwen-class models through Wan2GP need dedicated discrete-GPU memory; macOS users will need a separate GPU machine to run them at usable speed. The Muapi gateway dependency, while well-documented, is still a single coupling point for cloud-hosted models, and operators who want to swap to another gateway will need to adapt the API client. The web version does not support local inference at all, so users who want offline-only operation must use the desktop build. The project's "no filters" stance puts policy enforcement responsibility entirely on the operator, which is an issue for some deployment contexts.
Finally, the rapid pace of underlying model releases means the supported-model list can drift relative to the README, and users should expect to track upstream changes more actively than with closed platforms.

## Who Should Use Open Generative AI

The studio is a strong fit for creators who use multiple generative model families and want a unified interface, for developers building media pipelines that benefit from a node-based workflow runner, and for teams with GPU hardware that want to keep generation local. It is less suitable for users who explicitly want content moderation handled upstream, or for operators who prefer not to take on the policy-and-safety responsibilities that come with an unfiltered generation stack.
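
For readers evaluating the node-based workflow runner, the chaining idea from the Workflow Studio description (generate an image, animate it to a clip, attach lip-sync audio) can be pictured with a small sketch. Everything here is hypothetical: the `Stage` type, `run_workflow`, and the three placeholder stage functions are invented for illustration and only build strings rather than calling any model. The sketch also simplifies a node graph down to a linear chain.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Named outputs passed between stages, e.g. {"image": "...", "clip": "..."}.
Artifacts = Dict[str, str]


@dataclass
class Stage:
    """One step in a workflow: reads earlier artifacts, returns new ones."""
    name: str
    run: Callable[[Artifacts], Artifacts]


def run_workflow(stages: List[Stage], initial: Artifacts) -> Artifacts:
    """Run stages in order, merging each stage's outputs into the shared dict."""
    artifacts = dict(initial)  # copy so the caller's dict is not mutated
    for stage in stages:
        artifacts.update(stage.run(artifacts))
    return artifacts


# Placeholder stages: a real workflow would call a model at each step.
def generate_image(a: Artifacts) -> Artifacts:
    return {"image": f"image({a['prompt']})"}


def animate(a: Artifacts) -> Artifacts:
    return {"clip": f"clip({a['image']})"}


def add_lipsync(a: Artifacts) -> Artifacts:
    return {"final": f"lipsync({a['clip']}, {a['audio']})"}


talking_head = [
    Stage("generate image", generate_image),
    Stage("animate", animate),
    Stage("lip-sync", add_lipsync),
]

if __name__ == "__main__":
    out = run_workflow(talking_head, {"prompt": "portrait", "audio": "voice.wav"})
    print(out["final"])  # prints: lipsync(clip(image(portrait)), voice.wav)
```

Because each stage only depends on named artifacts, swapping the model behind one stage leaves the rest of the chain unchanged, which is the property the article attributes to saved workflows.
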

Key Features

Unified studio for image, video, and lip-sync generation across 200+ models
Image Studio: 50+ text-to-image and 55+ image-to-image models with up to 14 references
Video Studio: 40+ text-to-video and 60+ image-to-video models
Cinema Studio with professional camera controls (lens, focal length, aperture)
Node-based Workflow Studio for chaining multi-step generation pipelines
Local inference via bundled sd.cpp (Metal/CUDA/Vulkan/ROCm/CPU)
GPU offload to a user-hosted Wan2GP Gradio server for Flux/Qwen/Wan/Hunyuan/LTX
Cross-platform desktop installers for macOS, Windows, and Linux

Related Projects

ComfyUI (Comfy-Org, GPL-3.0)
PaddleOCR
Ultralytics YOLO
Roboflow Supervision