Open Source

Meta's MIT-licensed deep-learning library for audio processing and generation. Bundles the state-of-the-art EnCodec neural audio codec with MusicGen, a controllable music generation model supporting both textual and melodic conditioning — a foundational toolkit for open-source audio AI.

SongGeneration (LeVo)

tencent-ailab

Tencent AI Lab's open-source song generation system (LeVo) built on a hybrid LM-plus-diffusion architecture and multi-preference alignment. Generates full songs of up to 4 minutes 30 seconds from structured lyrics and style tags, with dual-track vocal/accompaniment modeling and multilingual support.

ACE-Step 1.5

ace-step

MIT-licensed open-source music generation foundation model that pairs a Qwen3-based LM planner with a Diffusion Transformer decoder. Runs text-to-music, cover generation, repainting, and track separation locally across NVIDIA, AMD, Intel, and Apple hardware, generating a full song in under two seconds on an A100.

RF-DETR

roboflow

Apache-2.0 real-time transformer detector from Roboflow, built on a DINOv2 backbone and presented at ICLR 2026. Delivers SOTA accuracy/latency trade-offs on COCO and RF100-VL, handles object detection and instance segmentation in one API, is designed for fine-tuning, and installs with a single pip command — a license-friendly alternative to AGPL YOLO models.

Jan

janhq

Apache-2.0 desktop app from Menlo Research that runs LLMs like Llama, Gemma, Qwen, and gpt-oss fully offline as a privacy-first ChatGPT alternative, with optional cloud routing to OpenAI, Anthropic, Mistral, and Groq, custom assistants, MCP agentic support, and an OpenAI-compatible local server at localhost:1337.

Marker

datalab-to

GPL-3.0 open-source engine from Datalab that converts PDF, image, PPTX, DOCX, XLSX, HTML, and EPUB documents into clean Markdown, JSON, chunks, and HTML — with structure-aware parsing of tables, math, and forms, optional LLM-assisted accuracy, JSON-schema structured extraction, and ~25 pages/sec batch throughput on an H100.

MLX-VLM

Blaizzy

MIT-licensed package for running and fine-tuning Vision Language Models on Apple Silicon using MLX — supports Qwen2-VL, Gemma 4, Phi-4, MiniCPM, DeepSeek-OCR, Pixtral, LLaVA and more, with CLI, Python, Gradio, and FastAPI interfaces, speculative decoding (DFlash, EAGLE-3, MTP) for 2-4x speedup, KV cache quantization, LoRA/QLoRA fine-tuning, and distributed inference across multiple Macs.

LiteLLM

BerriAI

MIT-licensed Python SDK and self-hosted AI Gateway that exposes 100+ LLM providers — OpenAI, Anthropic, Gemini, Bedrock, Azure, VertexAI, vLLM, NVIDIA NIM, and more — through an OpenAI-compatible interface, with virtual keys, spend tracking, load balancing, fallbacks, guardrails, and observability callbacks for Lunary, MLflow, and Langfuse.

vLLM-Omni

vllm-project

Apache-2.0 omni-modality inference engine from the vllm-project organization — extends vLLM's PagedAttention and continuous batching to text, image, video, audio, TTS, and diffusion workloads with full disaggregation, heterogeneous pipelines, and CUDA/ROCm/MUSA/NPU/XPU backends.

Anthropic Knowledge Work Plugins

anthropics

Anthropic's open-source library of 11 Claude Cowork plugins for sales, marketing, finance, legal, support, product, data, bio-research, and enterprise search. All are written in Markdown and JSON with MCP connectors, so no code is required to install or customize them.

OpenMed

maziyarpanahi

Apache-2.0 on-device healthcare AI platform with 1,000+ specialized medical models, HIPAA-grade PII de-identification across 12 languages and all 18 Safe Harbor identifiers, 24–33x MLX speedup on Apple Silicon, and a native Swift framework (OpenMedKit) for iOS/iPadOS/macOS clinical apps.

whichllm

Andyyyy64

MIT-licensed CLI that recommends the best local LLM for your hardware, ranked by real, recency-aware benchmarks from LiveBench, Aider, Chatbot Arena, and others — with confidence-tagged scores, MoE-aware speed estimates, and support for NVIDIA, AMD, Apple Silicon, and CPU-only systems.
