Chatterbox - Open Source | Evermx

Chatterbox

resemble-ai | MIT

TTS | 25.3K Stars | 3.4K Forks

Chatterbox is a family of state-of-the-art, open-source text-to-speech models from Resemble AI. Released under a permissive MIT license and sitting at roughly 25,000 GitHub stars, it has become one of the most widely adopted open TTS stacks, offering high-quality voice cloning and expressive speech generation without the recurring cost or data-sharing of a hosted API.

## Multilingual Voice Cloning at 0.5B

The flagship model, Chatterbox Multilingual V3, is a general-purpose multilingual TTS model that keeps a compact 0.5B parameter size while improving speaker similarity and reducing hallucinations. It is designed for broad language coverage with more consistent voice identity and accent preservation, making cross-language voice cloning noticeably more stable than earlier releases. For teams that need tighter quality control on specific languages, Resemble also ships a Single Language Pack of dedicated finetunes where regional-dialect performance matters most.

## Chatterbox-Turbo for Low-Latency Agents

Alongside the multilingual model, Chatterbox-Turbo targets real-time English voice agents. Built on a streamlined 350M parameter architecture, Turbo delivers high-quality speech using less compute and VRAM than the larger models. A key optimization is the distilled speech-token-to-mel decoder: what was previously a ten-step bottleneck is reduced to a single step while retaining high-fidelity audio, which is what makes sub-second generation practical on modest hardware.

## Paralinguistic Tags for Expressive Speech

Turbo makes paralinguistic tags native to the model, letting users insert cues like [cough], [laugh], and [chuckle] directly into text to add realism and emotion. While the feature was built primarily for conversational voice agents, it also benefits narration and creative workflows where flat, monotone synthesis breaks immersion. Combined with zero-shot voice cloning, this gives creators fine-grained control over how a generated voice actually performs a line.

## Practical Deployment

Chatterbox is distributed with model weights on Hugging Face and a public demo Space, so evaluation does not require a local build. Because the models are relatively small and the license is permissive MIT, they can be embedded in commercial products, self-hosted for privacy, or fine-tuned for a specific voice. Resemble AI positions its paid service as the scale-up path for production deployments that need guaranteed ultra-low latency, but the open models are fully usable on their own.

## Considerations

As with any capable voice-cloning system, Chatterbox raises legitimate concerns about consent and misuse, and responsible deployment requires attention to how cloned voices are sourced and used. Running the larger multilingual model well still benefits from a GPU, so the lightest experience comes from Turbo rather than V3. There is also a natural pull toward Resemble's hosted service for the lowest-latency production scenarios. For developers who want a strong, permissively licensed open TTS foundation with genuine voice-cloning quality, though, Chatterbox is one of the most compelling options available today.
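
For a rough sense of scale behind the GPU point, the memory taken by a model's weights is approximately its parameter count times the bytes stored per parameter. The sketch below applies that rule of thumb to the 0.5B and 350M figures quoted in this article. The function name and the choice of 32-bit versus 16-bit precision are illustrative assumptions, not details from the Chatterbox release, and real VRAM use also covers activations, caches, and the rest of the pipeline, so these numbers are only a floor for the weights themselves.

```python
# Back-of-the-envelope weight memory: parameters x bytes per parameter.
# Parameter counts come from the article; the precisions are assumptions.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Approximate size of the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


MODELS = {
    "Chatterbox Multilingual V3": 0.5e9,  # "0.5B" in the article
    "Chatterbox-Turbo": 350e6,            # "350M" in the article
}

if __name__ == "__main__":
    for name, params in MODELS.items():
        for precision in BYTES_PER_PARAM:
            size = weight_memory_gb(params, precision)
            print(f"{name} @ {precision}: ~{size:.1f} GB of weights")
```

Under these assumptions both models' weights fit comfortably on consumer GPUs, which is consistent with the article's point that the practical gap between Turbo and V3 is about latency and headroom rather than whether the model loads at all.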

Key Features

Chatterbox Multilingual V3: 0.5B general-purpose multilingual TTS with voice cloning
Single Language Pack of dedicated finetunes for priority languages
Chatterbox-Turbo: 350M architecture for low-latency English voice agents
Distilled decoder reduces mel generation from 10 steps to 1 for fast synthesis
Native paralinguistic tags ([laugh], [cough], [chuckle]) for expressive speech
Permissive MIT license with weights and demo Space on Hugging Face
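
The paralinguistic tags in the feature list are written inline in the input text, so a typo in a tag is easy to miss. A minimal sketch of a pre-flight check follows: it scans a line for bracketed cues and reports any that are not in an allow-list. The allow-list holds only the three tags named in this article, and `find_tags` and `unknown_tags` are hypothetical helpers, not part of the Chatterbox package. The model's real tag set may be larger, so `KNOWN_TAGS` should be extended from the official documentation before relying on it.

```python
import re

# Assumption: only the three tags named in the article are listed here.
KNOWN_TAGS = {"cough", "laugh", "chuckle"}

# Matches lowercase bracketed cues such as "[laugh]"; capitalised or
# numeric content inside brackets is deliberately not treated as a tag.
TAG_PATTERN = re.compile(r"\[([a-z_ ]+)\]")


def find_tags(text: str) -> list[str]:
    """Return every bracketed cue in the text, in order of appearance."""
    return TAG_PATTERN.findall(text)


def unknown_tags(text: str) -> list[str]:
    """Return the cues that are not in the KNOWN_TAGS allow-list."""
    return [tag for tag in find_tags(text) if tag not in KNOWN_TAGS]


if __name__ == "__main__":
    line = "Well [chuckle] that went better than expected [sigh]"
    print("tags found:", find_tags(line))
    print("not in allow-list:", unknown_tags(line))
```

Running a check like this before synthesis keeps unsupported cues from being passed to the model as if they were ordinary text.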

Related Projects

GPT-SoVITS (RVC-Boss, TTS, MIT license): 58.9K stars, 6.4K forks on GitHub
VibeVoice
ChatTTS
VoxCPM2