Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
ChatTTS is an open-source text-to-speech model built specifically for dialogue, the kind of speech an LLM assistant needs to sound natural in conversation. Since its release it has become one of the most starred TTS projects on GitHub, with nearly 40,000 stars, thanks to expressive, conversational output and fine-grained control over how lines are spoken. The repository provides the core algorithm and simple examples, while community projects extend it into end-user applications.

## Built for Dialogue

Most TTS systems are optimized for reading a single block of text aloud. ChatTTS is instead tuned for dialogue scenarios, producing speech that feels interactive and supports multiple speakers within a conversation. This focus makes it a natural fit for voice assistants, chatbots, and any pipeline where a language model's text responses need to be spoken back to a user.

## Fine-Grained Prosody Control

A defining feature is control over prosody at a fine level. The model can predict and insert conversational cues such as laughter, pauses, and interjections, which are exactly the elements that make synthetic speech sound human rather than robotic. The authors report that its prosody surpasses most open-source TTS models, and they release pretrained weights to support further research and development.

## Data and Models

The main model is trained on more than 100,000 hours of Chinese and English audio. The open-source release on Hugging Face is a 40,000-hour pretrained base model without supervised fine-tuning (SFT), intended for academic and research use. Practical features include streaming audio generation for lower perceived latency and an open DVAE encoder with zero-shot inference code. English and Chinese are supported today, with additional languages on the roadmap alongside planned multi-emotion control.

## Practical Use

ChatTTS is distributed as a PyPI package with Colab notebooks, making it straightforward to generate speech from a few lines of Python.
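
As a rough illustration of what "a few lines of Python" can look like, here is a minimal sketch. It assumes the `ChatTTS.Chat()`, `load()`, and `infer()` calls shown in the project's README, that `infer()` returns one NumPy array of float samples per input text, and that the output sample rate is 24 kHz. The helper names `write_wav` and `synthesize` are hypothetical and not part of ChatTTS. Check the installed version's documentation before relying on any of these details.

```python
import struct
import wave

SAMPLE_RATE = 24_000  # assumption: the project's examples save audio at 24 kHz


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] as 16-bit PCM (stdlib only).

    Returns the number of frames written.
    """
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(frames) // 2


def synthesize(texts, prefix="chattts_out"):
    """Hypothetical wrapper: synthesize each text and save one WAV per text."""
    # Imported lazily so the helper above works without ChatTTS installed.
    import ChatTTS

    chat = ChatTTS.Chat()
    chat.load(compile=False)  # assumption: downloads/loads pretrained weights
    wavs = chat.infer(texts)  # assumption: one NumPy array per input text

    paths = []
    for i, wav in enumerate(wavs):
        path = f"{prefix}_{i}.wav"
        write_wav(path, wav.flatten().tolist())
        paths.append(path)
    return paths
```

Calling `synthesize(["Hello, how can I help you today?"])` would then be expected to produce `chattts_out_0.wav`, provided the package and its weights are available locally. Writing the file with the standard-library `wave` module is a choice made for this sketch to keep dependencies small; the project's own examples use other audio libraries.
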
For full applications, the community maintains the Awesome-ChatTTS index that points to downstream tools and integrations, so most users do not need to build a product layer from scratch.

## Considerations

Licensing is split: the code is published under AGPLv3+, while the model weights are released under CC BY-NC 4.0, a non-commercial license, so commercial deployment of the released model is restricted and should be reviewed carefully. The team also notes the model is for academic purposes and has taken deliberate steps to limit misuse (adding high-frequency noise during training and compressing audio quality), with a detection model planned. Buyers of fully commercial, high-fidelity TTS will need a different path, but for researchers and developers building conversational voice experiences, ChatTTS is among the most expressive open options available.