Index-TTS

index-tts · Other

TTS · 19.6K Stars · 2.4K Forks · 188 views

Index-TTS is an industrial-level, controllable zero-shot text-to-speech system that adds precise duration control to an autoregressive TTS architecture while preserving the natural prosody of the voice prompt. It disentangles speaker identity from emotional expression, so timbre and emotion can be controlled independently through an audio reference, an emotion vector, or a free-text description. It supports Chinese and English, with optional FP16 and DeepSpeed acceleration.

Key Features

First autoregressive TTS model with precise synthesis duration control in both explicit and free-form modes
Disentangled speaker identity and emotional expression for independent timbre and emotion control
Three emotion control modes: audio reference, 8-float emotional vector, and text description
Zero-shot voice cloning with faithful target timbre reproduction from short audio prompts
FP16 inference, DeepSpeed acceleration, and compiled CUDA kernels for production efficiency
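
The three emotion control modes listed above can be pictured as three mutually exclusive inputs. The sketch below is illustrative only: `EmotionControl`, its field names, and the rule that exactly one mode is set at a time are assumptions made for this example, not the Index-TTS API. The only detail taken from the listing is that the vector mode uses 8 floats.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# From the feature list: the vector mode is an "8-float emotional vector".
EMOTION_VECTOR_SIZE = 8


@dataclass
class EmotionControl:
    """Hypothetical container for one of the three emotion control modes."""

    reference_audio: Optional[str] = None     # path to an emotion reference clip
    vector: Optional[Sequence[float]] = None  # 8 floats
    description: Optional[str] = None         # free-text description

    def mode(self) -> str:
        """Return 'audio', 'vector', or 'text'; raise if the input is ambiguous."""
        candidates = (
            ("audio", self.reference_audio),
            ("vector", self.vector),
            ("text", self.description),
        )
        chosen = [name for name, value in candidates if value is not None]
        # Assumption for this sketch: exactly one mode may be active.
        if len(chosen) != 1:
            raise ValueError(f"set exactly one emotion control, got {chosen}")
        if chosen[0] == "vector" and len(self.vector) != EMOTION_VECTOR_SIZE:
            raise ValueError(
                f"emotion vector needs {EMOTION_VECTOR_SIZE} floats, "
                f"got {len(self.vector)}"
            )
        return chosen[0]


if __name__ == "__main__":
    print(EmotionControl(vector=[0.0] * 8).mode())
    print(EmotionControl(description="calm and warm").mode())
```

Rejecting ambiguous input up front keeps the sketch simple; a real pipeline might instead define a precedence order or blend the modes.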

Related Projects

GPT-SoVITS

VibeVoice

VoxCPM2

Chatterbox