Text-to-Speech

Open-source text-to-speech models and voice synthesis engines

Text-to-Speech — comparison of fish-speech, CosyVoice, VoxCPM, sherpa-onnx
fish-speech: SOTA open-source TTS.
CosyVoice: Multilingual large voice generation model with full-stack support for inference, training, and deployment.
VoxCPM: VoxCPM2, a tokenizer-free TTS for multilingual speech generation, creative voice design, and true-to-life cloning.
sherpa-onnx: Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, with no Internet connection required. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, and a WebSocket server/client, with support for 12 programming languages.
Popularity (columns: fish-speech | CosyVoice | VoxCPM | sherpa-onnx)
Stars: 29,923 | 20,733 | 15,641 | 11,815
Global Rank: #1049 | #1931 | #2854 | #4184
Weekly Activity (Apr 18 – Apr 24)
New Stars: +26 | +9 | +296 | +20
Pushes: 0 | 0 | 1 | 1
Issues Closed: 0 | 1 | 1 | 1
Community
Forks: 2,520 | 2,382 | 1,838 | 1,353
Contributors945225199
Open Issues3383674549
Project Info
Owner: fishaudio | FunAudioLLM | OpenBMB | k2-fsa
License: NOASSERTION | Apache-2.0 | Apache-2.0 | Apache-2.0
Language: Python | Python | Python | C++
Created: Oct 2023 | Jul 2024 | Sep 2025 | Sep 2022