Text-to-Speech

Open-source text-to-speech models and voice synthesis engines

Text-to-Speech — comparison of VoxCPM, fish-speech, CosyVoice, sherpa-onnx
VoxCPM: VoxCPM2, a tokenizer-free TTS for multilingual speech generation, creative voice design, and true-to-life cloning.
fish-speech: SOTA open-source TTS.
CosyVoice: Multilingual large voice generation model with full-stack support for inference, training, and deployment.
sherpa-onnx: Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, running without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, WebSocket server/client, and 12 programming languages.
Popularity
Project       VoxCPM       fish-speech  CosyVoice    sherpa-onnx
Stars         31,366       30,880       21,761       13,089
Global Rank   #1029        #1062        #1899        #3776
Weekly Activity (Jun 16 – Jun 22)
Project        VoxCPM       fish-speech  CosyVoice    sherpa-onnx
New Stars      +30          +2           +6           +5
Pushes         0            0            0            0
Issues Closed  0            1            0            0
Community
Project       VoxCPM       fish-speech  CosyVoice    sherpa-onnx
Forks         3,540        2,636        2,508        1,500
Contributors279752210
Open Issues12414783591
Project Info
Project       VoxCPM       fish-speech  CosyVoice    sherpa-onnx
Owner         OpenBMB      fishaudio    FunAudioLLM  k2-fsa
License       Apache-2.0   NOASSERTION  Apache-2.0   Apache-2.0
Language      Python       Python       Python       C++
Created       Sep 2025     Oct 2023     Jul 2024     Sep 2022