Exploding in Popularity Upon Launch! The Pinnacle of Chinese Voice AI, ChatTTS, Officially Launches Its Website
Remember ChatTTS, the Chinese voice AI we previously recommended as the pinnacle of its kind? This text-to-speech project, billed as a rival to GPT-4o, exploded in popularity upon launch, gaining 16.9K stars on GitHub in just a few days.
ChatTTS has now officially launched its website, so anyone can try it directly online.
Key Features:
- Text-to-Speech: Input text in the text box, and ChatTTS will generate corresponding speech, automatically adjusting rhythm and pauses.
- Real-Time Voice Conversation: Combined with a large language model, it enables real-time voice conversation.
- Voice Tone Adjustment: In the "Audio Seed" section, you can set the speaker's tone by entering a number, or roll the dice to generate a random one.
- Detail Control: Users can add special markers such as [laugh] and [uv_break] to the text to manually control effects such as laughter and pauses.
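As a rough illustration of the detail-control markers, the sketch below assembles an annotated input string in Python. It only prepares text; it does not call ChatTTS. The helper name `annotate` and its parameters are hypothetical, and the only markers used are the two mentioned above.

```python
# Markers taken from the article; the helper itself is a hypothetical
# convenience and is not part of ChatTTS.
LAUGH = "[laugh]"
PAUSE = "[uv_break]"


def annotate(segments, laugh_after=()):
    """Join text segments with pause markers.

    segments:    list of text fragments, in speaking order.
    laugh_after: indices of segments that should be followed by a laugh marker.
    """
    parts = []
    for index, segment in enumerate(segments):
        segment = segment.strip()
        if index in laugh_after:
            segment = f"{segment} {LAUGH}"
        parts.append(segment)
    # A pause marker is placed between consecutive segments.
    return f" {PAUSE} ".join(parts)


if __name__ == "__main__":
    print(annotate(["So I tried it", "and it actually worked"], laugh_after={1}))
```

The resulting string can be pasted into the text box like any other input.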
Outstanding Features of ChatTTS
- Multilingual Support: ChatTTS supports not only Chinese but also natural, fluent English speech. Its mixed Chinese-English output is excellent, with almost no audible trace of being AI-generated.
- Fine-Grained Control: ChatTTS allows users to control laughter, pauses between speech, and interjections, making the generated voice more natural and vivid.
- Multi-Speaker Support: ChatTTS supports multi-speaker voice synthesis and can replicate a variety of voices, including classic voices of deceased figures.
- Large-Scale Training Data: The largest ChatTTS model was trained on over 100,000 hours of Chinese and English data. The version open-sourced on HuggingFace was trained on 40,000 hours of data and has not undergone supervised fine-tuning (SFT).
Application Scenarios of ChatTTS
ChatTTS is suitable for various scenarios requiring high-quality speech synthesis, including but not limited to:
- E-commerce Live Streaming: Provides more natural voiceovers for live streams, enhancing user experience.
- Self-Media: Assists content creators in generating lively voiceovers to attract more viewers.
- Online Education: Provides clear and natural narration for online courses to improve learning outcomes.
- Customer Service & After-Sales: Delivers more human-like voice services to enhance customer satisfaction.
Online Usage
Official Website: https://chattts.com/
Project Address: https://top.aibase.com/tool/chattts
- Text: The written content to be converted into speech.
- Refine Text: An option to automatically optimize the input text.
- Randomness: A parameter controlling output variability. Higher values make the generated speech more random, which can either improve or degrade quality.
- Voice Selection: A numeric parameter that selects the voice. The default is 2222. Options include 2222, 7869, 6653, 4099, and 5099, or any other number for a randomly selected voice.
- Custom Voice: A positive integer for customizing pitch and timbre. When set, it overrides the Voice Selection parameter.
- Prompt Settings: Used to add effects such as laughter or pauses. Example: [oral_2][laugh_0][break_6].
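To make the relationship between these settings concrete, here is a minimal Python sketch of two of the rules described above: Custom Voice overriding Voice Selection, and the format of the Prompt Settings string. The function names `resolve_voice` and `build_prompt` are hypothetical and are not part of ChatTTS or its website; the valid range of each prompt level is not stated in this article, so the sketch only rejects negative values.

```python
from typing import Optional

# Preset voice numbers listed in the article.
PRESET_VOICES = (2222, 7869, 6653, 4099, 5099)
DEFAULT_VOICE = 2222


def resolve_voice(voice: int = DEFAULT_VOICE,
                  custom_voice: Optional[int] = None) -> int:
    """Return the voice number to use (hypothetical helper).

    A Custom Voice value, when set, takes priority over Voice Selection.
    """
    if custom_voice is not None:
        if custom_voice <= 0:
            raise ValueError("custom_voice must be a positive integer")
        return custom_voice
    return voice


def build_prompt(oral: int = 2, laugh: int = 0, pause: int = 6) -> str:
    """Build a prompt string in the [oral_N][laugh_N][break_N] format."""
    for name, level in (("oral", oral), ("laugh", laugh), ("pause", pause)):
        if level < 0:
            raise ValueError(f"{name} level must not be negative")
    return f"[oral_{oral}][laugh_{laugh}][break_{pause}]"


if __name__ == "__main__":
    print(resolve_voice())                            # Voice Selection default
    print(resolve_voice(7869, custom_voice=42))       # Custom Voice wins
    print(build_prompt())                             # the article's example
```

With the default arguments, `build_prompt()` reproduces the example string from the Prompt Settings entry.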
Note: A key advantage of this model is that it is open source, so it can be trained on your own voice data.
Important: Always comply with applicable laws and ethical standards when using it.