OmniVoice Demo

State-of-the-art text-to-speech model supporting 600+ languages.

Built with OmniVoice by the Xiaomi Next-gen Kaldi team.

Text to Synthesize / 待合成文本

Reference Audio / 参考音频

Recommended: 3–10 seconds of audio.
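To check a reference clip against the recommended length before uploading it, a small local script may help. The following is a minimal sketch, not part of OmniVoice: it assumes the clip is an uncompressed PCM WAV file, uses only Python's standard `wave` module, and the function names `wav_duration_seconds` and `is_recommended_length` are hypothetical.

```python
import wave

# Recommended reference-clip length from the note above, in seconds.
MIN_SECONDS = 3.0
MAX_SECONDS = 10.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        # Duration = number of audio frames / frames per second.
        return wav_file.getnframes() / wav_file.getframerate()


def is_recommended_length(path: str) -> bool:
    """Return True if the clip falls inside the recommended 3-10 second range."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

Calling `is_recommended_length("reference.wav")` (with a file name of your own) returns `True` for a clip inside the range. Other formats such as MP3 or FLAC would need a different reader, since `wave` only handles WAV.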

Reference Text (optional) / 参考音频文本（可选）

Language (optional) / 语种 (可选)

Leave this set to Auto to detect the language automatically.

Output Audio / 合成结果

Status / 状态