SupertonicTTS

This page explains how to use sherpa-onnx with SupertonicTTS.

Hint

Support for this model in sherpa-onnx was contributed by https://github.com/Wasser1462 in PR https://github.com/k2-fsa/sherpa-onnx/pull/3605.

SupertonicTTS 3 is an offline multi-speaker, multi-language TTS model supporting 31 languages. In a typical setup, you select a speaker with --sid and a language with --lang.

You can try it online at HuggingFace Spaces.

The following table lists the supported languages and links to their documentation, download instructions, and code examples.

Language	Documentation
Arabic	supertonic-3-ar
Bulgarian	supertonic-3-bg
Croatian	supertonic-3-hr
Czech	supertonic-3-cs
Danish	supertonic-3-da
Dutch	supertonic-3-nl
English	supertonic-3-en
Estonian	supertonic-3-et
Finnish	supertonic-3-fi
French	supertonic-3-fr
German	supertonic-3-de
Greek	supertonic-3-el
Hindi	supertonic-3-hi
Hungarian	supertonic-3-hu
Indonesian	supertonic-3-id
Italian	supertonic-3-it
Japanese	supertonic-3-ja
Korean	supertonic-3-ko
Latvian	supertonic-3-lv
Lithuanian	supertonic-3-lt
Polish	supertonic-3-pl
Portuguese	supertonic-3-pt
Romanian	supertonic-3-ro
Russian	supertonic-3-ru
Slovak	supertonic-3-sk
Slovenian	supertonic-3-sl
Spanish	supertonic-3-es
Swedish	supertonic-3-sv
Turkish	supertonic-3-tr
Ukrainian	supertonic-3-uk
Vietnamese	supertonic-3-vi

Download a pre-trained model

Download the released SupertonicTTS archive from https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models:

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
tar xf sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
rm sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
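
Before running the examples, it can help to confirm that the extracted directory contains every file the command line in the next section refers to. The following is a minimal sketch, not part of sherpa-onnx: check_supertonic_model_dir is a hypothetical helper name, and the file list is taken from the flags used on this page rather than from an authoritative manifest of the archive.

```shell
# Hypothetical helper (not shipped with sherpa-onnx): print each expected
# SupertonicTTS file that is missing from the given directory.
# Returns 0 if all files are present, 1 otherwise.
check_supertonic_model_dir() {
  dir="$1"
  missing=0
  # File names assumed from the command-line example on this page.
  for f in duration_predictor.int8.onnx text_encoder.int8.onnx \
           vector_estimator.int8.onnx vocoder.int8.onnx \
           tts.json unicode_indexer.bin voice.bin; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f"
      missing=1
    fi
  done
  return "$missing"
}

# Example call (run it from the directory where you extracted the archive):
#   check_supertonic_model_dir ./sherpa-onnx-supertonic-3-tts-int8-2026-05-11
```

If the function prints nothing and returns 0, all seven files named on this page are in place.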

Run a command-line example

The following command uses the same model files as rust-api-examples/examples/supertonic_tts.rs:

./build/bin/sherpa-onnx-offline-tts \
  --supertonic-duration-predictor=./sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx \
  --supertonic-text-encoder=./sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx \
  --supertonic-vector-estimator=./sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx \
  --supertonic-vocoder=./sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx \
  --supertonic-tts-json=./sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json \
  --supertonic-unicode-indexer=./sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin \
  --supertonic-voice-style=./sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin \
  --sid=0 \
  --lang=en \
  --output-filename=./supertonic.wav \
  "Today as always, men fall into two groups: slaves and free men."

You can change --lang to any of the 31 supported language codes (e.g., ja for Japanese, ko for Korean, fr for French).
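
To try several languages without retyping the model paths, the command above can be wrapped in a small shell function. This is only a sketch under stated assumptions: synthesize, MODEL_DIR, and TTS_BIN are hypothetical names introduced here for illustration, the per-language output file name is an arbitrary choice, and the flags are exactly the ones shown in the command above.

```shell
# Assumed locations; adjust them to match your build and download directories.
MODEL_DIR=./sherpa-onnx-supertonic-3-tts-int8-2026-05-11
TTS_BIN="${TTS_BIN:-./build/bin/sherpa-onnx-offline-tts}"

# Hypothetical wrapper: synthesize TEXT in language LANG with speaker 0
# and write the result to ./supertonic-LANG.wav.
synthesize() {
  lang="$1"
  text="$2"
  "$TTS_BIN" \
    --supertonic-duration-predictor="$MODEL_DIR/duration_predictor.int8.onnx" \
    --supertonic-text-encoder="$MODEL_DIR/text_encoder.int8.onnx" \
    --supertonic-vector-estimator="$MODEL_DIR/vector_estimator.int8.onnx" \
    --supertonic-vocoder="$MODEL_DIR/vocoder.int8.onnx" \
    --supertonic-tts-json="$MODEL_DIR/tts.json" \
    --supertonic-unicode-indexer="$MODEL_DIR/unicode_indexer.bin" \
    --supertonic-voice-style="$MODEL_DIR/voice.bin" \
    --sid=0 \
    --lang="$lang" \
    --output-filename="./supertonic-$lang.wav" \
    "$text"
}

# Example calls:
#   synthesize en "Today as always, men fall into two groups: slaves and free men."
#   synthesize fr "Bonjour tout le monde."
```

Writing one output file per language code keeps earlier results from being overwritten when you compare languages.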

You can also use the helper script tracked in the repository:

rust-api-examples/run-supertonic-tts.sh

API examples

Additional example code is available in the sherpa-onnx repository, for example rust-api-examples/examples/supertonic_tts.rs.

Notes

Use --sid to choose a speaker.
Use --lang to select the synthesis language (e.g., en, ja, ko, fr, etc.).
The model files include tts.json, unicode_indexer.bin, and voice.bin in addition to the ONNX files.
