An overview of all the endpoint parameters.
Text
This is the text to be synthesized to audio.
VoiceId
- Dan: Young Male
- Will: Mature Male
- Scarlett: Young Female
- Liv: Young Female
- Amy: Mature Female
Bitrate
Defaults to 192k.
- Use lower values for low bandwidth or to reduce the transferred file size.
- Use higher values for higher fidelity.
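To make the trade-off concrete, the sketch below estimates file size from a Bitrate value. It assumes constant-bitrate encoding and that a value such as "192k" means 192 kilobits per second; the helper name `estimated_size_bytes` is hypothetical and not part of the API.

```python
def estimated_size_bytes(bitrate: str, duration_seconds: float) -> int:
    """Estimate the encoded size of constant-bitrate audio.

    Assumes a value such as "192k" means 192 kilobits per second.
    """
    if not bitrate.endswith("k"):
        raise ValueError(f"expected a value like '192k', got {bitrate!r}")
    kilobits_per_second = int(bitrate[:-1])
    bits = kilobits_per_second * 1000 * duration_seconds
    return int(bits / 8)  # 8 bits per byte


print(estimated_size_bytes("192k", 60))  # 1440000 bytes, about 1.4 MB
print(estimated_size_bytes("64k", 60))   # 480000 bytes
```

Under these assumptions, one minute of audio at the default bitrate is roughly 1.4 MB, and dropping to a third of the bitrate cuts the size to a third.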
Speed
Defaults to 0. Examples:
- 0.5: makes the audio 50% faster (i.e. 60-second audio becomes 42 seconds)
- -0.5: makes the audio 50% slower (i.e. 60-second audio becomes 90 seconds)
Pitch
Defaults to 1. However, on the landing page, we default male voices to 0.92 as people tend to prefer lower/deeper male voices.
Codec
Defaults to libmp3lame (MP3).
- Use pcm_mulaw for phone calls.
- pcm_s16le returns 22050 Hz raw audio.
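Raw pcm_s16le audio has no header, so most players cannot open it directly. The following is a minimal sketch that wraps such bytes in a WAV container using Python's standard `wave` module. The 22050 Hz rate comes from the text above; mono output is an assumption, and the function name `pcm_s16le_to_wav` is hypothetical.

```python
import io
import wave


def pcm_s16le_to_wav(pcm: bytes, sample_rate: int = 22050, channels: int = 1) -> bytes:
    """Wrap raw signed 16-bit little-endian PCM in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)     # mono is an assumption, not stated in the docs
        wav.setsampwidth(2)            # 16 bits = 2 bytes per sample
        wav.setframerate(sample_rate)  # 22050 Hz, as stated for pcm_s16le
        wav.writeframes(pcm)
    return buffer.getvalue()


# Usage: one second of silence stands in for a real response body.
wav_bytes = pcm_s16le_to_wav(b"\x00\x00" * 22050)
```

WAV stores 16-bit samples in little-endian order, so the bytes are copied through unchanged and only a header is added.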
Temperature
Defaults to 0.25.
- Lower values make the audio more deterministic and stable.
- Higher values make the audio more expressive and less deterministic.
- With a high Temperature value, the audio will be different every time, and the probability of mispronunciation increases.
TimestampType
By default, the endpoint returns per-sentence timestamps. Use word to get per-word timestamps.
The timestamp feature is currently not supported via the /stream endpoint.
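The parameters described so far can be collected into a single request body. The sketch below is illustrative only: it assumes the parameter names are used verbatim as JSON keys and that Bitrate is sent as a string such as "192k". The function `build_request_body` is hypothetical, and the endpoint URL and authentication are deliberately left out because they are not covered here.

```python
import json

# Voices listed under VoiceId; treated as the full set for this sketch.
VOICES = {"Dan", "Will", "Scarlett", "Liv", "Amy"}


def build_request_body(
    text: str,
    voice_id: str,
    bitrate: str = "192k",
    speed: float = 0.0,
    pitch: float = 1.0,
    codec: str = "libmp3lame",
    temperature: float = 0.25,
    word_timestamps: bool = False,
) -> dict:
    """Assemble a request body using the documented defaults."""
    if voice_id not in VOICES:
        raise ValueError(f"unknown VoiceId: {voice_id!r}")
    body = {
        "Text": text,
        "VoiceId": voice_id,
        "Bitrate": bitrate,
        "Speed": speed,
        "Pitch": pitch,
        "Codec": codec,
        "Temperature": temperature,
    }
    # Per-sentence timestamps are the default, so the key is only set
    # when per-word timestamps are requested.
    if word_timestamps:
        body["TimestampType"] = "word"
    return body


print(json.dumps(build_request_body("Hello there.", "Scarlett"), indent=2))
```

Leaving TimestampType out unless per-word timestamps are wanted avoids guessing the name of the default value.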
CallbackUrl
If provided, the server will POST a JSON body to the CallbackUrl. TaskStatus is either "completed" or "failed". A sample body looks like this:
{
  "TaskId": "8282b92d",
  "TaskStatus": "completed"
}
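A server receiving this callback might validate the body along the following lines. This is a sketch that relies only on the two fields shown in the sample; the function name `parse_callback` is hypothetical, and any additional fields the server may send are ignored.

```python
import json


def parse_callback(raw_body: bytes) -> tuple[str, bool]:
    """Return (task_id, succeeded) from a callback request body."""
    payload = json.loads(raw_body)
    task_id = payload["TaskId"]
    status = payload["TaskStatus"]
    if status not in ("completed", "failed"):
        raise ValueError(f"unexpected TaskStatus: {status!r}")
    return task_id, status == "completed"


task_id, succeeded = parse_callback(
    b'{"TaskId": "8282b92d", "TaskStatus": "completed"}'
)
print(task_id, succeeded)  # 8282b92d True
```

Rejecting unknown status values keeps an unexpected payload from being silently treated as a failure.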