Parameter Details

An overview of all the endpoint parameters.

Text

The text to synthesize into audio.

VoiceId

Dan: Young Male
Will: Mature Male
Scarlett: Young Female
Liv: Young Female
Amy: Mature Female

Bitrate

Defaults to 192k.

Use lower values for low bandwidth or to reduce the transferred file size.
Use higher values for higher fidelity.

Speed

Defaults to 0. Examples:

0.5: makes the audio 50% faster (e.g. 60-second audio becomes 42 seconds)
-0.5: makes the audio 50% slower (e.g. 60-second audio becomes 90 seconds)

Pitch

Defaults to 1. On the landing page, however, male voices default to 0.92, because people tend to prefer lower, deeper male voices.

Codec

Defaults to libmp3lame (MP3).

Use pcm_mulaw (8-bit μ-law) for phone calls. Use pcm_s16le for raw 16-bit little-endian PCM audio at 22050 Hz.
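
Because pcm_s16le output is raw audio without a container header, many players will not open it directly. The following is a minimal sketch, using Python's standard wave module, of wrapping such bytes in a WAV header. The 22050 Hz rate and 16-bit sample width follow from the codec description above; the mono channel count and the function name pcm_s16le_to_wav are assumptions made for illustration.

```python
import io
import wave

SAMPLE_RATE_HZ = 22050   # stated above for pcm_s16le
SAMPLE_WIDTH_BYTES = 2   # s16le means signed 16-bit little-endian samples
CHANNELS = 1             # assumption: mono output (not confirmed by this page)


def pcm_s16le_to_wav(raw: bytes) -> bytes:
    """Wrap raw pcm_s16le bytes in a WAV container and return the WAV bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        wav_file.writeframes(raw)
    # wave does not close a file object it did not open, so the buffer is still readable.
    return buffer.getvalue()
```

If the audio turns out to be stereo, only CHANNELS needs to change.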

Temperature

Defaults to 0.25.

Lower values make the audio more deterministic and stable.
Higher values make the audio more expressive and less deterministic.
With a high Temperature value, the audio will differ on every request, and the probability of mispronunciation increases.

TimestampType

By default, the endpoint returns per-sentence timestamps. Set TimestampType to word to get per-word timestamps.

The timestamp feature is currently not supported via the /stream endpoint.
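
To show how the parameters described so far fit together, here is a minimal sketch that assembles a request body with the documented defaults. The helper name build_payload is hypothetical, and the sketch assumes the endpoint accepts these parameters as fields of a JSON object. It only builds the body and does not send it, since the endpoint URL and authentication are not covered on this page.

```python
import json
from typing import Any, Dict, Optional

# Voices listed under VoiceId above.
VOICE_IDS = {"Dan", "Will", "Scarlett", "Liv", "Amy"}


def build_payload(
    text: str,
    voice_id: str,
    bitrate: str = "192k",          # documented default
    speed: float = 0,               # documented default
    pitch: float = 1,               # documented default
    codec: str = "libmp3lame",      # documented default (MP3)
    temperature: float = 0.25,      # documented default
    timestamp_type: Optional[str] = None,  # omit for per-sentence timestamps
) -> Dict[str, Any]:
    """Build a request body (hypothetical helper; field layout is an assumption)."""
    if voice_id not in VOICE_IDS:
        raise ValueError(f"Unknown VoiceId: {voice_id!r}")
    payload: Dict[str, Any] = {
        "Text": text,
        "VoiceId": voice_id,
        "Bitrate": bitrate,
        "Speed": speed,
        "Pitch": pitch,
        "Codec": codec,
        "Temperature": temperature,
    }
    if timestamp_type is not None:
        payload["TimestampType"] = timestamp_type
    return payload


if __name__ == "__main__":
    body = build_payload("Hello there.", "Scarlett", timestamp_type="word")
    print(json.dumps(body, indent=2))
```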

CallbackUrl

If provided, the server will POST a JSON body to the CallbackUrl. TaskStatus is either "completed" or "failed". A sample body:

{
  "TaskId": "8282b92d",
  "TaskStatus": "completed"
}
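
A receiver for this callback could look roughly like the sketch below, which uses only Python's standard library. It assumes the posted body is valid JSON containing the TaskId and TaskStatus fields from the sample. The port, the handler class, and the parse_callback helper are illustrative choices, not part of the API.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def parse_callback(body: bytes) -> Tuple[str, bool]:
    """Return (task_id, succeeded) from a callback JSON body."""
    data = json.loads(body)
    return data["TaskId"], data["TaskStatus"] == "completed"


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            task_id, succeeded = parse_callback(self.rfile.read(length))
        except (ValueError, KeyError, TypeError):
            # Malformed JSON or missing fields: reject the request.
            self.send_response(400)
            self.end_headers()
            return
        print(f"Task {task_id}: {'completed' if succeeded else 'failed'}")
        self.send_response(200)
        self.end_headers()


if __name__ == "__main__":
    # Assumption: port 8000 is reachable at the URL passed as CallbackUrl.
    HTTPServer(("0.0.0.0", 8000), CallbackHandler).serve_forever()
```

Keeping the parsing in a separate function makes it easy to test without starting a server.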