The Unreal Speech Node.js SDK integrates the Unreal Speech API into Node.js applications for text-to-speech (TTS) synthesis. It provides methods for generating speech, managing synthesis tasks, and streaming audio.

To use the play utility, FFmpeg must be installed on your system.

Getting Started

FFmpeg Installation

Windows
Download FFmpeg: Go to the FFmpeg official website (https://ffmpeg.org/download.html) and download the latest build for Windows.

Mac

Install Homebrew: If Homebrew is not already installed, open Terminal and run /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)".
Install FFmpeg: Run brew install ffmpeg.

Installation

npm i unrealspeech

Available endpoints

Endpoint	Description
/stream	Stream audio for short, time-sensitive cases
/speech	Generate speech with options (MP3 format)
/synthesisTasks	Manage synthesis tasks for longer text
/synthesisTasks/TaskId	Check the status of a synthesis task

Common Request Body Schema

Property	Type	Required?	Default Value	Allowed Values
VoiceId	string	Required	N/A	Scarlett, Liv, Dan, Will, Amy
Bitrate	string	Optional	192k	16k, 32k, 48k, 64k, 128k, 192k, 256k, 320k
Speed	float	Optional	0	-1.0 to 1.0
Pitch	float	Optional	1.0	0.5 to 1.5
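
The constraints in the table above can be checked before a request is sent. The following is a minimal sketch, not part of the SDK: validateOptions is a hypothetical helper that encodes only the allowed values and ranges listed in the table, using the camelCase option names from the SDK examples.

```javascript
// Hypothetical helper (not part of the unrealspeech package).
// It encodes only the constraints listed in the request body table.
const VOICES = ["Scarlett", "Liv", "Dan", "Will", "Amy"];
const BITRATES = ["16k", "32k", "48k", "64k", "128k", "192k", "256k", "320k"];

function validateOptions({ voiceId, bitrate = "192k", speed = 0, pitch = 1.0 } = {}) {
  const errors = [];
  if (!VOICES.includes(voiceId)) {
    errors.push(`voiceId must be one of: ${VOICES.join(", ")}`);
  }
  if (!BITRATES.includes(bitrate)) {
    errors.push(`bitrate must be one of: ${BITRATES.join(", ")}`);
  }
  if (typeof speed !== "number" || Number.isNaN(speed) || speed < -1 || speed > 1) {
    errors.push("speed must be a number between -1.0 and 1.0");
  }
  if (typeof pitch !== "number" || Number.isNaN(pitch) || pitch < 0.5 || pitch > 1.5) {
    errors.push("pitch must be a number between 0.5 and 1.5");
  }
  return errors; // an empty array means the options passed every check
}
```

Returning a list of messages rather than throwing on the first problem lets a caller report every invalid field at once.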

Parameter Details

Voice ID

Dan: Young Male
Will: Mature Male
Scarlett: Young Female
Liv: Young Female
Amy: Mature Female

Bitrate

Defaults to 192k. Use lower values for low bandwidth or to reduce the transferred file size. Use higher values for higher fidelity.
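
To get a feel for the trade-off, the size of a constant-bitrate file is roughly the bitrate multiplied by the duration. The sketch below assumes that a value such as 192k means 192 kilobits per second and ignores container overhead; estimateSizeBytes is a hypothetical helper, not an SDK function.

```javascript
// Hypothetical helper: rough size of constant-bitrate audio.
// Assumes "192k" means 192 kilobits per second; ignores file headers and metadata.
function estimateSizeBytes(bitrate, seconds) {
  const match = /^(\d+)k$/.exec(bitrate);
  if (!match) throw new Error(`Unrecognized bitrate: ${bitrate}`);
  const bitsPerSecond = Number(match[1]) * 1000;
  return (bitsPerSecond / 8) * seconds;
}

console.log(estimateSizeBytes("192k", 60)); // 1440000 (about 1.4 MB)
console.log(estimateSizeBytes("32k", 60)); // 240000 (about 0.24 MB)
```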

Speed

Defaults to 0.

Examples

0.5: makes the audio 50% faster (e.g., 60-second audio becomes 42 seconds).
-0.5: makes the audio 50% slower (e.g., 60-second audio becomes 90 seconds).

Pitch

Defaults to 1. However, on the landing page, we default male voices to 0.92 as people tend to prefer lower/deeper male voices.
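
An application that wants to mirror the landing-page behavior described above could pick the pitch from the voice. This is only a sketch: landingPagePitch is a hypothetical helper, and the SDK itself defaults to 1.0 for every voice.

```javascript
// Hypothetical helper: mirrors the landing-page defaults described above.
// Dan and Will are the male voices listed under Voice ID.
const MALE_VOICES = new Set(["Dan", "Will"]);

function landingPagePitch(voiceId) {
  return MALE_VOICES.has(voiceId) ? 0.92 : 1.0;
}
```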

Rate Limit

Plan	Requests per second
Free	1
Basic	2
Pro	8

Usage

To use the SDK, you need to initialize it with your API key and other required configurations.

Initialization

import UnrealSpeech, { play, save } from "unrealspeech";
const unrealSpeech = new UnrealSpeech("your_api_key");
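
Rather than hard-coding the key, it can be read from the environment. The sketch below is one possible approach; readApiKey and the variable name UNREAL_SPEECH_API_KEY are assumptions made for illustration, not conventions of the SDK.

```javascript
// Hypothetical helper: reads the API key from an environment variable.
// The variable name UNREAL_SPEECH_API_KEY is an assumption, not an SDK convention.
function readApiKey(env = process.env) {
  const key = env.UNREAL_SPEECH_API_KEY;
  if (typeof key !== "string" || key.trim() === "") {
    throw new Error("Set UNREAL_SPEECH_API_KEY before starting the application");
  }
  return key.trim();
}

// Usage with the constructor shown above:
// const unrealSpeech = new UnrealSpeech(readApiKey());
```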

Methods

stream(text, voiceId, bitrate, speed, pitch, codec, temperature)

This method streams the synthesized speech based on the provided parameters.

text: The text to be synthesized.
voiceId: The ID of the voice to be used.
bitrate: The bitrate of the audio.
timestampType: The type of timestamp to be used.
speed: The speed of speech.
pitch: The pitch of speech.

Returns: A promise that resolves to the synthesized speech buffer.

speech(text, voiceId, bitrate, timestampType, speed, pitch)

This method synthesizes speech based on the provided text and voice.

text: The text to be synthesized.
voiceId: The ID of the voice to be used.
bitrate: The bitrate of the audio.
timestampType: The type of timestamp to be used.
speed: The speed of speech.
pitch: The pitch of speech.

Returns: A promise that resolves to the synthesized speech data.

createSynthesisTask(text, voiceId, bitrate, timestampType, speed, pitch)

This method creates a synthesis task for the provided text and voice.

text: The text to be synthesized.
voiceId: The ID of the voice to be used.
bitrate: The bitrate of the audio.
timestampType: The type of timestamp to be used.
speed: The speed of speech.
pitch: The pitch of speech.

Returns: A promise that resolves to the ID of the created synthesis task.

getSynthesisTaskStatus(taskId)

This method retrieves the status of a synthesis task based on the provided task ID.

taskId: The ID of the synthesis task.

Returns: A promise that resolves to the status of the synthesis task.
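
Because a synthesis task finishes some time after it is created, a caller usually checks the status repeatedly. The sketch below shows one way to poll. It makes no assumption about the shape of the status value: the caller supplies an isDone predicate. waitForTask, its options, and the predicate are hypothetical; only getSynthesisTaskStatus(taskId) comes from the SDK.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: polls until isDone(status) returns true or attempts run out.
// `client` is any object with a getSynthesisTaskStatus(taskId) method.
async function waitForTask(client, taskId, isDone, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await client.getSynthesisTaskStatus(taskId);
    if (isDone(status)) return status;
    // Wait before the next check, but not after the final one.
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Task ${taskId} not finished after ${maxAttempts} checks`);
}
```

Bounding the number of attempts keeps a stuck task from blocking the caller indefinitely; choose an interval that respects the rate limit of your plan.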

Configuration Options

apiKey: Your API key for authentication.

Examples

Stream

The following example streams synthesized speech, then plays the returned buffer and saves it to a file.

import UnrealSpeech, { play, save } from "unrealspeech";
const unrealSpeech = new UnrealSpeech("your_api_key");

const speechBuffer = await unrealSpeech.stream({
  text: "Hello, world!",
  voiceId: "Scarlett",
  bitrate: "192k",
  timestampType: "word",
  speed: 0,
  pitch: 1.0
});

// Play the audio (requires FFmpeg)
play(speechBuffer);
// Save the audio to a file
save(speechBuffer, "filename.mp3");

createSynthesisTask

import UnrealSpeech from "unrealspeech";
const unrealSpeech = new UnrealSpeech("your_api_key");

const taskId = await unrealSpeech.createSynthesisTask({
  text: "Hello, world!",
  voiceId: "Scarlett",
  bitrate: "192k",
  timestampType: "word",
  speed: 0,
  pitch: 1.0
});

// Pass the ID of the created synthesis task to getSynthesisTaskStatus
console.log(taskId);

getSynthesisTaskStatus

import UnrealSpeech from "unrealspeech";
const unrealSpeech = new UnrealSpeech("your_api_key");

const taskId = "task123";
const status = await unrealSpeech.getSynthesisTaskStatus(taskId);

// Use the status of the synthesis task as needed
console.log(status);

speech

import UnrealSpeech from "unrealspeech";
const unrealSpeech = new UnrealSpeech("your_api_key");

const speechData = await unrealSpeech.speech({
  text: "Hello, world!",
  voiceId: "Scarlett",
  bitrate: "192k",
  timestampType: "word",
  speed: 0,
  pitch: 1.0
});

// Use the synthesized speech data as needed
console.log(speechData);

Contribution

Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.