text2speech (version 1.0.0)

tts: Text-to-Speech (Speech Synthesis)

Description

Convert text to speech using one of several engines: Amazon Polly, Coqui TTS, the Google Cloud Text-to-Speech API, or the Microsoft Cognitive Services Text to Speech REST API.

With the exception of Coqui TTS, all these engines are accessible as R packages:

  • aws.polly is a client for Amazon Polly.

  • googleLanguageR is a client for the Google Cloud Text-to-Speech API.

  • conrad is a client for the Microsoft Cognitive Services Text to Speech REST API.

Usage

tts(
  text,
  output_format = c("mp3", "wav"),
  service = c("amazon", "google", "microsoft", "coqui"),
  bind_audio = TRUE,
  ...
)

tts_amazon(
  text,
  output_format = c("mp3", "wav"),
  voice = "Joanna",
  bind_audio = TRUE,
  save_local = FALSE,
  save_local_dest = NULL,
  ...
)

tts_google(
  text,
  output_format = c("mp3", "wav"),
  voice = "en-US-Standard-C",
  bind_audio = TRUE,
  save_local = FALSE,
  save_local_dest = NULL,
  ...
)

tts_microsoft(
  text,
  output_format = c("mp3", "wav"),
  voice = NULL,
  bind_audio = TRUE,
  save_local = FALSE,
  save_local_dest = NULL,
  ...
)

tts_coqui(
  text,
  exec_path,
  output_format = c("wav", "mp3"),
  model_name = "tacotron2-DDC_ph",
  vocoder_name = "ljspeech/univnet",
  bind_audio = TRUE,
  save_local = FALSE,
  save_local_dest = NULL,
  ...
)
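
The vector defaults in the signatures above, such as output_format = c("mp3", "wav"), follow a common R idiom in which the first element acts as the default. The sketch below illustrates that idiom with a hypothetical stand-in function, pick_format(), which is not part of text2speech. It assumes the package resolves the argument with match.arg(), which this page does not confirm.

```r
# Hypothetical stand-in (not part of text2speech) showing how a
# vector default such as c("mp3", "wav") is typically resolved.
pick_format <- function(output_format = c("mp3", "wav")) {
  # match.arg() returns the first choice when the argument is left
  # at its default, and errors on values outside the choices.
  match.arg(output_format)
}

pick_format()        # "mp3": first element is the default
pick_format("wav")   # "wav": explicit choice
```

Under the same assumption, tts_coqui() would default to "wav", because its output_format vector lists "wav" first.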

Value

A standardized tibble featuring the following columns:

  • index : Sequential identifier number

  • original_text : The text input provided by the user

  • text : If original_text exceeds the character limit, text holds the result of splitting original_text. Otherwise, text is identical to original_text.

  • wav : Wave object (S4 class)

  • file : File path to the audio file

  • audio_type : The audio format, either mp3 or wav

  • duration : The duration of the audio file

  • service : The text-to-speech engine used
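
As a rough illustration of working with the columns listed above, the sketch below defines a small helper, tts_overview(), that checks for the documented columns and keeps the ones that print compactly. The helper name is hypothetical and not part of text2speech. The call to tts() is left inside if (FALSE) because it needs configured credentials.

```r
# Hypothetical helper (not part of text2speech): check that a result
# has the documented columns and return a compact view without the
# bulky wav column.
tts_overview <- function(res) {
  expected <- c("index", "original_text", "text", "wav",
                "file", "audio_type", "duration", "service")
  missing_cols <- setdiff(expected, names(res))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  res[, c("index", "text", "file", "audio_type", "duration", "service"),
      drop = FALSE]
}

if (FALSE) {
  # Requires credentials for the chosen service.
  res <- text2speech::tts("Hello world!", service = "amazon")
  tts_overview(res)
}
```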

Arguments

text

A character vector of text to be spoken

output_format

Format of output files: "mp3" or "wav"

service

Service to use (Amazon, Google, Microsoft, or Coqui)

bind_audio

Should tts_bind_wav() be run after the audio has been created, to ensure that the number of rows in the result matches the length of text?

...

Additional arguments

voice

Full voice name

save_local

Should the audio file be saved locally?

save_local_dest

Destination where the output file is saved when save_local is TRUE

exec_path

System path to Coqui TTS executable

model_name

(Coqui TTS only) Deep Learning model for Text-to-Speech Conversion

vocoder_name

(Coqui TTS only) Voice coder used for speech coding and transmission
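
Because tts_coqui() takes exec_path as the path to the Coqui TTS executable, one way to supply it is to look the executable up on the system PATH. The wrapper below is a minimal sketch under two assumptions: that the Coqui command-line tool is installed under the name tts, and that passing the located path as exec_path is sufficient. speak_with_coqui() is a hypothetical name, not a package function.

```r
# Hypothetical wrapper (not part of text2speech): locate the Coqui TTS
# executable with base R's Sys.which() and pass it on as exec_path.
speak_with_coqui <- function(text, exec_path = unname(Sys.which("tts"))) {
  # Sys.which() returns "" when the executable is not on the PATH.
  if (!nzchar(exec_path)) {
    stop("Coqui TTS executable not found; pass exec_path explicitly.")
  }
  text2speech::tts_coqui(text, exec_path = exec_path, output_format = "wav")
}

if (FALSE) {
  # Requires a local Coqui TTS installation.
  speak_with_coqui("Hello world! This is Coqui TTS")
}
```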

Examples

if (FALSE) {
# Amazon Polly
tts("Hello world! This is Amazon Polly", service = "amazon")

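# Coqui TTS
tts("Hello world! This is Coqui TTS", service = "coqui")

# Google Cloud Text-to-Speech API
tts("Hello world! This is Google Cloud", service = "google")

# Microsoft Cognitive Services Text to Speech REST API
tts("Hello world! This is Microsoft", service = "microsoft")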
}