Customizing Spoken Text on SCG Voice API with SSML tags

Overview

You can control how the Syniverse Voice API plays machine-generated text to your users by using a subset of the tags defined in the Speech Synthesis Markup Language (SSML) specification. This XML-based markup enables you to mix multiple languages, provide pronunciation hints for specific words and numbers and control the speed, volume and pitch of synthesized text.

For the Syniverse Voice API, you can send SSML tags as part of the text string. First, surround the entire string in <speak></speak> tags to tell Syniverse that the string includes SSML. Because the string is sent inside a JSON property, escape the double quotes around tag attribute values (\").

Here is an example of SSML in the body property:

    {"body":"<speak>Hello your passcode is 7, 2, 2, 3</speak>"} 

SSML tags

  • Breaks: Add breaks (pauses) to spoken text
  • Voice: Specify the voice and language to use in Text-to-Speech
  • Prosody: Set the pitch, speed and volume of the spoken text
  • Say as: Provide pronunciation hints for words, numbers and dates
  • Sentences and paragraphs: Force the API to recognize sentences and paragraphs when speaking your text
  • Substitution: Replace specific text with a pronunciation of your choice

Breaks

The break tag allows you to add pauses to text. The duration of the pause can be specified either as a strength value or as a time in seconds or milliseconds.

{"body": "<speak>My name is <break time=\"1s\" />John Smith.</speak>"}

Valid strength values include:

  • none or x-weak (which removes a pause which might otherwise exist after a full stop)
  • weak or medium (equivalent to a comma)
  • strong or x-strong (equivalent to a paragraph break)

{"body": "<speak>To be <break strength=\"weak\" />or not to be <break strength=\"weak\" />that is the question</speak>"}

 

Voice

The Voice tag allows you to control the language used in text-to-speech (TTS) calls. It is carried by the "tts:voice" attribute and lets you select a voice matching the language in which you are composing your message. Voices are available in female and male genders. See the Voice/Accent & Language Matrix section below for the available options.

"options":{"tts:voice":"Penelope"},
"body":"<speak>Hola, esta es una muestra de una llamada de voz en español</speak>"


Prosody

The prosody tag allows you to set the pitch, rate and volume of the text.

  • The volume attribute can be set to the following values: default, silent, x-soft, soft, medium, loud and x-loud. You can also specify a relative decibel value in the form +ndB or -ndB, where n is an integer value.
  • The rate attribute changes the speed of speech. Acceptable values include: x-slow, slow, medium, fast and x-fast.
  • The pitch attribute changes the pitch of the voice. You can specify this using either predefined value labels or numerically. The value labels are: default, x-low, low, medium, high and x-high. The format for specifying a numerical pitch change is: +n% and -n%.

The example below shows how to change the volume, rate and pitch.

{"body": "<speak>I am <prosody volume=\"loud\">loud and proud</prosody>, <prosody rate=\"fast\">quick as a bullet</prosody> and can <prosody pitch=\"x-low\">change my pitch</prosody></speak>"}

Say As

The say-as tag allows you to provide instructions for how particular words and numbers are spoken. The TTS engine detects many of these formats automatically, but the say-as tag allows you to mark them explicitly.

The say-as tag has a required attribute: interpret-as. That attribute must contain one of the following values:

Value of interpret-as and its effect on spoken text:

  • character/spell-out: Spells each letter out, for example: I-A-T-A.
  • cardinal/number: Pronounces the value as a number. For example, "974" would be pronounced "nine hundred and seventy four".
  • ordinal: Pronounces the number as an ordinal. For example, "1" would be pronounced "first" and "33" would be pronounced "thirty-third".
  • digits: Reads the specified numbers out as digits. For example, "747" would be pronounced "seven four seven" and not "seven hundred and forty seven".
  • fraction: Reads the numbers out as a fraction. For example, "1/3" would be pronounced "one third" and "2 4/10" would be pronounced "two and four tenths".
  • unit: Reads the specified number out as a unit. The value must be a number followed by a unit of measure with no space between the two. For example: "1meter".
  • date: Specifies how to pronounce dates. See the Date formatting section below.
  • time: Pronounces time durations in minutes and seconds. For example: 1'30" is read as "one minute and thirty seconds".
  • address: Reads out a street address with appropriate breaks.
  • expletive: Replaces the content with a "bleep" to censor expletives. You can use this to automatically substitute filtered swear words.
  • telephone: Reads out a telephone number with appropriate breaks.

An example:

"body": "<speak>On the <say-as interpret-as=\"ordinal\">1</say-as> day of Christmas,come to <say-as interpret-as=\"address\">123 Main Street</say-as>.<say-as interpret-as=\"spell-out\">RSVP</say-as> for a delicious apple pie.</speak>"

Date formatting

Dates can be formatted in the following ways:

Format and how the date is read out:

  • mdy: month-day-year (e.g. "3/10/2019")
  • dmy: day-month-year (e.g. "10/3/2019")
  • ymd: year-month-day (e.g. "2019/3/10")
  • md: month-day (e.g. "3/10")
  • dm: day-month (e.g. "10/3")
  • ym: year-month (e.g. "2019/3")
  • my: month-year (e.g. "3/2019")
  • d: day (e.g. "10")
  • m: month (e.g. "3")
  • y: year (e.g. "2019")
  • yyyymmdd: year-month-day, with optional ? to replace unspecified components. For example: 20190310 or ????0310.

The example below will be converted to "Today is November 15th".

"body": "<speak>Today is <say-as interpret-as=\"date\" format=\"dm\">15/11</say-as></speak>"

Sentences and paragraphs

Sentences

You can wrap sentences in the s tag. This is equivalent to putting a full stop at the end of the sentence.

{"body": "<speak><s>Thank you Kola</s> <s>I will call you later</s></speak>"}

Paragraphs

The p tag allows you to specify paragraphs in your speech.

"body": "<speak><p>Hello John</p> 
<p>Can you tell us about yourself</p>
<p>Thank you</p></speak>"
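As an illustration, the Python sketch below wraps a list of strings in s or p tags and serializes the result. The helper name wrap_each is hypothetical. Serializing with json.dumps keeps the whole string on one line, which matters because a JSON string cannot contain raw line breaks.

```python
import json

def wrap_each(items, tag):
    """Wrap every item in <s> or <p> tags and join them with spaces."""
    if tag not in ("s", "p"):
        raise ValueError("tag must be 's' or 'p'")
    return " ".join(f"<{tag}>{item}</{tag}>" for item in items)

paragraphs = ["Hello John", "Can you tell us about yourself", "Thank you"]
ssml = "<speak>" + wrap_each(paragraphs, "p") + "</speak>"
body = json.dumps({"body": ssml})
print(body)
```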

Substitution

The sub tag allows you to provide a substitute pronunciation. The contents of the alias attribute will be read instead.

"body": "<speak>Welcome to the <sub alias=\"United Kingdom\">UK</sub>.</speak>"

 

Voice/Accent & Language Matrix

Below is a list of the supported voices, with the language and gender of each, that you can use when composing your TTS messages.

Voice      Language Code  Gender  Language Name         SSML Support
Joanna     en-US          female  English-US            Yes
Kendra     en-US          female  English-US            Yes
Kimberly   en-US          female  English-US            Yes
Matthew    en-US          male    English-US            Yes
Amy        en-GB          female  English-British       Yes
Emma       en-GB          female  English-British       Yes
Brian      en-GB          male    English-British       Yes
Geraint    en-GB-WLS      male    English-Welsh         Yes
Nicole     en-AU          female  English-Australian    Yes
Russell    en-AU          male    English-Australian    Yes
Raveena    en-IN          female  English-Indian        Yes
Gwyneth    cy-GB          female  Welsh                 Yes
Naja       da-DK          female  Danish                Yes
Mads       da-DK          male    Danish                Yes
Marlene    de-DE          female  German                Yes
Hans       de-DE          male    German                Yes
Conchita   es-ES          female  Spanish-Castilian     Yes
Enrique    es-ES          male    Spanish-Castilian     Yes
Penelope   es-US          female  Spanish-US            Yes
Miguel     es-US          male    Spanish-US            Yes
Chantal    fr-CA          female  French-Canadian       Yes
Celine     fr-FR          female  French                Yes
Mathieu    fr-FR          male    French                Yes
Aditi      hi-IN          female  Hindi                 Yes
Dora       is-IS          female  Icelandic             Yes
Karl       is-IS          male    Icelandic             Yes
Carla      it-IT          female  Italian               Yes
Giorgio    it-IT          male    Italian               Yes
Liv        nb-NO          female  Norwegian             Yes
Lotte      nl-NL          female  Dutch                 Yes
Ruben      nl-NL          male    Dutch                 Yes
Ewa        pl-PL          female  Polish                Yes
Maja       pl-PL          female  Polish                Yes
Jan        pl-PL          male    Polish                Yes
Jacek      pl-PL          male    Polish                Yes
Vitoria    pt-BR          female  Portuguese-Brazilian  Yes
Ricardo    pt-BR          male    Portuguese-Brazilian  Yes
Ines       pt-PT          female  Portuguese-European   Yes
Cristiano  pt-PT          male    Portuguese-European   Yes
Carmen     ro-RO          female  Romanian              Yes
Tatyana    ru-RU          female  Russian               Yes
Maxim      ru-RU          male    Russian               Yes
Astrid     sv-SE          female  Swedish               Yes
Filiz      tr-TR          female  Turkish               Yes
Mizuki     ja-JP          female  Japanese              Yes
Takumi     ja-JP          male    Japanese              Yes
Seoyeon    ko-KR          female  Korean                Yes
Zeina      ar             female  Arabic                Yes
Sin-Ji     yue-CN         female  Cantonese             No
Zhiyu      cmn-CN         female  Chinese               Yes
Tessa      en-ZA          female  South African         No
Carmit     he-IL          female  Israeli-Hebrew        No
Mei-Jia    cmn-TW         female  Taiwanese             No
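One way to use the matrix in code is to filter it by language code, gender and SSML support. The Python sketch below copies only a few rows from the table for brevity; the Voice type and find_voices function are hypothetical names used for illustration.

```python
from typing import List, NamedTuple, Optional

class Voice(NamedTuple):
    name: str
    language_code: str
    gender: str
    ssml: bool

# A short excerpt of the matrix above, not the full list.
VOICES = [
    Voice("Joanna", "en-US", "female", True),
    Voice("Matthew", "en-US", "male", True),
    Voice("Penelope", "es-US", "female", True),
    Voice("Miguel", "es-US", "male", True),
    Voice("Tessa", "en-ZA", "female", False),
]

def find_voices(language_code: str, gender: Optional[str] = None,
                require_ssml: bool = True) -> List[str]:
    """Return the names of voices matching the language, gender and SSML need."""
    return [
        v.name for v in VOICES
        if v.language_code == language_code
        and (gender is None or v.gender == gender)
        and (v.ssml or not require_ssml)
    ]

print(find_voices("es-US", gender="male"))
print(find_voices("en-ZA", require_ssml=False))
```

Filtering on SSML support by default avoids sending tagged text to a voice that the matrix marks as "No".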

 

 
