
Amazon Polly’s neural text-to-speech allows publishers to turn articles into realistic audio


Amazon has added two new features to Polly, its cloud service that converts text into speech: Neural Text-To-Speech (NTTS) and Newscaster Style.

NTTS is a more realistic voice technology that learns to speak by listening to recorded human speech and imitating it, much as children do. As a result, it can pick up the differences between speaking styles and generate expressive, lifelike speech.

“Closer than ever” to human voices

Julien Simon, Global Tech Evangelist at Amazon Web Services, writes in a blog post, “NTTS delivers significant improvements in speech quality. It increases naturalness and expressiveness, two key factors in synthesizing lifelike speech that is getting closer than ever from (sic) human voices.”


The other new feature, Newscaster Style, makes narration sound very realistic whether it’s reading news articles or blog posts. According to Simon, Newscaster Style tailors the style of narration according to the content. So it will read a newscast, a sportscast, or a university class lecture in the style that human ears are accustomed to hearing each of them. 

“Thanks to Polly and the Newscaster Style, their readers (or should we say listeners now?) can enjoy articles read in a high-quality voice that sounds like what they might expect to hear on the TV or radio. Adding Amazon Translate, they can also listen to articles that are automatically translated to a language they understand.”

Julien Simon, Global Tech Evangelist at Amazon Web Services

In January, Amazon rolled out Newscaster Style to its Alexa-enabled devices for daily briefings and Wikipedia snippet narrations.

Andrew Breen, Senior Manager with the TTS Research team at Amazon, said in a statement, “The ability to teach Alexa to adapt her speaking style based on the context of the customer’s request opens the possibility to deliver new and delightful experiences that were previously unthinkable. We’re thrilled that our customers will get to listen to news and Wikipedia information from Alexa in this new way.”

NTTS is now available for 11 voices: three UK English voices and eight US English voices. Newscaster Style is available for two US English voices. All sets include both male and female voices.
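For publishers who want to try this on their own articles, here is a minimal sketch of what a Newscaster Style request might look like with the AWS SDK for Python (boto3). It is an illustration, not code from Amazon's announcement: the helper names `build_newscaster_request` and `synthesize_to_file` are hypothetical, and the choice of the Joanna voice and MP3 output is an assumption. The first function only assembles the request parameters, so it can be inspected without an AWS account.

```python
from xml.sax.saxutils import escape


def build_newscaster_request(text, voice_id="Joanna"):
    """Assemble keyword arguments for a Polly synthesize_speech call.

    Hypothetical helper. The voice "Joanna" is an assumed default; check
    which voices support Newscaster Style in your region.
    """
    # Newscaster Style is requested through SSML by wrapping the text in
    # an amazon:domain tag. Escape the article text so that characters
    # such as "&" or "<" do not break the markup.
    ssml = (
        "<speak>"
        '<amazon:domain name="news">'
        f"{escape(text)}"
        "</amazon:domain>"
        "</speak>"
    )
    return {
        "Engine": "neural",      # select the NTTS engine
        "VoiceId": voice_id,
        "TextType": "ssml",
        "Text": ssml,
        "OutputFormat": "mp3",   # assumed output format
    }


def synthesize_to_file(text, path):
    """Send the request to Polly and save the audio (needs AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_newscaster_request(text))
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
```

Calling `synthesize_to_file("Storms are expected across the region tonight.", "briefing.mp3")` would write the narrated audio to a local file, provided AWS credentials and a region are configured. Longer articles may need to be split into several requests, since a single call accepts only a limited amount of text.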

Up to 1 million characters per month for Neural Text-To-Speech voices are free for the first 12 months, starting from the first request for speech (standard or NTTS).

“Critical gateway to media”

This is an important update for publishers who are experimenting with the audio format to serve their readers. Voice represents a huge opportunity. The Reuters Institute’s November 2018 report, The Future of Voice and the Implications for News, states that penetration of voice-activated speakers is growing rapidly and is now reaching mainstream audiences.

According to the report author and Senior Research Associate at Reuters Institute, Nic Newman, “Voice could become a critical gateway to media going forward.”

Newman found that while news updates on smart speakers were actively used, they “were not greatly loved.” One of the reasons listed by Newman for this was the use of synthesized voices (text to speech), which many find hard to listen to. It appears that NTTS and Newscaster Style may be a solution to this problem. 

“Produce audio content efficiently”

Publishers like Gannett, The Globe and Mail, Success Magazine, TIM Media, Encyclopedia Britannica, and nonprofit education tech company CommonLit are some of the earliest users of Newscaster Style. 

“The trends increasingly show that consumers are gravitating toward audio content. With the exceedingly high quality speech that Polly now offers, we’re even better equipped to deliver these exceptional listening experiences to our audience.”

Stuart Johnson, Owner and CEO of Success magazine

“Amazon Polly Newscaster enables us to provide our readers with more features to further their experience with our newspaper. This text-to-voice feature from AWS is miles ahead of anything we’ve heard to date.”

Greg Doufas, Chief Technical and Digital Officer at The Globe and Mail

Scott Stein, Vice President of Content Ventures at Gannett, says, “We strive to innovate and bring our audiences news and content wherever they are. With more than 100 newsrooms across the country, it’s important for Gannett | USA Today Network to produce audio content efficiently.

“Services like Amazon Polly and features like its Newscaster voice help us deliver breaking news and original reporting with increased speed and fidelity worthy of our brands.”