When it comes to new publishing opportunities, voice interfaces are right up there alongside AI, augmented reality and blockchain. But voice interfaces come with their own set of unique challenges and more than just a few concerns. To help navigate this new frontier, WNIP has put together the ultimate guide to voice interfaces for publishers.
Voice Interfaces – the Complete Guide
What exactly are voice interfaces?
Technically speaking, they should be referred to as voice-user interfaces. That’s VUI to you and me. VUI makes human interaction with computers possible through a voice platform to initiate an automated service or process. Think Tony Stark’s JARVIS, but without the sarcasm.
Amazon uses Alexa, which was inspired by the VUI on board Star Trek’s Enterprise. If you have any doubt as to Amazon’s lofty opinion of their prospects, then the 5,000 employees working on Alexa should set you straight.
Apple uses Siri, which has come under fire for its poor AI despite Apple’s early lead in the market. In fact, the entire project has been rife with divisions and political infighting by all accounts. No wonder then that Apple recently poached Google’s AI chief John Giannandrea to run its machine learning and AI operations.
Google, meanwhile, employs the creatively titled Google Assistant to interact with their products and services.
The industry view: Google Home just about edges Amazon’s Echo, mainly because of its strength in search. This is somewhat predictable, considering Google has been in the business of cataloging information since Jeff Bezos was doing little more than shifting copies of Harry Potter.
Stop Press: on Aug 9th, 2018, Samsung surprisingly weighed into the voice market as it revealed its HomePod rival, the Galaxy Home. The smart speaker will allow users to control their smart home using their voice. It will eventually allow Samsung to link up all its devices, such as smart TVs, smart fridges and so on, all using voice control from its artificial intelligence assistant, Bixby. Samsung said it will be sharing more details at its developer conference in November 2018. No launch date has yet been specified.
What is the global penetration?
Huge. And growing. According to market research group Forrester, the installed base of smart home devices in the US alone is set to reach 244 million in 2022. They predict that smart speakers, including Amazon Echo, will account for 68% of the total installed base of smart home devices that very same year. Forrester expects this to accelerate on the release of the next generation of smart speakers, which will be combined with emerging smart home systems.
Sales have been dominated by the US and UK markets, with household penetration standing at 6% and 3% respectively. A recent Gallup poll found that 22 percent of Americans already use devices like Google Home or Amazon Echo.
What publishers need to know
There is a land grab to own skills
In the landscape of VUI, owning a skill is much like owning a web domain. A skill is essentially an instruction to a voice assistant about a specific topic.
When it comes to voice-assistant capabilities, skills can be split into two categories. The first is branded skills which – perhaps predictably – are linked to your brand, and could not be owned by any other company. Skills such as TED’s ‘play the latest TED Talk’ action and the Wall Street Journal’s ‘What’s News?’ fall into this category.
The second category encompasses more generic skills. These could be things like “Alexa, give me the latest publishing headlines” or “Okay Google, give me the latest finance news”. Owning a generic skill gives you sole control of an entire category, creating a first-mover advantage as brands race to capture skills before they are gone. This can make things difficult for brands looking to harness a particular generic skill. However, there are opportunities for publishers to harness market-specific skills in both the generic and branded categories – like most things, it’s just about finding the right target.
Optimising for Voice Search could boost revenue
According to a recent study conducted by Consumer Intelligence Research (CIRP), Amazon Echo customers spend 66% more than average Amazon customers. Amazon Echo customers spend, on average, $1700 per year, while their counterparts spend $1000. Members of Amazon Prime spend an average of $1300 per year. This means that Amazon can now afford to sell Echo devices at a lower price than originally planned, occasionally taking a loss on devices to gain a greater share of consumer spending. For publishers, this implies that optimising for voice search could result in a revenue boost.
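As a quick check on the spending figures quoted above, here is a minimal Python sketch of the arithmetic. The function name `relative_uplift` is hypothetical, and the inputs are simply the rounded dollar amounts from the paragraph. On those rounded amounts the uplift over the average customer comes out at roughly 70%, a little above the quoted 66%, so the headline figure was presumably calculated from unrounded data.

```python
def relative_uplift(spend: float, baseline: float) -> float:
    """Return how much larger `spend` is than `baseline`, as a percentage."""
    return (spend - baseline) / baseline * 100


# Annual spend figures (USD) as quoted from the CIRP study above.
echo_owner = 1700
average_customer = 1000
prime_member = 1300

print(f"Echo owner vs average customer: {relative_uplift(echo_owner, average_customer):.0f}% more")
print(f"Echo owner vs Prime member: {relative_uplift(echo_owner, prime_member):.0f}% more")
```

The same helper can be pointed at your own audience data, for example average revenue per voice user against average revenue per web user, to see whether a similar gap exists.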
How have publishers been using VUI? (The Good)
Reinventing Customer Experience
In his keynote address at the CMO Digital Insight Summit, Amazon’s general manager of Alexa Skills Fabrice Rousseau spoke about reinventing customer experience through voice technology. “When we moved from desktop to mobile we didn’t bring the desktop experience to mobile, we invented a very specific mobile experience,” he pointed out. “When you move from mobile to voice don’t bring your mobile experience. Just invent an experience that is unique to voice.”
Building your brand with skills
It’s important to note that, as far as most publishers are concerned, VUIs are still in their experimental phase. Despite early successes with branded skills and flash briefings, VUIs still operate at a fairly low level – following commands to play music or read news headlines, for example. That said, many publishers are already working on plans for expansion, and with the land-grab to own skills still underway the future successes of VUI belong to those who move first.
Most major news outlets have at the very least developed a daily briefing, with varying degrees of sophistication and success. NPR’s News Now is widely considered the leader of the pack, with five minutes of news delivered by an NPR broadcaster. One factor that gives News Now the market edge is that it updates every hour, while flash briefings of a similar quality from the likes of BBC News and CNN aren’t quite so up-to-date. In addition to flash briefings, outlets such as The Guardian have made all of their news, reviews, podcasts, and comment pieces freely available on voice platforms, while the Daily Mail has instead developed a branded ‘news on demand’ service that can only be accessed by digital MailPlus customers paying the £9.99 monthly subscription fee.
Aside from daily briefings and newscasts, some publishers are developing more brand-specific skills and content. HarperCollins Christian Publishing, for example, has developed a ‘Devotionals’ skill featuring short uplifting passages read by authors as an extension of their existing social media ‘daily devotionals’. Other publishers have combined interactive elements with their regular VUI programming, with both The Washington Post and Financial Times launching quiz skills based on news coverage and public figures.
In 2017 BBC R&D began to experiment with native VUI content, and released an ‘interactive science fiction comedy story’ called ‘The Inspection Chamber’ for Amazon Alexa and Google Home. In 2018, they plan to go further down this route, and will look at developing more children’s VUI content as part of their remit as a public service. For other publishers, such as Bayerischer Rundfunk, the next logical target is to build a journalistic dialogue between users and voice platforms. But what would this ‘dialogue’ look like? For one, it would allow customers to interrupt a newscast to learn more about particular items and dictate the direction of their own newsfeed.
The personal touch
One issue that publishers have been running up against with VUIs is voice-assistants’ total lack of personality. At the Financial Times, head of FTLabs Chris Gathercole and his team have been using Amazon Polly to convert existing text articles to audio that is then delivered by ‘Artificial Amy’. While ‘Amy’ learns quickly and is cost effective, her lack of human inflection can be off-putting and can steamroll the humour or nuance of a piece.
On top of the comprehension issues this obviously brings up, automated voices are often either boring or straight up disturbing, both of which put users off. Gathercole believes that a blend of artificial and human voices could temper the issue, with a voice actor reading parts of the text and a computerised voice contributing further snippets.
Another solution to the problem, as Google’s Peter Hodgson pointed out at the Smart Voice Summit in Paris earlier this year, is to build a brand-appropriate persona for your voice assistant. Quartz, for example, originally used Alexa to deliver their Daily Brief but soon found that they had more success using a variety of voices in the more conversational tone they had established with their mobile app. Quartz’ Alexa newscast is now read by a pair of AIs called Kendra and Brian, which present headlines in a more playful tone than Alexa is capable of. Similarly, the Telegraph initially used Alexa’s pre-programmed voice to deliver their ‘5 by 5’ Google Home show, but now have their journalists read and discuss the news. Since they made the switch in late 2017, the Telegraph has seen an increase in the number of subscribers tuning in.
Where to begin? There are all manner of privacy concerns surrounding the ownership of devices that are essentially constantly eavesdropping on your home. While VUIs can bring to mind the sleek aspirations of sci-fi movies and futuristic projections, there is also the not entirely unfounded fear that they could very easily be used to monitor and manipulate ‘Black Mirror’ style.
Customer advocacy group Consumer Watchdog shed light on some of these concerns in a recent study of new patent applications for Amazon Echo and Google Home. “These patents show that smart devices target moments in between screen time to monitor sleep habits, listen in on dinner conversations, and track when users shower. Access to this data can flesh out Google and Amazon’s profiles of their users in order to help them more accurately serve targeted ads,” the study claims.
Consumer Watchdog’s assessment is unequivocal: “Digital assistants like the Amazon Echo and the Google Home greatly expand the collection of personal data, magnifying the risk that someone will learn something about you that you would rather keep private,” they conclude. So, there isn’t a whole lot of wiggle room for interpretation here. In an article written by Sapna Maheshwari for the New York Times, Consumer Watchdog president Jamie Court doesn’t skirt the issue: “When you read parts of the (Alexa) applications, it’s really clear that this is spyware and a surveillance system meant to serve you up to advertisers,” he says.
The thing is though, none of this is illegal. While future developments in VUI seem to have enormous potential for corruption and misuse of data, it is still early days. Alexa was launched just over three years ago in November 2014, and we have not yet seen the full scope of the technology’s potential. As VUIs – and our understanding of their implications – grow and change, there may be safeguards put in place to prevent misuse of data. With the recent issues surrounding companies such as Facebook and Cambridge Analytica, however, these concerns cannot be dismissed out of hand.
The Final Word
While VUI is still in its infancy, many publishers are making rapid advancements and future successes will belong to those who move quickest. Despite user privacy concerns and some slightly creepy hardware glitches, VUIs present an excellent opportunity for publishers to reach a wider audience in new ways and have the potential to increase revenue.
At a glance: According to Canalys, the number of smart speakers in use will come close to 100 million by the end of 2018 and the market will more than double again to hit 225 million units by 2020. But that’s not even counting the millions of devices onto which consumers can download Alexa as an iOS or Android app. Similarly, you can find Google Assistant on 400 million devices. And more companies are getting into the fray.
At a glance: Voice interfaces like Alexa and Siri have been adopted faster than almost any other technology in history — even surpassing the smartphone in their four-year trajectory. Approximately one-quarter to one-third of the U.S. population already owns a smart speaker like the Echo or Google Home; the global number of installed smart speakers is going to more than double to 225 million units in two years.
At a glance: It’s not a time to exaggerate the current capabilities of voice search, but the signs suggest its rise is inevitable. According to a recent report, two million children are now using smart speakers in the UK alone. Over in the US, 47.3 million adults have access to a smart speaker. With a digital assistant waiting on every smartphone, the number of people using voice search is possibly much more significant than many of us can comprehend.
At a glance: Consumers are captivated by devices that attempt to emulate a voice’s ineffable human qualities — through digital AI assistants, home agents and voice-enabled devices. A PwC survey found that 72% have used voice-enabled products and services, most often in their homes. Research firm Ovum (via CNET) predicts that by 2021, more than 7.5 billion voice-activated assistants could be in use around the world — roughly the number of people on the planet.
At a glance: What this means in real terms is that office equipment can now have Alexa features built-in, without the need for an Echo (or other) assistant device. Amazon has a distinct advantage. None of the other major voice assistants has, as yet, been really exploited for its enterprise potential, so Amazon is likely to have a significant head start, and it looks ready to capitalise on it.
At a glance: Larry Ellison demonstrated a “revolutionary” voice interface for getting information from Oracle’s cloud-based business applications and he applied analytics to a data warehouse he built on the fly during his second Oracle OpenWorld keynote. The company’s larger goal: Make its Oracle Fusion Cloud applications easier to query and use, while automating key business processes.
At a glance: To request the latest news anytime, users (UK only at this juncture) can simply say “Alexa, show me the latest in… Technology etc….” The content is deep-linked from Wochit, which includes videos created by Wochit’s in-house producers, as well as content from the company’s extensive list of publisher partners. This update from Wochit has also been incorporated in the latest version of the Amazon Dot and Amazon Echo.
At a glance: To make the user experience realistic, XD can now trigger speech playback when it hears a specific word or phrase. This isn’t a fully featured natural language understanding system, since the idea is only to mock-up what the user experience would look like. Soon, Adobe announced during a recent event, developers will also be able to use an actual Echo device from Amazon to test their prototypes on the real hardware.
At a glance: Despite a firestorm of privacy scandals that have engulfed the company over the past year, Facebook wants to put its own smart speaker in your home. Called Portal and Portal+, the two devices are geared towards video calling and feature A.I. technology that can automatically follow a person as they move throughout a room and remove unwanted background noise during a call. Portal features a 10-inch display and costs $199, while Portal+ features a larger 15-inch pivoting display and costs $349. Both devices are available for preorder online and will begin shipping in November.
At a glance: It’s obvious that talking to a Voice User Interface (VUI) is still not quite like talking to a human. Many may argue that this isn’t necessary, and that being able to bark fragmented command words at a disembodied voice gets the job done. For now maybe that’s all we need, but let’s think about the many future improvements that could be made to voice technology. They’re on their way…
At a glance: Years usually pass before the world realises a technology breakthrough actually happened and it catches on; the voice user interface might just have arrived at that watershed moment. Of those who currently own a smart home device, 65% plan to purchase more. Looking at the speakers themselves, usage is up: the average user interacts with the device for 72 minutes on the weekend and 65 minutes during the week, while 81% of users report using voice-command searches for real-time information, such as weather and traffic conditions, during a typical week.
At a glance: The more devices people have, the less concentrated their interactions with smartphones are, according to an NPR/Edison Research study of smart speaker owners from early this year. These new devices and features can accelerate Amazon’s ad platform growth even more, and potentially change the way digital ads work in a world where more and more interactions have to go through voice-first devices.
At a glance: Amazon last Thursday unveiled a slew of new devices that will further incorporate its voice-activated assistant Alexa into people’s homes, including a subwoofer, a smart plug and even a microwave. The tech giant is also updating its family of Echo smart speakers, the devices that have helped the e-commerce giant establish an early lead among voice-based interfaces.
At a glance: Devices such as Amazon’s Alexa-powered speakers, Google Home and Apple HomePod are increasingly delivering news flashes and summaries, and giving users the option to get more in-depth news, just by asking. For beleaguered news organizations, voice could be the new channel to connect with consumers seeking updates or specific information on demand.
Sept 13th: Is Amazon killing the Skill as we know it?
At a glance: At Bloomberg Live’s The Value of Data event, Rohit Prasad, Alexa’s vice president and head scientist, confirmed in an interview with Tom’s Guide that Amazon intends to eliminate the need for Alexa users to enable and call upon individual skills. Specifically, Prasad is working towards an Alexa that can parse its abilities to find the one that best addresses your request.
Sept 13th: Google Home Max review
At a glance: This is the big beast of the Google Assistant smart speakers. It’s £399, and sits between the Apple HomePod and Sonos Play:5 in cost. It sounds bigger than the HomePod and is light years smarter than the Play:5. The Google Home Max can control all of the most popular smart home gear, like Philips Hue and LIFX lights, Tado heating gear and Netgear’s smart security cameras.
At a glance: No matter how many functions and capabilities you add to an AI assistant, you’ll only be scratching the surface of the list of tasks that a human brain can come up with. And voice assistants suffer from the known limits of deep learning algorithms, which means they can only work in the distinct domains they’ve been trained for. As soon as you give them a command they don’t know about, they’ll either fail or, worryingly, start acting in erratic ways.
At a glance: Thanks to the way Alexa handles requests for new “skills”—the cloud applications that register with Amazon—it’s possible to create malicious skills that are named with homophones for existing legitimate applications.
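To illustrate why homophone names are a problem, here is a minimal Python sketch that flags skill names which sound alike under a classic Soundex-style phonetic key. This is an illustration only and not how Amazon or the researchers detect squatting: Soundex is a coarse approximation of pronunciation, and the skill names and the helper names (`phonetic_key`, `name_key`, `find_sound_alikes`) are all hypothetical.

```python
# Soundex digit groups: consonants that sound similar share a digit.
_GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
_CODES = {letter: digit for digit, letters in _GROUPS.items() for letter in letters}


def phonetic_key(word: str) -> str:
    """Return a four-character Soundex key for a single word."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    key = word[0].upper()
    previous = _CODES.get(word[0], "")
    for ch in word[1:]:
        code = _CODES.get(ch, "")
        # Skip vowels and repeats of the same sound.
        if code and code != previous:
            key += code
        # 'h' and 'w' do not separate repeated sounds; vowels do.
        if ch not in "hw":
            previous = code
    return (key + "000")[:4]


def name_key(skill_name: str) -> tuple:
    """Key a multi-word skill name word by word."""
    return tuple(phonetic_key(word) for word in skill_name.split())


def find_sound_alikes(candidate: str, registered: list) -> list:
    """List registered names that are spelled differently but share a phonetic key."""
    target = name_key(candidate)
    return [
        name for name in registered
        if name.lower() != candidate.lower() and name_key(name) == target
    ]


# Hypothetical skill names, for illustration only.
existing = ["flower news", "finance news"]
print(find_sound_alikes("flour news", existing))
```

A publisher could run a check along these lines against its own branded skill names to spot obvious sound-alikes, though a real audit would need to account for how the speech recogniser itself mishears words.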
At a glance: Since becoming a finalist in the BookTech Company of the Year Awards in 2016, Novel Effect has had an impressive trajectory. The Seattle-based startup, which uses voice recognition technology to sync music and sound effects with children’s books as they are read aloud, has been through the SXSW accelerator; secured $3 million in funding; won a Webby Award; and secured partnerships with major publishers including Hachette and Simon & Schuster. Key quote: “With the rapid adoption of voice assistants in homes, classrooms, and work all across the world, coupled with the billions of dollars being invested by the largest tech players that are propelling the capabilities and accuracy of the voice interface, the opportunities are growing.”
At a glance: You can now speak to Alexa on your Amazon Echo by talking to Cortana on your Samsung smartphone. Amazon and Microsoft are making their virtual assistants compatible in the US as they take on Alphabet’s Google Assistant and Apple’s Siri. How well this new marriage of voice recognition technologies works will be open to users’ feedback.
August 22nd: BBC Good Food finds a use for voice and Alexa
At a glance: The publisher, part of Immediate Media’s remit and part of the commercial wing of the BBC, has opened up 11,000 recipes to Alexa users with a bespoke Skill. On the Skill, users can browse recipes by ingredients, dishes, diet types, meal speed, meal difficulty, meal prep time, cuisine, course and chef.
August 21st: How voice will impact on search and social media
At a glance: It would be fairly straightforward for any of the big tech platforms to add voice as an option, and this is an innovation we could see in the next twelve months. What also might happen is the emergence of voice-based social networks. The leading network at the moment is Hear Me Out, a Twitter-style platform that enables users to create and share 42-second messages.
August 17th: “Alexa, why is nobody voice shopping?”
At a glance: According to reports, there are about 50 million “smart speakers” lying around people’s homes. Only 2% of people who own a “smart speaker” have ever used it to buy anything. And of the tiny number who did, 90% never used it again. Is voice a victim of its own hype?
At a glance: Samsung has announced its entry into the smart speaker market with the Galaxy Home. It’s a high-end speaker that’s meant to go head-to-head with Apple’s HomePod, while standing apart from competitors like Amazon’s Echo and Google’s Home with a promise of higher-quality audio. Samsung said the speaker is meant to combine “amazing sound and elegant design” and will link up with all its smart home devices.
At a glance: Voice-activated technologies will facilitate the smart office, where IoT devices will transform how everything works. Office management will be transformed by intelligent devices, which will increasingly enable voice interaction for finding out basic facts about office equipment, including location, service status, who the users were and others. Voice technology will similarly transform customer service, training, data access, identification and authentication and nearly all aspects of IT. And it’s coming soon.
At a glance: Conversation design is the notion that experiences should be crafted for the interface itself, rather than treated as an afterthought, in order to match the success of visual interfaces. In short, voice is useful for giving quick, high-level information and pulling in detail only when prompted. Too much detail is its death knell.
At a glance: Home voice assistants or smart speakers are still in their infancy as a consumer and revenue proposition, but publishers are stepping up their efforts to hire and create content for them anyway, seeing the rapid adoption rate of Amazon Echo, Google Home and their kin and the fact that people are using them more over time.
At a glance: According to research firm eMarketer, the number of Americans using a voice assistant device is forecast to grow 129% to 36 million this year. The tipping point isn’t here yet but it’s coming…
At a glance: There are two distinct categories of players in the voice-controlled battle for the home. In the blue corner: big tech companies that want to gather more data about consumers to further their core business. In the red corner: more traditional speaker manufacturers like Sonos that need to add more intelligence, in order to keep up with consumer expectations. The winner is likely to be the former, not the latter.
At a glance: If the original Google Home was the speaker that proved Google Assistant is a worthy Amazon competitor, the Google Home Mini is the one that will get people hooked. The smaller Google Home has all the same smarts as its larger counterparts, but at less than half the price. It’s difficult to see how this doesn’t shake out as a win for Google.
At a glance: By 2022, 70% of American homes are expected to have smart speakers, dominated by Amazon Echo, says Morgan Stanley analyst Brian Nowak in a new research report on Alphabet. There will be 1.3 times as many Echoes as Google Home speakers in homes by then, he predicts, giving Amazon a huge advantage when it comes to the growing world of voice commerce.
At a glance: Voice assistants like Amazon’s Echo and Google Home Assistant are becoming a massive trend worldwide, so much so that Accenture predicts digital voice-assistant device ownership will reach one third of the online population in China, India, the US, Brazil and Mexico by the end of this year. In particular, India will soon boast 135 million users of voice bots, making it the world’s hotspot for this new interface.
June 12th: Get Alexa on Your Apple Watch With This App
At a glance: There are times when you might not be around your Amazon Echo to access its features, especially smart home features. Voice in a Can works on its own without your phone, so you could be out for a run and lock your doors or start your coffee machine. Basically, it gets you one step closer to that Inspector Gadget lifestyle we’re all working toward.
At a glance: Given the challenges that many people have had with the accuracy of Siri’s recognition, this more simplistic approach is a good fit for Apple. Essentially, you’ll be able to do a lot of cool “smart” things with a much smaller vocabulary, which improves the likelihood of positive outcomes.
At a glance: Amazon has launched a beta version of an interface called CanFulfillIntentRequest, which will enable Alexa to suggest the perfect skill for users when they don’t know which one to ask for.
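For a sense of what this looks like on the skill side, here is a minimal Python sketch of a handler that answers a CanFulfillIntentRequest. The field names (`canFulfillIntent`, `canFulfill`, `canUnderstand`) reflect our reading of Amazon's documentation for the beta and should be checked against the current reference before use. The intent name `GetHeadlinesIntent` and the function `handle_can_fulfill` are hypothetical.

```python
import json

# Hypothetical intents that this imaginary publisher skill knows how to serve.
SUPPORTED_INTENTS = {"GetHeadlinesIntent"}


def handle_can_fulfill(event: dict) -> dict:
    """Tell Alexa whether this skill could handle the intent it is asking about."""
    request = event["request"]
    if request.get("type") != "CanFulfillIntentRequest":
        raise ValueError("expected a CanFulfillIntentRequest")

    intent = request["intent"]
    answer = "YES" if intent["name"] in SUPPORTED_INTENTS else "NO"

    # Report on each slot Alexa sent, using the same answer for simplicity.
    slots = {
        slot_name: {"canUnderstand": answer, "canFulfill": answer}
        for slot_name in intent.get("slots", {})
    }
    return {
        "version": "1.0",
        "response": {"canFulfillIntent": {"canFulfill": answer, "slots": slots}},
    }


# Example request, trimmed to the fields the handler reads.
sample_event = {
    "request": {
        "type": "CanFulfillIntentRequest",
        "intent": {
            "name": "GetHeadlinesIntent",
            "slots": {"topic": {"name": "topic", "value": "technology"}},
        },
    }
}
print(json.dumps(handle_can_fulfill(sample_event), indent=2))
```

The point for publishers is that the skill itself declares what it can answer, so a user who never names the skill can still be routed to it.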
At a glance: A team of academic researchers has tested smart-home assistants Amazon Alexa and Google Home, finding it possible to closely mimic legitimate voice commands in order to open a rogue app instead of the legitimate one, thus hijacking the connection. In short, voice-squatting.
At a glance: A Portland family contacted Amazon to investigate after they say a private conversation in their home was recorded by Amazon’s Alexa and that the recorded audio was sent to the phone of a random person in Seattle, who was in the family’s contact list. In short, Alexa is still grappling with fundamental issues of privacy.
At a glance: There are some important considerations developers should take on board to overcome the unique challenges posed by delivering a positive brand experience in a voice environment. To add genuine value and avoid creating vanity apps, it’s imperative that businesses put the needs of their customers first.
At a glance: Many commentators have suggested that “Duplex” is not only strange but entirely unethical, and that it could signal an important moment in the acceptance and use of artificial intelligence.
At a glance: Virgin Trains has become the first train operator in the world to sell its tickets through Amazon Alexa. Rail travellers can now book Advance Single tickets and sort their trips with one simple voice-based transaction with payment completed through Amazon Pay. Rumours that the VUI suffers from delays are unfounded.
At a glance: ‘Sound-to-Meaning’ startup SoundHound has received $100 million in new funding from Chinese tech giant Tencent (and others) to give every brand a voice. And not just any voice: their own voice.
April 27 2018: Study finds Google Assistant is smarter than Alexa
At a glance: For the second year in a row a study has found that Google Assistant is ‘smarter’ than its competitors, generating the most correct responses to questions across the board. Somewhat surprisingly, second place goes not to Alexa, but to Microsoft’s Cortana.
April 26 2018: Amazon announce new skills for Alexa
At a glance: Amazon are developing new capabilities for Alexa, including an internal memory that will allow Alexa to recall previous conversations or information you have provided. They will also improve Alexa’s context capability, so that follow-up questions don’t have to be asked separately.
April 25 2018: Amazon launches Echo Dot for Kids
At a glance: Amazon have launched a new Echo Dot product specifically aimed at children, which comes equipped with new Amazon FreeTime Unlimited software. Amazon FreeTime Unlimited features parental controls, a bedtime limit, and child appropriate programming and responses. It also responds positively when requests include the word ‘please’.
April 24 2018: Amazon make their Alexa app iPhone X compatible
At a glance: Amazon have updated their Alexa iOS app to be optimised for iPhone X, including bug fixes and ‘performance enhancements’.
April 23 2018: Google update their podcast capability
At a glance: Google have updated their podcast capability for VUI devices, so that users can begin listening to a podcast on their phone and then pick up where they left off later with Google Home.
At a glance: LG have joined the ranks of the electronics companies whose devices can support both Alexa and Google Assistant, preventing customers from being locked in to one voice assistant.
16 April 2018: Adobe acquires voice interface platform Sayspring
At a glance: Adobe has announced that they have acquired Sayspring, a startup that helps clients develop voice interfaces for Amazon Alexa and Google Home.