Top news stories for Episode 6 (August 10, 2017):
1a) Does any aspect of the Google engineer manifesto episode, so far, hurt Google's voice technology aspirations in any way?
1b) Free speech issues abound, as both the manifesto itself and Google's act of termination in response are "speech." How important is free speech to the success of voice-first technology?
2) In more positive Google news, Google has rolled out a program where users can preview new features before they're rolled out to everyone.
3) Bionik Laboratories Corporation has produced a full-blown exoskeleton that is controlled by Alexa.
4) From The Wall Street Journal: The End Of Typing - The Next Billion Mobile Users Will Rely On Voice And Video
5) Tel Aviv-based Audioburst launched a new search engine for audio news.
6) Kim Komando provides a list of ways Alexa can be deployed in the kitchen, signaling just how deeply Amazon has penetrated the marketplace with the Echo ecosystem.
Panel for Episode 6 (August 10, 2017):
Dr. Ahmed Bouzid is Founder and CEO of Witlingo. Dr. Bouzid is also co-founder and Director of the Ubiquitous Voice Society, a non-profit organization dedicated to the mission of evangelizing the emerging voice interface, and author of two books on Voice User Interface design.
Lisa Falkson is an experienced designer of speech-enabled, multimodal applications on various platforms. She is also a member of the Board of Directors of Stanford Professional Women.
Jonathon Myers is CEO and co-founder of Earplay. Earplay has created a new storytelling medium with voice UI/UX, which harnesses the power of interactive conversation to connect audiences with characters they love.
Bradley Metrock: [00:00:09] Hi, and welcome back to This Week in Voice, Episode 6, for August 10, 2017. Our sponsor for This Week in Voice is VoiceXP, blazing the trail in voice technology. VoiceXP is taking the lead in developing Alexa skills for the best brands in the world. With VoiceXP, all you have to do is say it to revolutionize your marketing strategy.
[00:00:33] We're very pleased to be joined today by three awesome guests. The first is Lisa Falkson, an experienced designer of speech-enabled multimodal applications on various platforms. Lisa is also a member of the board of directors of Stanford Professional Women.
[00:00:54] Our next guest is Jon Myers, CEO and Co-Founder of Earplay. Earplay has created a new storytelling medium with voice UI/UX, which harnesses the power of interactive conversation to connect audiences with characters they love.
[00:01:15] Next: Ahmed Bouzid, CEO of Witlingo. The Witlingo team is conducting a workshop on August 16th entitled "Learn How to Design an Alexa Skill on Voice User Interface Design." People in the DC area can learn more by looking at ThisWeekInVoice.com - we'll have the link posted for the meetup. Also, the DC chapter of the Ubiquitous Voice Society is running a meetup on September 16th on "Voice First and Social Isolation" - we'll post that link on ThisWeekInVoice.com as well. So with that we'll get to the news.
[00:02:02] This has been a very interesting week, as controversy erupted with the Google Engineer Manifesto (as it will probably always be referred to) leak. And now it's turned into a big political deal with conservatives outraged that this guy was fired over free speech, and so on and so forth. Very interesting. Lisa, I'm going to start with you on this. Does any aspect of this episode - from the guy taking all the time to write that 10-page manifesto, to it propagating all over Google, to it leaking, to the reaction from the public, to the termination - does any of it affect Google's voice technology aspirations at all?
Lisa Falkson: [00:03:00] You know, I have two engineering degrees - one from Stanford, one from UCLA - and I've been subjected to this type of attitude since I was about 18 years old and really interested in math and physics. I don't think it's a surprise to anyone that there are folks who feel this way. I think what's happened is that Google has gotten some press about issues with pay equality before, and that brought these gender issues to the forefront for them. People resent when there seems to be an attempt to favor a minority if they're part of the majority, which this gentleman is. What's important to point out is that no one is saying we should go on some kind of crazy hiring spree and just willy-nilly hire women and minorities who are underqualified for jobs. What we are saying is that it's actually harder to find these candidates, and you should put some effort into that. There's a lot of data showing that more diversity in the workplace actually improves the product. There's data that shows having women on a board of directors makes a company more successful. There is enough real hard data to show that a company is actually better off with more diversity than if it's dominated by white males.
Jon Myers: [00:05:06] I completely agree with everything Lisa was saying. I think it takes a concerted effort to fight against the sort of biases that are out there. In some ways it's good that this discussion is happening; of course it's crappy that it happened this way. But it's important to bring this out into the light - talk about it and point out that opinions like those that were in the manifesto are wrong. I think the entire notion that an internal memo is protected by free speech is kind of silly. He didn't write a blog post; he wrote an internal memo that was meant to share opinions within the company, and that's a very different story.
Ahmed Bouzid: [00:06:03] I definitely agree with both Lisa and Jon. I'll just add that I think it goes to show how disconnected from the humanities and social sciences Silicon Valley engineers are ...
Lisa Falkson: [00:06:17] Oh, I agree with that, for sure!
Ahmed Bouzid: [00:06:17] ... because if you read that screed doc, there's a lot of hubris, a lot of bad thinking, a lot of ignorance. Clearly the guy has not read literature in the social sciences and basic philosophy. Let me just quote you from John Stuart Mill who wrote an essay called "The Subjection of Women" in 1869, more than a hundred years ago: "The legal subordination of one sex to the other is wrong in itself, and now one of the chief hindrances to human improvement; and that it ought to be replaced by a principle of perfect equality, admitting no power or privilege on the one side, nor disability on the other." This is in 1869, all right? So this engineer, this white man, was speaking on issues he doesn't know about. If he were to go and do some research - a lot of thinking and debate has been going on for a long time. Research shows conclusively that the black man and white man are equal, and the woman and the man are equal in terms of what they can do.
Bradley Metrock: [00:07:31] Moving on to Story Number Two this week, in a little bit more positive Google news: Google has rolled out a program where users can preview new features before they're rolled out to everyone. This is cool. I wish this were true across the board. Jon, what do you think of Google's new program?
Jon Myers: [00:07:47] It's always what sets Google apart from others - they like to beta test, they like to just get stuff out and enable people to use it as soon as it's past an alpha stage. I've always been a huge Google fan - I had one of the earliest Chromebooks, the Cr-48. So to me it's exciting because I get to try stuff. I think from the developer point of view it makes things a little more difficult, which is part of what it means to develop software for your new platform: to be able to deal with users who are accessing things a little earlier than others. So there's that perspective. But I think that's essentially how things worked out when it was Android and iPhone, right? With the iPhone platform they rolled things out at a later testing state. Android was always allowing people into betas, allowing developers early access to features, that sort of thing - so it kind of follows along with the Google pattern.
Ahmed Bouzid: [00:08:57] Yes, absolutely, I agree with Jon - especially with voice because it's a tricky medium and our collective level of experience is still in the early stages, and we really need to get things out. And we're learning quite a bit as we go. I think it's a smart idea; I like it a lot. Once it's beta, we can invite people to a simple link before that rigmarole you have to put people through to be able to test the skill. At Witlingo we serve enterprises and they like to kick things around. It also helps us in pitching - when we pitch to somebody we want to demo them a skill that is branded theirs and it helps us quite a bit. So I think this is definitely very useful.
Lisa Falkson: [00:10:00] I agree. And the nice thing about this is that there's sort of a willing beta group - a group of very motivated people who would like to see these features early. And I think that's your best audience. So if you find issues with what you're launching - with your best, most willing, most interested participants - then that really means there are issues and they should be addressed right away. I think it's a really good way of hitting stuff early with your most dedicated fanboy/fangirl type folks.
Bradley Metrock: [00:10:39] Our third story this week is about Bionik Laboratories Corporation, which has produced a full-blown exoskeleton that is controlled by Alexa. This is really kind of wild. And Lisa, since you have a lot of experience with robotics, I'll start with you. What do you think?
Lisa Falkson: [00:10:57] I think any time you start doing anything that's medical or physical, there's a lot more opportunity for error, and for error to be catastrophic. When I say that, I don't mean you shouldn't do it. But I'll give an example of when you should not do it. I saw a demonstration many years ago, when I was in grad school, from a company that was using speech for surgery. They had a robotic arm doing surgery and it was controlled via speech. The first thing I thought was: this is ridiculous, what if you have a missed recognition? It's a nightmare! Then I found out that there was nothing crucial that was being performed by the robotic arm - it had nothing to do with cutting or any of that - and that it was actually controlling the camera. And even when it controlled the camera, any time it heard any speech energy at all, it would pause to make sure it heard the command correctly before it would move. So there was no chance of moving the camera in any direction where it would tear tissue or go the wrong way. So when you're looking at these types of situations where there's a physical response that happens based on a voice command - and that could have some catastrophic results - then you just have to be more careful about making sure you're right, making sure your commands are correct and confirming them. It's a bit of a dialogue challenge, I would say, even more than a technical challenge - because recognition is just going to work how it works. But it's really dialogue-wise, how do you handle it? And then the integration problem between the Alexa device and the exoskeleton. I know for example they're saying you have to be near the device because it's not integrated in any way with the skeleton itself, so what if you move out of range, and things like that. There are just a lot more considerations you have to take into account when you're designing this way.
Jon Myers: [00:13:31] I'm fascinated by this. It's interesting that voice is being put inside just about everything I can think of. I feel like it's moving along a little faster than what it's capable of - because I feel like in front of this we need to have voice differentiation and a little bit more privacy. All I could think about when I read that article was, what if someone besides the wearer just walked up and started trying to control that suit? How confusing that would be! You know, we talk about how it's so funny when we're talking and Alexa hears us. It's cute, and we turn off the mic. Or maybe the TV triggers something. Well, I look at that suit and I'm like ... ?? Now, I think that the people who are creating that are not going to commercialize it until it's ready. But what I'm getting at is that the voice differentiation, voice ID-type technology needs to come along before things like that are possible - before personal calendars inside an office work well - enterprise sort of things. So anyway, that's where my mind went - and it ties into what you're saying, Lisa, about how we need to design scenarios so that only the wearer, for example, would have control.
Lisa Falkson: [00:14:48] Yes, absolutely. Absolutely. And that's authentication, identification, and verification - so I know who you are, and I agree that you are definitely you - I've identified you as yourself and you've done some additional verification step to tell me you are authorized to do this action.
Ahmed Bouzid: [00:15:11] I definitely agree. In situations where the consequence of an error is a lot more than simply an annoyance, it's imperative to ensure that the error rate is very, very low. And it's always like that. The last 10 percent of anything is the hardest work - as much work as the first 90 percent was. There's a long way to go to get to the point where you can do these things safely.
[00:15:42] But I think there's definitely a subset of these cases where error would not be catastrophic. And then there's the voice ID - like Jon was saying, it can't be where anyone can control it. The promise is there, and it's great that these folks are tinkering to solve these kinds of problems.
[00:16:31] And I think the excitement in anything like that - whether it goes somewhere real or not, or becomes a product or not - is the set of problems they will tackle, surface, and try to solve. That's what innovation is all about - it's about solving problems in some innovative way.
Bradley Metrock: [00:16:49] Yes, I think it really underscores a big theme of voice technology, which is accessibility. Voice technology is opening doors in so many ways, and this is another one. It's exciting.
Lisa Falkson: [00:17:05] Absolutely. There are cool teams both within Google and Amazon that just deal with accessibility, down to the point of the sadly failed Fire Phone which had haptic feedback; you also have that on the Amazon shopping app. It's not just for speech; it's across the board. I think a lot of people are looking at accessibility. But speech to me is one of the best things and easiest things that we can do. There are videos that show people who are blind that can do their shopping on Amazon. Having the primary modality be voice immediately opens things up to whole groups of people who can then do certain things they couldn't before. It's a pretty amazing step. Not just handicapped people, but also children with certain types of learning disabilities and autism relate very well to speech systems, and have really improved their vocabulary and ability to speak or relate to people through use of Siri, Alexa, Google. And I just think that's amazing.
Ahmed Bouzid: [00:18:50] And I think the category of people who have accessibility issues is not a static one - meaning you or I or anybody who's listening could become passive by breaking a bone, and all of a sudden they can't move. And they have to learn voice. So I think accessibility is huge, and I'm glad you raised it. It's not spoken about as much as it should be when it comes to voice. But it just opens up the technology to people who are either permanently or temporarily incapacitated.
Lisa Falkson: [00:19:30] Yes, and one thing that's so great is ease of learning. You know, people need to be taught how to use an iPhone - they usually have to play with it a while, learn how to use that tiny keyboard with their fingers; my mom can never really get the hang of it. But you don't need to be taught how to speak. We all know how to speak. Children speak before they type, before they read. And I think that's what's so special about this and makes it so accessible - to children, and to the population of people who don't know how to read or write. This enables them to do certain tasks, and I think that's great.
Bradley Metrock: [00:20:32] Most people consider accessibility a great thing, so I look forward to that former Google employee writing a manifesto about how he doesn't like it. (I had to throw that in there.)
Moving on! To Story Number Four. This one from The Wall Street Journal - fascinating article here and very well circulated - "The End of Typing: The Next Billion Mobile Users Will Rely on Voice and Video." Really interesting piece. Ahmed, what did you take away from it?
Ahmed Bouzid: [00:21:13] So, as Lisa was saying just now, voice is the most natural interface; I think it's the only interface ever that we didn't invent - meaning it's not something artificial that was put together by engineers and that we have to learn. It's an interface that has to bend to us. We need it to speak naturally, take the burden, take care of things, and come back to solve things for us and do it in a way that is usable. I'm glad that for right now we're going away from having to learn an interface, and the progression has actually been growing for a long time, ever since the Mac came out, making things more and more usable. This is the next iteration, the next level. I think what's happening is a concept called the Gutenberg Parenthesis. I was introduced to this by a professor from the University of Southern Denmark, named Thomas Pettitt. If you look at the span of humanity's existence, the printing press represents a tiny fraction of our existence in terms of communication. We communicated orally for a long time, and only in the last 500 years have we had a text that was static, that was written, that people could read, and that would last for generations.
[00:23:02] Pettitt's Parenthesis draws a parallel between the pre-print era and our own Internet age. If you look at the way teenagers today interact with each other, they use a lot of emojis and their verbal transactions are very truncated. And they use videos. Communication is not through written text sentences and so forth, but through these diffuse audio/verbal interactions. It's interesting how voice is coming in and adding merit to the argument that we're going away from the Gutenberg world - where there was text, and there was one centralized source of truth - and toward something that is a lot more diffuse. I think it has big pluses opening up for people who were blocked before because they were not literate.
[00:24:09] But there's always a dark side - the dark side now being, what is truth? What is fake news and not fake news? You don't have one newspaper of record that everybody agrees is the place where facts are facts, and scientists are right now being challenged about global climate change. So I think there's a dark side to it.
Lisa Falkson: [00:24:32] Have you guys seen any of the studies about how entering things by voice is now 2.8 times faster than via text, just because the accuracy of speech has increased so much? I think it was a Duke University study. We certainly know that talking is faster than typing. But it used to be that the correction that had to be made via speech made it slower. Well, now the accuracy is so good that typos are actually more work to correct than speech recognition.
[00:25:21] One thing that Ahmed said just now, about people using emojis - it's actually kind of hilarious to think we're just creating something in text to give emotion that voice already has. So when I express love, or anger, or sadness, or whatever - that stuff is already encapsulated in speech. Emojis were created to fill a deficiency that exists in text because there is less meaning.
Bradley Metrock: [00:26:17] This Week in Voice is sponsored by VoiceXP, a company based in Missouri that's known for producing the highest quality Alexa skills and the highest quality voice applications around. Whether you're working for a big brand or a big company - a Fortune 500 entity - all the way down to a small or medium sized organization, or even a startup - or maybe you're a solo entrepreneur - VoiceXP is the company you should seek out for guidance in how to navigate the waters of voice applications. They're amazing folks. They've been fantastic partners with VoiceFirst, and fantastic partners with This Week in Voice. I can't recommend them enough. Reach out to Bob Stolzberg. He will help you. And you'll be really glad that you did.
Bradley Metrock: [00:27:14] Moving on to Story Number Five. Tel Aviv-based Audioburst launched a new search engine for audio news. This is an interesting one. Jon, what do you think about this search engine for audio?
Jon Myers: [00:27:27] In some ways I feel like Alexa and Google Assistant and Cortana are the browsers of voice these days - kind of a place where you do your searching and get information that's connected by web services. This is proposing a way to do that across services and to reach out and get information that's on the Web already, outside of what's in the voice services. So it could be a way to break down some walls there. How that's going to come about is really tough to tell, based on the article and what we've seen from Audioburst. But it's a really noble idea too, because I think that we'll be able to accomplish a lot more when we can stretch across the entire Internet with voice.
Bradley Metrock: [00:28:17] Sure. It's interesting to think about it from a podcast standpoint; I was thinking about this just the other day. Podcasting is getting more and more popular. And yet the content that's generated by this movement of content creators is not searchable at all, unless you've got good transcriptions that are produced and then made available on the web. You've got to be able to parse that content just like everything else. Especially if we're moving into this voice-first era. I find it fascinating, too. Ahmed, your thoughts?
Ahmed Bouzid: [00:28:54] I just love what Jon said about piggybacking on the concept of browsers. I think it's absolutely the right concept to frame Alexa and Cortana and Google Assistant as browsers; they're platforms where people can share experiences. It happens to be that we're not there yet in terms of one standard markup language, but I think we may get there at some point. Maybe they'll have markup language running on the platform and people will build them using tools. And I like the idea of a search engine across all these platforms that focuses on content, not on the delivery mechanism.
Jon Myers: [00:29:36] I'm quite sure they're thinking about this very carefully. But you're right - sometimes it takes a startup that's nimble and can move quickly and deploy resources faster to get things done, to break new ground. And I think you're right: the nature is that that will either be absorbed, copied, or bought in one way or another. But I think it's interesting to see that it's happening already - I really didn't expect to see something quite like this. And who knows how far along it actually is? But I'm pleased and tickled to see that there's already some movement in that direction.
Ahmed Bouzid: [00:30:18] I tried it out a couple hours ago. It's definitely something I'm going to use, which is not something that I say about just any skill.
Lisa Falkson: [00:30:26] Wow, that's a pretty strong endorsement.
Ahmed Bouzid: [00:30:28] You should try it out. I said "news about Kenya" because there was an actual election in Kenya. So I got five stories and listened to them, and they were spot on. And then I said "Rachel Maddow" and I got five returns from these blowhard right-wingers raining hellfire on Rachel Maddow - which was interesting because the first ones were topics in which she was mentioned. So what they're doing is transcribing all these stories and indexing them, not only if the person speaking is Rush Limbaugh, but also if Rush Limbaugh is mentioned, or the topic is Kenya or Venezuela (that's another one I asked for). I asked for a few more, and they all came up and they were all things that had been posted an hour earlier. So I'll definitely use this thing, for sure. I think what they have is real.
[00:31:20] And I think that this is just the beginning of something amazing. By the way, Google has been doing some of this audio mining (that's what it's called, audio mining) for a long, long time. But they're sluggish and they have no focus more or less, aside from their main business which is churning money - their AdWords stuff, the Google Glass - Google is just disoriented. They want success. I think somebody will get into this pretty soon; they know for sure - for SURE - there's a huge business in audio search.
Bradley Metrock: [00:32:01] By the way - today, tomorrow, or sometime in the very near future - the podcast will be searchable and available through Audioburst, which is cool. I've been talking to those guys and got them the feeds and everything. Happy to be working with them.
Ahmed Bouzid: [00:32:19] I'm going to try to find programs that mention Lisa Falkson.
Bradley Metrock: [00:32:24] Well then, this is going to make it a little bit easier to do that.
[00:32:29] We'll move on to the final story of the week: Kim Komando provided a list this week of how Alexa can be deployed in the kitchen. And this is newsworthy in my mind, for the simple reason that when I think of people who are subscribed to the Kim Komando email list - my mother and several other people I know are, and I wouldn't categorize them as early adopters of technology (in many cases they're not late adopters, either). So I look at the landscape of voice and every time you turn around you're seeing something else happening that indicates this is here to stay. This has penetrated so deep into society and culture that it's here to stay. And it validates the premise that there's no turning back. Lisa, I'll start with you: do you feel that way?
Lisa Falkson: [00:33:31] I love this description because I read a stat a while ago that 51% of people have their Echo in the kitchen. And then from there it goes to bathroom, home office, and the bedroom I think is the least. I had no idea, for example, that there are 400 skills in the food and drink category alone. Now, this makes me want to go to my kitchen and mess around with them. But the great thing is not just the integration with certain devices, which I knew - like coffee brewers, the lights in your kitchen, things like that. And I've always used it for timers - you know, when you want to have multiple timers like your oven, your microwave, and your hands are all dirty. But the depth to which people have used this! Like OK, I'm standing in the kitchen - what do I need other than timers? Measurement converters - for example, counting calories is just terrific: how many calories are in this four ounces of salmon, or whatever. Asking a bartender how to make a certain drink - to me that sounds amazing. And we know shopping lists, timers - but one thing that really appeals to my altruistic side is the "Stop Wasting Food." The food scale. How do you store this vegetable, or how do you know that you're not spoiling your food by storing it the wrong way. I think it's not only just a nice list to have - but it shows that there are so many things you can do on an everyday basis. I'm sure there are people who just think of Alexa as "the media device" - like "I listen to music and maybe I ask some FAQs, like how tall is the Eiffel Tower, or my kid asks me for their school paper, or whatever." But to see that there is this depth - of even just food and drink type skills. It's really nice. Again, as I have this in the kitchen it makes me want to go and look up other things I can use Alexa for in the kitchen. And what you said about Kim Komando - she's not really an early adopter or a late adopter, she's very mass market.
Also the Alec Baldwin Echo commercials - make it something that people relate to, with someone very mainstream. I think that's a positive thing - it shows market penetration and that it's become a common thing. If it were something that no one had ever heard of, then Kim Komando would not be talking about it at all. She doesn't have to introduce the technology and what it's about. Instead, she's assuming that people understand and she's talking about application.
Jon Myers: [00:37:13] When Earplay got going in 2013, we were just something on a mobile device, a way to play with your voice. But we saw that people were using it, and we went and asked in surveys - what, where, and how are you using this? And we were kind of shocked at just how many people said they used it while they were cooking. Then we started thinking about this, and thinking about when you listen to podcasts and that sort of thing. It's interesting that it's a place where you can fill up the time. And I thought that was really fascinating. When I read the Kim Komando article, I thought YES, that's earth-shattering, something that is coming - that the kitchen can be a place to accomplish things with your voice but also a place to have some multitasking entertainment - you can listen to your podcast and other things while you're trying to set up what you want to accomplish in the kitchen.
Ahmed Bouzid: [00:38:02] On that point - voice enables you to do things eyes-free and hands-free, or when you're in a place where your eyes are busy and your hands are busy and it's dangerous not to pay attention - that's why voice is so compelling in the kitchen.
Bradley Metrock: [00:38:37] Sure. I've seen two scenarios in which Echo devices have started in one location in a home and then moved around a little bit - and then the moment they reach the kitchen, they stay there and they never move. When my parents bought the original Echo - they were the first people I knew that had one - they had it in a couple of different places in the home and then it hit the kitchen and boy, that thing never moved. It's been there ever since. And then in my own house - I bought the Echo Show just for my own purposes and had it in my office - then needed to move it to the kitchen one day to do the drop-in calling thing. And it also has never moved. And speaking to the point of multitasking - my wife just told me the other day she loves having that thing in there so she can read the news that flashes up there while she's getting ready for work in the morning and so on and so forth. So it's very interesting the kitchen is calling out for this type of device. And I think you're right - this is foreshadowing a lot of what's to come.
Bradley Metrock: [00:39:52] Greatly appreciate the three of you setting your time aside and sharing not just your time but your expertise and your insights with us. For This Week In Voice, Episode 6, August 10th 2017 - thank you for listening. And until next time.