Top news stories for Episode 2 (July 13, 2017):
1) Amazon contemplates releasing user data to developers in the form of voice transcripts, triggering new privacy debate
2) Google clashes with Amazon in voice assistant "price war" around Amazon's annual "Prime Day."
3) Google launches Gradient Ventures, an investment fund to "provide capital, resources, and education" to "AI-first" startups.
4) Analyst report: Siri usage drops as Alexa and Cortana use rises.
5) Samsung and Galaxy S8 users do battle over the "Bixby Button."
6) Best Buy shares decline 7.5% on the news of Amazon launching its own version of Geek Squad.
7) Motley Fool openly wonders why Microsoft is releasing a "me-too" smart home speaker that doesn't appear to stand out.
8) Welcome to #VoiceFirst government: Mississippi and Utah among first to roll out Alexa skills to serve constituents
Panel for Episode 2 (July 13, 2017):
Dr. Ahmed Bouzid is Founder and CEO of Witlingo. Dr. Bouzid is also co-founder and Director of the Ubiquitous Voice Society, a non-profit organization dedicated to the mission of evangelizing the emerging voice interface, and author of two books on Voice User Interface design. (Dr. Bouzid's recent article on discovery of voice skills, referenced in Episode 1 of This Week In Voice, is here.)
Brian just published issue number 6 of Multiplex Magazine called The Enchanted Loom. He explores a new AI concept for Voice First systems called Artificial Understanding. Get the Read Multiplex App at the iOS store and subscribe for this and the entire catalog of magazines.
Lisa is an experienced designer of speech-enabled, multimodal applications on various platforms. She is also a member of the Board of Directors of Stanford Professional Women.
Bradley Metrock: [00:00:09] Hi, and welcome back to This Week In Voice for July 13th, 2017. We're very pleased today to have three phenomenal guests. We'll start with Lisa Falkson: Lisa, say hello.
Lisa Falkson: [00:00:23] Hello, nice to be here.
Bradley Metrock: [00:00:25] Lisa, thank you very much for being here and joining us. Lisa is an experienced designer of speech-enabled multi-modal applications on various platforms. She is also a member of the board of directors of Stanford Professional Women. Lisa, do you mind taking a second and explaining to me and the audience what that is?
Lisa Falkson: [00:00:42] Sure no problem. So Stanford Professional Women is really an organization for sharing and networking between various different alums, staff members, and faculty and so we host events throughout the year - usually one to two per month - and they vary from large speaker events to smaller networking events. And the idea is a lot of younger alums coming right out of school needing that professional support from other women in their industry and helping them along with their careers so we facilitate that.
Bradley Metrock: [00:01:17] That's really cool. Thank you for sharing that with us and thank you for sharing your time with us today. Brian Roemmele is our second guest - Brian say hello.
Brian Roemmele: [00:01:26] Hello. Glad to be here, Bradley.
Bradley Metrock: [00:01:28] Thank you very much for joining us Brian. Brian, I just subscribed to the Multiplex app for $29.99. It is phenomenal. If you're listening to this podcast you need to stop what you're doing. Hit pause - hit stop for all I care - go to ReadMultiplex.com. Download the app, purchase a subscription...you'll be glad that you did.
Brian Roemmele: [00:01:51] Thank you Bradley. I want to let everybody know I just put out an Android version it's in early alpha stages so it's a little little shaky but give it a try. But thank you.
Bradley Metrock: [00:02:01] Sure. Our third guest is Dr. Ahmed Bouzid - Ahmed say hello.
Dr. Ahmed Bouzid: [00:02:05] Hello Bradley How are you?
Bradley Metrock: [00:02:07] Doing good. Ahmed is CEO and founder of Witlingo which just this week rolled out the ability to turn your Facebook page - if you're any business any organization on Facebook you can turn your Facebook page into an Alexa skill. You need to stop what you're doing again. You hit the pause button a few minutes ago to go get Multiplex - you need to hit it again. Go to Witlingo.com. The first 100 businesses to sign up have all their fees waived to get their organization's Facebook page turned into an Alexa skill. You need to go do that.
Brian Roemmele: [00:02:40] I've got to say gentlemen, and Lisa, this is a phenomenal, phenomenal skill. If you have a business and you do not have a voice enablement of it, using Ahmed's amazing system, get it right now. You'll really, really thank me.
Dr. Ahmed Bouzid: [00:02:57] Thank you. Thank you. I appreciate it so much. Thanks a lot.
Bradley Metrock: [00:03:01] Cool Ahmed. And my name is Bradley Metrock, I'm CEO of a company called Score Publishing based here in Nashville.
Bradley Metrock: [00:03:07] And with that we'll get to the news.
Bradley Metrock: [00:03:10] So our first story for this week is that - and this is added at the very last minute, I might add - is that Amazon is contemplating releasing user data to developers in the form of voice transcripts for how people are interacting with Alexa and what they're saying, triggering an entirely new privacy debate. So Lisa we'll start with you.
Lisa Falkson: [00:03:33] Sure.
Bradley Metrock: [00:03:33] Is this something that Amazon should be doing? Or are we all missing the boat on larger privacy concerns with this?
Lisa Falkson: [00:03:40] Yeah I don't really see an issue with it. When I first participated in the early betas, I was actually at Amazon and working on the Echo system. You know people were very concerned about privacy and they thought oh people aren't going to want to have an always-listening device. They thought that that would kind of be a deal breaker for folks and when it really launched it ended up not being a big deal at all. And so I think the advantage we're actually going to see from having this important data available to developers is going to outweigh the cost to the user. And in reality it isn't a huge change on net. All that data was already being stored anyway, it's just a matter of how it was available, and who it was available to.
Brian Roemmele: [00:04:30] You know Bradley fundamentally what we're talking about is a larger issue ultimately and I think what the reporters and a lot of the red-herring sort of issues that came up were more based on this fear of people listening to you all the time. And that's a very real fear. But it's also part of a larger situation about context. And that means as these systems develop into true personal assistants - that's where they really have deep context about who you are, where you've been, what you eat, where you go, what you shop, all these types of things - to make them really useful, it's a great payoff to have this. We are going to have to have this, what I call the decision of this epoch, on who gets to control that information. Who gets to audit that information? How is it secured? Etc. Because I mean we've given up a whole lot with social media and Gmail. This is orders of magnitude more complex and more impactful for that individual if that data was to be released, or to be taken by rogue elements within organizations, even within governments in some parts of the world. It can be very dangerous for a person. So that's the greater issue.
Brian Roemmele: [00:05:42] The small issue is this is utterances - these are phrases that people speak to the app or to the skill. And for the developer to really fine-tune under these current developer tools that are out there - in the future this will be different - but under the current developer tools, we need to know what you're saying to the app so that we can fine-tune it to better serve you. And so all we're getting is different ways of saying read a book or read me that book or you know all these different variations. So it was really that kind of thing. And I know Ahmed did a lot to try to clear this up. You went out and sort of straightened a few people out, huh?
Dr. Ahmed Bouzid: [00:06:25] The bottom line really is that number one Amazon and actually Google and Microsoft have already been doing this for a while now, since their SDKs came out: they have been sharing the information, number one. Number two, it is not, as Brian said, it is not everything that you say - it is specific to the skills. And imagine building a mobile app and releasing it and not being able to tell what people are clicking on, what buttons they are pressing, what they are filling in with drop boxes, what items they are choosing, so that you can iterate and iterate and deliver a much, much better experience. So I think there's a lot of education - educating that needs to be done in terms of ensuring that the stories that are out there are accurate. But I think it's true also that there are concerns that we need to keep top of mind as we move forward with these interesting interfaces.
Brian Roemmele: [00:07:26] I got to jump in. Ahmed, how do you feel about maybe this whole industry trying to get together about understanding what the privacy profiles are going to look like? Whether or not we're going to just open it up for everybody's context to be bought and sold to advertisers? Or whether or not your context has to be existentially and in reality held in a highest level of security that we haven't seen before in this industry? Do you buy into that?
Dr. Ahmed Bouzid: [00:07:55] I definitely do. I think that there has to be some kind of a protocol or some kind of a baseline agreement on what we are going to do with that data. There has to be some kind of a warm and fuzzy between, you know, these interfaces and the people who are using them, only and mainly because the voice is such an intimate interface and people are going to talk to it in a way that is probably not as cold or as functional as they do with, you know, something where you push buttons and drop-downs. So I don't know if we can see the giants coming together around the round table and coming up with a W3C protocol. But I do think that if we are going to have, you know, an interface where people are comfortable talking to it, there has to be at least a set of guidelines or a set of agreements with the user about this information: where it's going to live, how long it's going to live, what these big companies are going to do with it, and so forth. I think that that issue is going to come to a head and it's going to come to a head because of the intimacy that we have with language. And I think what we're seeing right now this is called hysterical, sort of hysterical, but it's an indication of a real thing because people do take language a lot more seriously in terms of intimacy than they do with the other more common functional, purely functional, interfaces we've engaged with so far.
Lisa Falkson: [00:09:28] I mean I think what both these guys have said is very valid. Voice is somehow more personal so saying that I'm gonna share with you snippets of my audio or words that I've spoken somehow seems like a much more personal thing than it is to say oh I've clicked boxes, you know, "A" out of "A through D," and you know I've selected this from a dropdown menu. Somehow that even though we might be actually doing the exact same thing in terms of what they're giving an application or a skill. So I think it's sort of a mental and emotional attachment that we have to our voice and to what we say that makes it feel more private than actually inputting, even inputting via text. I mean we enter things via text boxes all the time and we don't worry about that information being saved somewhere - I think it's sort of a visceral thing.
Dr. Ahmed Bouzid: [00:10:31] Like Gmail, like all our life is in Gmail, and nobody is bothered by that.
Lisa Falkson: [00:10:36] And they're parsing the e-mails and giving you know suggested responses even...I mean that could be considered an invasion of privacy.
Brian Roemmele: [00:10:43] So you make an interesting point Lisa and I don't want to belabor too much but you're saying that you're feeling the existential threat - the psychological threat - of the actual sound of people's voices and maybe the background context versus the raw text that came from that voice shared with the developer. Is that where you think the line is drawn?
Lisa Falkson: [00:11:01] Yeah I mean I think that just it feels more personal to give...
Brian Roemmele: [00:11:04] Makes a lot of sense. That's brilliant.
Lisa Falkson: [00:11:06] ...actual audio than even transcriptions or written text.
Bradley Metrock: [00:11:15] How much of a concern can you have when you're selling thousands of Dots a second...Brian you mentioned that I think I saw on Twitter you were saying thousands of Dots sold per second, like yeah, please, tell me how much privacy matters to you.
Brian Roemmele: [00:11:28] Yeah.
Bradley Metrock: [00:11:30] But one thing that would be cool though is if developers can see your transcripts on how you talk to Alexa, it would be great if the user could see it too. You know I mean the user can see their Google searches. The user can see different information. As a user of an Echo Show I sit there from time to time, and I look at it...I think "what are you capturing?" Are you capturing every word? You sort of look at it with suspicion sometimes and it might be good if they provided that data to all users otherwise they just...the biggest crime they could commit is having stuff that doesn't work so they need to just turn everything over to developers that they need.
Dr. Ahmed Bouzid: [00:12:12] Yep.
Bradley Metrock: [00:12:12] Story number two: Google clashes with Amazon in voice assistant price war around Amazon's annual Prime Day. So there's a lot, lot of meat on the bones here. Amazon has a real juggernaut on their hands with Prime Day and Google's gotten in on the action with a bundle. Brian I'm going to throw this to you since I know you follow this probably more closely than anybody: what should we, the layman, and what should people in the tech industry, take away from what took place - the end results on Prime Day?
Brian Roemmele: [00:12:50] Wow, there's a lot to say about it, but I'll try to be really concise. We are in the midst of the very early days of the VoiceFirst revolution and it is a revolution in the sense that it is overturning a lot of the prior assumptions and how people are going to interact with what we call computers. What we call computers have fundamentally changed, when we went to glass and typing on glass and the mobile revolution - this revolution is no different. And you are absolutely correct. There is a demand. I mean this is...let's get a perspective here. Amazon's one of the largest retailers on the planet and on holiday season 2016 the best selling item was Echo Dot. Prime Day - an invention of Amazon which has now gotten its momentum to such a point where some reporters and some analysts CNBC reported that billions of dollars were lost in productivity because of people going online to see what's going on on Prime Day. And yeah it's phenomenal and it's brilliant on Amazon's part and it coincides with back to school, by the way. That's the folklore of how it started. In dog days of summer, people aren't shopping, let's make a new shopping day. It worked brilliantly. But now, Echo Dot - largest selling item in Amazon's Prime Day. And not only at Prime Day but they started Prime Day on July 5th, and it was called Alexa Day or Alexa Shopping Day. And so anybody who had a voice-first system, an Alexa system, they were able to get early deals and some of those deals were phenomenal. So Amazon's going full out on this. And while most of the other tech companies are calling them 'smart speakers' and just kind of saying it's for music, there's a guy up in Washington kind of laughing - that big Jeff Bezos smile - saying yeah you keep thinking it's about music meanwhile I'm selling a lot of these. And let me tell you: you're not buying an Echo Dot for music. The speaker is not high fidelity. It's ok, but it's not great. 
You can do better on your speakerphone on your mobile. So and yes of course you can Bluetooth it. But my point is people are fascinated with this. They're not just using it one time like a novelty and they're exploding use cases and we're going to start seeing that. Nobody's got real hard numbers publicly. I can tell you that a few people I know have ascertained some great insights. Some insiders have told me yes it did peak at thousands per second. The numbers go as high as millions sold. I think that's a little blue sky but I would say we're approaching maybe a million or so sold during the whole Prime cycle, from the beginning on July 5th to the end on July 12. And so it is clear that we are moving forward. Now what does that mean for Google?
Brian Roemmele: [00:15:43] It means for Google they have to understand what exactly is driving the psychology. And one of the things that is driving it is voice commerce and whenever I say that to people they still scratch their head like when I said web commerce or mobile commerce years ago. It's like it's intangible, I can't touch it and feel it. But you have an Echo Show - right, Bradley? - and you can actually see it when you need to see it. A lot of times you don't need to see Scotty paper towels - you just order it. You know there are certain things and you go down the Maslow tree of hierarchy of buying...after a while you're going to see that almost 80 percent of what you want to buy you really don't need to see. Especially things like food ordering which is going to explode in the next year or two and I can tell you there's some very large quick service food companies that are going to fundamentally change the way we order food. We're never going to use our thumbs again once we can say - well, I hope it's not "order a Big Mac," I don't want to be negative here - but "order me something really healthy" and your personal assistant orders it and it's ready and it knows when you're going to pick it up or it's going to be delivered. It's going to be phenomenal. So the big deal. The Prime Day - ground zero - Echo Dot. I'm told that in the top 10 was Echo itself which was cut by almost 100 dollars and it was a steal. So the Echo Dot and the Echo original are selling like hotcakes even post Prime Day. So we're going to see a lot of use come from that if there's thousands of these.
Lisa Falkson: [00:17:14] Yeah I mean I think...One, I was really impressed from the very beginning when the Echo launched in terms of the adoption. Just the idea that there would be a voice-first product like this without a screen you know really just with minimal cards and that it would be that popular. And I think some of the things that Brian brought up are really true. There are things that you absolutely do not need a visual experience for, and I think we're realizing that even as consumers. When I am ordering cat litter, you know reordering anything that I've seen before, and I already know...it's really only in the cases where you're browsing for something that's extremely visual. Say it's clothing or jewelry - something that has a strong aesthetic. Other than that I think being driven by shopping is an interesting point. There is some good data - and I can't quote it exactly, so I won't try and be wrong - that shows in households where there is an Echo device, spending on Amazon is boosted obviously because it's that much easier. It's another channel into Amazon's marketplace. And Google really doesn't have that aspect to it. I mean how many people do you know on a daily basis that use Amazon versus do any sort of Google Shopping? It's a very different market. So I think that Amazon's done a great job of pushing not just the Echo itself but all of the cases where you could use it to order things. They put that in the commercials and so on and so forth. So it's not just for asking trivia and listening to music, although it does a great job for that too. It's also a device to use for shopping. And it's a very effective one.
Dr. Ahmed Bouzid: [00:19:23] Yeah. So my number one observation is I love competition. I'm very glad that even giants have to compete against each other. So I hope that...so now I'm cheering for everybody else other than Amazon. I love Amazon, I was there and so forth, but I want the competition to survive and thrive. Number one. Number two, I don't know what Google is...why did they go ahead and play the game of Amazon? I think they should just have kept their cool and differentiated themselves and not cut prices and just sold on a value proposition. I think they definitely have a lot to offer and I think they need to leverage all the properties that they have. Everybody uses Gmail, everybody uses Gmail calendar, Google calendar...so they have a lot of assets they need to leverage. I think they just need to calm down, come up with a strategy - Amazon's strategy has been very clear from the get go. They're not in the business of selling hardware. They're selling the hardware to get market share in terms of usage, in terms of promoting their baseline business which is commerce. Google is about something else. I think they just need to sit down, reflect, come up with a strategy, and pursue that strategy doggedly and in a way that is as focused as Amazon is pursuing theirs. So it's going to be interesting to watch how all of this unfolds.
Brian Roemmele: [00:20:47] I got to regroup and echo what both Lisa and Ahmed said about not only Google but Apple and Microsoft. All of these companies have fundamentally strong platforms that they can build around. Commerce is going to be playing a strong part in everybody's platform at some point. But you really need to pay attention to what your users could really use it for. And I think part of the problem is people that are coming up with some of these ideas are overengineering it. They're getting overly concerned with you know that's not what I would do I would just check my Gmail and Ahmed made a perfect point. You know your Gmail account could be ground zero for an incredible revolution inside of Google. It just takes listening to the right people because I know they have some amazing people inside of Google. I just hope that they listen to their voices and I mean that really literally because they could actually make a renaissance inside the company. And the same is true with Apple and Microsoft. I think the problem and I don't want to hurt anybody's feelings that's running these departments. The problem is just like when Steve Jobs walked into the Palo Alto Research Center and said guys you got a product ready and they said no...those engineers would have spent another 20 years fine-tuning it. Steve just took it. Ran with it. And we're living with that and I think that's what needs to go on inside of Google. That renaissance needs to go inside of Google Apple Microsoft and even in Samsung to some degree.
Dr. Ahmed Bouzid: [00:22:16] Oh yeah.
Brian Roemmele: [00:22:17] And to finalize: what Lisa said, yes it's 10 percent lift, approximately, when somebody has an Echo device and is shopping on an Echo device. So it's a very real lift in transaction volume that Amazon's experiencing. In fact in the latest posting I did with ReadMultiplex.com I posted an article where I pointed out that Amazon can start giving away Echo Dots for free just by the fact of the lift that they get from transactions so keep an eye out in the future. These devices will be free at some point.
Bradley Metrock: [00:22:50] It's hard to see what Amazon did on Prime Day as anything else except a resounding exclamation point success for voice technology all around. So that's great.
Bradley Metrock: [00:23:02] We'll move on to story number three which is that Google has launched something called "Gradient Ventures," which is an investment fund to quote provide capital resources and education to what they call AI-first startups. So Ahmed I'm going to start with you. Also I should mention that Toyota launched another fund similar, with similar scope to this, but I intentionally left that out. Google - this has been talked about - some people are upset because the investments that Google makes with this investment fund are going to stay on their balance sheet. I think that's probably trivial but Ahmed how do you especially being founder of a startup in the space..what does this signal to you? What do you take away from this news story?
Dr. Ahmed Bouzid: [00:23:55] They would do a massive service to the whole space if they didn't think like coders and thought that the world revolves around coding or around technology. I think they need to think about how do we build an ecosystem for the conversational, the voice-first conversational interface which should attract talent across the board. Not only engineers but you need to have anthropologists, you need to have people who know how to craft language, and you need to have people who know how to test for voice. There is a whole ecosystem that needs to be built. And I would love for somebody who has money whether it's Amazon whether it's Google or whether it's Microsoft to start thinking holistically. I mean if you look at Amazon for example and they do a hackathon after hackathon after hackathon after hackathon...I've shared this concern with the people at Amazon since we are a partner of theirs and so we get to speak our mind: please don't call them hackathons because when you call something a hackathon anybody who is not a hacker is not going to show up. Call it something else. Be more inclusive because you will need all of that talent to take these skills that continue to be sub-mediocre to the next level. In addition to other things that need to happen. But at least start attracting the right...or start exciting, start evangelizing to the whole spectrum of talent that is needed. So I mean these ventures, they call them AI-first, or whatever they want to call them, if the bottom line is that it is going to be centered around development and technology we're not going to move the ball forward. We're gonna expand horizontally but not three dimensionally I would say.
Lisa Falkson: [00:25:43] Well I just want to echo what Ahmed said. I mean I see a lot of investment in the underlying technology, and not so much in the understanding of how to use it properly and how to - you know obviously I'm biased here - but how to design something that actually works properly. It's very different from how to just hack together some code - that really isn't the challenging part. And I actually participated in an all day workshop at Amazon re:Invent last fall where I was just really disappointed. They said oh well we're just teaching the development and I said you know the design that you're having us implement is really non-intuitive and you know it's not a good voice user interface and couldn't you have had one of your designers put this together? "No, no, we really just want to teach the coding." And I said "well, but now what you're doing is you're sending all these, you know, 100 people back to their companies with this as an example."
Lisa Falkson: [00:26:46] So we think new investment in technology is important - that's where it all starts. But I think we have to also look at things as an investment in the applications and creating the proper applications. And I think that's why you know places like Witlingo are important because they actually take something which is very intimidating - you know hosting and designing your own voice-driven skill - and make it accessible to certain other people, so I think that's an important thing to have in the industry.
Brian Roemmele: [00:27:24] Well I got to say what Lisa and Ahmed had said is brilliant, beautiful. You know, I'm going to get philosophical here. Part of the problem - and what Lisa pointed out - is very very critical here is let's look at the wording: AI-first. AI is the underlying technology. It's like talking about the engine in your car and not the driving experience or what the automobile represents emotionally to the user. The interface that's going to interact with AI, at the end of the day, is going to be your voice. Sooner or later we're going to give up our thumbs because there's a cognitive load and a mechanical load. Broca's area is already giving us a voice. Everything we ever type is already a narrative, a voice in our brain. And so what, we've got to slow down and try to use two thumbs...de-evolve to try to interact with it. So the fundamental issue here is and I've had this argument with a lot of AI researchers and some of these executives at these companies I go you need to change your thoughts about this fundamentally. And I can understand how it's happening. Engineers and technologists are coming up with the future. They're saying oh my gosh we have AI we're going to lead with AI. That's cool. At the end of the day the AI is going to drift to the back. Nobody's going to care what's in the engine. Most people will buy a car. They don't ask how the fuel injector works. Or if you got a Tesla exactly what kind of battery operating system is working.
Brian Roemmele: [00:28:46] What people care about is the interface. We talk about the graphic user interface. We didn't talk about the processors. We just cared whether or not things moved around the screen quicker. And what Lisa is saying is brilliant. What we really need to do is create a renaissance of pulling people into this world that are otherwise barred from it. And what Ahmed said: calling them hackathons is ridiculous. Let's stop it. What we need to start doing is finding ways to pull in people who are brilliant at constructing conversations and brilliant at understanding human interactions with each other and to try to build that functionality within us reacting to the computer. We have spent the last 60 years trying to be more like the computer. The next 60 years is going to be computers trying to be more like us and understanding us. And there's a turning point. We're here right now this moment and we're shaping it - all of us on this show. We're at this turning point where the change is going to come where we just say things to our personal assistants - obviously we're very early days here - and it understands us, it understands us to a deep level, and it acts upon it like a big lever. A lever gives you more power, so we're not sifting and sorting at the end of a Google search. When we say "find this," it knows our context, it finds the right answer, gives it to us...when we are shopping, "find this," it knows our taste, what we like, colors, and it shows us maybe two or three things that we might like, instead of nine million things that a Google search drops on our lap. So it's a shift in mentality. It's a shift of maybe letting the creative people take over this technology once again, just like Steve Jobs allowed the creative people to take over the technology for the graphic user interface revolution and the mobile revolution. Today the unfortunate part about it is we're not seeing it yet and not even from Apple.
There seems to be this fear that we got to make it very exacting and it's got to be you know only certain types of utterances which are stupid and boring in a lot of cases. And I mean stupid in a nice way - I mean stupid in a way that it was constructed in an interaction that humans never have. A stupid interaction is...we don't want facts. When I say this...when I say "hey, what's the traffic like?" I don't need to know all the statistics. I just want to know if I'm going to get to my appointment on time and my AI is going to understand that, sooner than we realize, and the company that understands that is going to rule. That make sense?
Bradley Metrock: [00:31:21] That makes perfect sense. That's great analysis. Moving on to story number four. We got an analyst report this week that's kind of interesting. Siri usage over the last year has dropped significantly, as Alexa and Cortana use has risen. Siri remains the most popular virtual assistant with 41.4 million monthly active users in the U.S. but has seen a 15 percent decline since last year, or 7.3 million monthly users. In addition the study found that engagement with Siri has also dropped by nearly half during this period, from 21 percent to 11 percent. And Brian I'll start with you on this. What do we take away from this? Is this as bad for Apple as it sounds like, or is this just something that's happened because the Echo got popular, and can easily change tomorrow?
Brian Roemmele: [00:32:18] Great question. I can say it in one phrase: neglect has its consequences. Apple owned this marketplace in very early days. SRI International created Siri from a military contractor contract essentially and it became a spin off and it was a last dying act that Steve did before he passed away - I mean literally - was to acquire Siri in his last major executive decision. He saw it as the future of Apple. I think...I'm a fan of Apple. I'm a deep fan of Apple. I love the company, but there is no other way to look at it than this is an entirely bad element of neglect. They lost most of the Siri team they had the opportunity to acquire the Viv team - they didn't, and that was fundamentally dumb on their part since they already lost these guys, they should have got them back. Viv was incredible technology and I hope to see it - we don't see it yet. Bixby at Samsung is not Viv technology. And for those that don't know the folks that formed Viv were the former Siri team that left Apple after about a year, year and a half, on average. It's a bad sign but it's also hopefully a good sign. What it should be is a warning sign to Apple to wake up and stop just looking at this as some sort of appendage and give it the rights that it should have. This is a brand new operating system. It's a brand new modality. Build around it. Let's not call it a smart speaker anymore. Let's call it what it is. You know I call it a voice-first device. Make up something else. I tried. Twitter's only got so many characters, so it's voice-first for me. But you know...name it. Own it. Have a voice OS. Call it voice OS. Call it SiriOS. Do whatever you need to do, but really build this new modality. And there is an inner turmoil within everyone of these companies. Let me tell you - every one of the companies of all the tech companies I have insiders that talk to me all the time because I am a lightning rod for this. I talk about it a lot. 
And they beg me please help me come to work at the company do whatever you can, shake somebody. Let's move forward on this. Apple has that ambition inside, and if there's an Apple executive listening to me who might get mad at me...inside of your company right now are teams that have people. People that have not necessarily the best engineering skills but the skills that Lisa and Ahmed were talking about that want to take over this process and want to grow Siri into the right direction to make it more responsive and to start investing into better microphone technology which by the way we'll hopefully see in the new iPhone. One of the reasons that Siri dropped - just to put a nut on this - the reason why Siri dropped is the expectations raised, and the expectations raised because there is brilliant farfield technology in that radio microphone array that Amazon built in the Echo. It was a high-water-mark. And when you go back to your phone - your iPhone specifically, which has a microphone designed for speaker phone conversation, optimized for that and not for, you know, voice recognition - you're going to have a very unpleasant experience. I can tell you that HomePod has an enormously better microphone technology. It might even be better than Amazon's. I only had a very brief time to interact with it and if the iPhone is built voice-first they're going to have a lot of increase in the use of Siri. Google by the way is doing really well but I wouldn't have done two microphones. Sorry Google folks you need at least four microphones in that system to work correctly.
Lisa Falkson: [00:35:57] I mean I guess neglect is a good word for it. I've just seen a lot of, you know, "we were the first, Siri is still the leader." It's almost an arrogance that has not forced them to innovate within that team. And I know they lost a lot of folks that left Apple. But I think they haven't put a lot of emphasis on rebuilding that, and specifically in the expertise that Ahmed and I were talking about. Not just developers, not just technology people, but real true designers. I mean think also of what Amazon did fairly quickly, which, you know, Siri never made available, is having this developer ability to add skills. And opening up as a platform. And I think that's something that Apple, being a very closed place, sort of missed out on - letting people design their own functionality for a personal assistant.
Dr. Ahmed Bouzid: [00:37:06] It is to be expected that our usage will drop obviously because people are going to use the Echo at home and Siri outside of the home. I would be interested in seeing whether, because people are using voice a lot more at home because of Echo, use of Siri outside of the home has picked up. Number one. Number two, I think they - echoing Brian's observation - I think they missed big time in terms of capitalizing on their early lead as far as Siri. And number three I think we will see a pick up again of Siri when the HomePod comes in. I think that we'll see some redress there, I think, because people who have Siri will be able to use it and will use voice at home as well as outside of the home. So again, as I said last week, never ever discount or think that Apple is going to be out of the game in any way shape or form.
Bradley Metrock: [00:38:08] So now we turn to story number five which is that Samsung and Galaxy S8 users are doing battle right now over the "Bixby Button." This is a fascinating one. There is apparently a Bixby button - I don't have a Galaxy phone - but there's a Bixby button on the Galaxy S8 that's designed so you hit it and you activate Bixby. And apparently Bixby didn't work right at first and Galaxy users or Android users are used to being able to customize everything...but yet Samsung won't let them customize this. Ahmed, what do you make of this battle going on over the Bixby button right now?
Dr. Ahmed Bouzid: [00:38:44] Well I'm just going to make a general point which is I am not a fan of paternalistic dictates from giants. Whether it's Steve Jobs deciding that hey you guys who are not as smart as I am I'm going to tell you that you don't need a floppy or you don't need the jack. I just find it offensive. So anything that comes from on top and is imposed on others is something that I am not very keen on. So I'll just leave the thought at that. I think the direction that should be taken should come from customers - should be a conversation. Some people might say well wasn't Steve Jobs right to take the floppy out? And I say well you know I use the thumb drive right now to be able to take my files from computer A to computer B, you know? So I still use a thing to be able to transport things when the cloud is not doing its thing. So the bottom line is it should be more of an organic solution to problems as opposed to somebody who thinks that they are being benevolent imposing upon us what is...how it should be. Because they know better than us. It's just a gut reaction.
Brian Roemmele: [00:40:04] Well you know I'm going to echo what Ahmed said. In the Android world it's an egalitarian sort of environment. And you know I understand what Samsung is trying to do and there is something to be said about uniformity in your design. And one of the beautiful things that attracted me to Apple is that they've developed a uniformity in their design. They created their own reference platform because it's the only platform and things are where you expect them to be. You go into the Android world things can be all over the place. Icons can be here. Some buttons don't operate the way they should. And obviously manufacturers can create their own buttons and this creates confusion. Yes, the problem with Bixby is it is not Viv. It is something that was developed actually to work very well with Korean language and there were really tremendous issues with the technology initially. Now they've gone very far into correcting that and I wouldn't bet against Samsung for correcting a lot of this. It's just like Ahmed said, "Don't bet against Apple." I especially say don't bet against Apple. Apple's going to rise above everything unless something really desperately goes wrong. So will all these companies. But it is probably not a good thing for Samsung to have this battle because if the system was good enough people wouldn't want to be replacing it. That's really the subtext to all this.
Lisa Falkson: [00:41:29] Yeah I mean I think what Brian said before is perfectly valid - it's really to me though it's buttons, and remapping buttons, and all that stuff is that that's maybe a territorial thing. People over hardware. I think that to me you know one of the great things about the Echo device and even having Hey Siri and OK Google is not needing a button. And that's sort of where I stand is that you know to whatever extent we get to a point where we don't need buttons at all, and we're purely using wake words, that would be totally fine with me. So I mean I understand the argument about it. Going back and forth I think it's like I said it's sort of a territorial thing.
Bradley Metrock: [00:42:22] I completely agree with you, Lisa and Brian and Ahmed. I think that's probably a sign you need to go back to the drawing board rather than cram something down somebody's throat. Just my two cents.
Bradley Metrock: [00:42:35] Moving on to story number six, and this is an interesting one as well. Best Buy shares declined 7.5 percent - I think it was more than that before it rebounded a little bit - on the news that Amazon is launching its own version of Geek Squad. And Lisa I'll start with you for this. What are your thoughts? How does this story strike you? And how do you think this will facilitate the evolution of VoiceFirst?
Lisa Falkson: [00:43:05] Well I mean first of all Amazon is obviously planning to take over the world completely. So we should all be prepared for that. But the whole using voice, for example, with your Echo...think of how nice it would be. I'm looking at my outdated TV right now and I've been dying to upgrade it. But I had bought it in-store and I had Geek Squad install it. And I was thinking, "Gosh if I order another one via Amazon, I'm going to have to figure out all the nuts and bolts of getting this redone." How great would it be if, you know, voice-first I could say "order this new whatever Sony Bravia in x size and include installation from Amazon"? Then it's one step and I know how much the full package costs. To remove that layer of pain from the customer I think is really what Amazon is doing and what they've been sort of great at doing in general, if you look at sort of the evolution of the company. It's just do one thing - sell books online - really really really well and then turn that into selling everything online really really well. I think if they do this successfully this is going to be a great additional offering from them.
Brian Roemmele: [00:44:45] I have to say what Lisa said was brilliant. Absolutely. You know really when we go outside the tech areas of this country and it's very easy for us all - all of us are technologists nerds and stuff - and we can kind of you know see things happen very quickly. The rest of the world but most definitely the rest of the United States outside these tech corridors are moving at a much different pace and they confront the things that Lisa brought up every single day of their life. And you have people that are too old. There are people who just don't understand the technology.
[00:45:20] And so getting into somebody's home which has really been the subtext of what Amazon's been trying to do with Alexa and now with this version of In-Home Service is extremely valuable for the company because we're not just talking about entertainment. We're talking about home automation on a massive scale. And this is going to take place. Most people don't have the wherewithal to do even some of the basics. I mean certainly people can install lightbulbs. But as we move down the pyramid of all different systems that people are going to wind up doing such as thermostats and you know all the advanced lighting systems sound systems throughout their home how to integrate all these new Echo products because most definitely Amazon's when they come out with a high fidelity Echo system very soon also. So Best Buy should be concerned but it also should be a rallying call because listen there's not just going to be one brand. There's just not going to be one solution. No market ever develops just one brand one solution. And like Ahmed had mentioned before this incredible amount of competition is going to be really great for this market. And by the way we haven't seen everybody enter the market. We're gonna have at least 10 more major companies enter the voice-first market over the next 12 months and they are all going to be massive and they're going to be surprises and Best Buy must best maybe look at this and say OK what are we going to do? We're going to lead with voice, we're gonna lead with home automation. We're going to own what we own and they own a lot of the local market. You know if I was consulting for Best Buy I got a plan right now and what they could do and move massively not by blocking Amazon but potentially by working with them - you embrace your enemy if you want to see it as an enemy.
Bradley Metrock: [00:47:14] No that's great and I think that Amazon...this is a good move for Amazon but they've got to be careful. And we talked earlier about privacy concerns. Amazon can do what they're doing with the Echo and the Echo Show and the Echo Dot and the Echo Tap and all these different things because they have cultivated such a brand identity of being trusted and they're trusted for a number of reasons. They're trusted because every time people interact with them on a customer service basis, you know, through retail, they do a good job. They're trusted for many different reasons. But if people get the sense...this could easily turn around on them if they're not careful - this whole Geek Squad competitor - if it's seen as some sort of Trojan Horse to, you know, get inroads on people's privacy, or not, you know, not, in the end, serve the customer as much as they should be, it can quickly turn negative. As long as they don't do that, Best Buy's got a whole lot to worry about, and that was great analysis from both of y'all.
Bradley Metrock: [00:48:18] Let's go to story number seven which Brian what you're saying is a great segue. Motley Fool openly wondered this week why Microsoft is releasing a, in their words, "Me too" (and actually it's probably my words too) smart home speaker that doesn't appear to stand out and we talked a little bit about this last week. Brian, my question: do these companies understand that they can't just release another competitor without some sort of killer app, or is Microsoft just going to release some nameless faceless thing that is just going to end up on the clearance bin two months later?
Brian Roemmele: [00:48:56] Really great questions here. I think I certainly still have some great insight on this. Part of the problem and it's a theme that we're going to probably have all the time in this show and most definitely what I've said today...part of the problem is we have engineers and technologists that are putting this stuff out and they're looking at it from a practical standpoint and thinking that people just want to hear facts and they just want to get you know you know how many miles are we from the sun and exact things like that.
Brian Roemmele: [00:49:25] And I use the traffic scenario again as a really good...everybody can get this but I have thousands of them...but the traffic scenario is you know if I'm asking you in the car mobile "What's the traffic like?" and you know my calendar and you know that I am X number of miles on my way to an appointment. The answer I'm looking for isn't what the traffic is like. The existential question is what I'm really looking for - and this is the voice-first stuff that I build, it's what I work in my lab on all the time and it uses all kinds of psychological insights about how humans operate and you know I go all the way down to Broca's area and all the way up to the existential mind - but what the existential question is "am I going to be late for this meeting?" And that's the question that needs to be answered. When you have a company that's primarily driven by engineers building these devices and building the voice-first interfaces, they're going to create what appears to be maybe a me-too product. It may be incredibly good - the technology might be the best, right? But the functionality because of how humans communicate may be much lower. And you know what Lisa has brought up and what Ahmed has brought up...how do people really interact? So I'd really like to ask Lisa: how do you feel about Cortana? And how it interacts in Windows? And do you think that that's...just drop that into a speaker and everything's fine? Do you think that's the way to do it?
Lisa Falkson: [00:50:55] Yeah you know it's really interesting because people ask me about Cortana all the time and in my previous company I was working on, you know, human/machine interactions for cars. And our CEO happened to be a board member at Microsoft. Someone said OK, we're going to assume that, you know, Microsoft is a potential partner here. And they said "what do you think in terms of Microsoft technology?" And I said you know you look at how they're installed in desktops, how well it works even if you tried it on mobile. I don't think it's a technology problem. I don't think it's very strong necessarily on design and personality...like if I were to ask you what Siri is like, you would have some definite input on that. If I were to ask you what Alexa is like, you'd have some input on that as well. OK Google is deliberately neutral - we all know that. But if then you were to say Cortana, it would almost be as though it wasn't deliberately designed. Like no one would say oh it's trying to be neutral like Google but it really has a personality more like Siri. They would just say Cortana, it just never occurred to me what the persona was. And I think that, you know, a lot of us in the voice field know this - this is a very dangerous thing - if you don't deliberately design a personality, and you have one that happens accidentally, because it will - people attribute personal traits once there is a voice - that personality can be all over the place. And so I think it's sort of lacking in the design field in terms of, you know, if you were to ask personal questions or sort of even the consistency of how it responds but in terms of technology I think it's very strong. I just don't see I guess what the differentiator is going to be.
Brian Roemmele: [00:53:11] You know Lisa brings up an incredible point here. And you know humans will anthropomorphize anything. I mean you just put anything visual humans will...you know a dog has got a personality. This belief...again this is what happens when engineers and technologists take over. I completely disagree with Google's neutral stance. First off it needs a name, in my view. And if you don't choose a name people are going to give it a name and it's probably not going to be a pleasant name. People don't want to be saying a company's name all day long if they're going to be interacting with this. It was a strategically huge mistake. I believe that you should be scripting whoever you are building, and whether you're the company building the voice-first platform or whether you're building an app, you need to build the personality first. You need to storyboard it. I'm in Southern California and Hollywood and I work with a lot of creative people that storyboard this. And if you're a brand, if you're a company, you're not storyboarding your brand? As far as what a voice-first system is going to sound like? And I'm not just including the voice. That's a big part of it - tonality, inflection, all of this stuff. But this attitude that we're hard-wired to detect...we're listening to our mom's voice before we exit the womb. We already know voice before we know anything else and we can detect voice much quicker and much more accurately the moment we pop into this world and we can detect the nuances. It's been said that most people can understand the differences in over thirty-nine thousand voices. I think the number is a little higher but that's where these studies kind of stop. You cannot detect the differences of thirty-nine thousand people at a distance, but you certainly can in somebody's voice. So the brain is designed for this. 
And the fact that there is just this myopia in design that's taking place...you know what I would say for Microsoft - they have strong points. Everybody has a strong point but you have to go to ground zero. You have to listen to folks like Lisa who has been doing this for a long time. The other person who is incredible is Nandini Stocker over at Google. These people have worked in voice for a very long time. They build these interfaces because they understand how humans interact with systems. And once we get to that point then we've now created a relationship between the user and the personality of the device. And if you think that that doesn't matter that might be your extinction. If you're in a large company and you're putting on a voice-first platform and you don't think that that stuff is a big deal? That is the beginning of your post-Cambrian explosion. This is your extinction message. It's not like you're going to go away and die. It is going to shrink down to become less relevant because the companies that get the true elements of human interactions are going to dominate this, especially as we go into the next phases, which by the way is probably not going to be next week or next year but it's going to be soon enough for people to make note. And yes Amazon has gotten a little bit ahead and Siri actually is doing really good. I don't necessarily think the new version of Siri is as snarky but you know maybe that snark is changing to something more mature. I don't know but personality's definitely there. So yeah I fully agree with all Lisa said.
Lisa Falkson: [00:56:34] Yeah I think another interesting thing that we haven't talked about and you know sort of came up with the Echo Show...so there's the Look and the Show that came out around the same time, so I get confused occasionally...you know when you start having these multi-modal interactions to still think of it as here's a voice system that's being supported by graphics. I've worked a lot on as I said these car systems where it really was "OK let's do the screens first, we do the screens first, we do the screens first, and then we tack on the voice." And I think the way the Echo has been...has evolved and the way that all these other products have come to market is great you know for people like me who, as I said, have been waiting for decades for this to happen where it really is...it's first a conversation. And there are elements of conversation that we all enjoy. There are reasons that there are some people you like to talk to and some people you don't. And those elements of conversation are things that we pull away from interactions with automated systems. There is a good taste in your mouth after you have a successful conversation with an automated system the same as there is when you have one with a person and that makes you want to interact with them again. And I think you know watching these systems evolve...watching the multi-turn dialogs become more successful and watching people feel more comfortable in general with speech recognition is you know really exciting for me and I think you know that's going to continue for the next many years and I think all the big players should be in this fight. It's no surprise that Microsoft is there. It's a matter of how does one differentiate oneself at this point, when there already are some good solutions out there.
Bradley Metrock: [00:58:38] All excellent points - Lisa and Brian, thank you very much. We're going to move on to number eight. Our final story of the week - and this is an interesting one that I'm sure is just the beginning of us talking about on this show and other VoiceFirst.FM shows - there was an article that came out. I'm going to pull it up now...about how Mississippi and Utah both are among the first, if not the first, to roll out Alexa skills. So Mississippi and Utah both have Alexa skills out now: Mississippi has one related to vehicle licensing and registration. Utah has one related to fishing and wildlife and being able to...for state residents to be able to find out information about fishing hotspots, so on and so forth. And Lisa, for this last story, I'll start with you once again. Does it concern you to see government using voice-first technology and embracing it? Or are you for the most part just thrilled that we're seeing this sort of progression out of these type of entities already at this point?
Lisa Falkson: [00:59:59] Honestly I am certainly shocked and amazed. Usually government is last to jump onboard. A little bit late to the game. But just the fact that these are sort of fun, interesting applications, particularly the fishing one. I think it's great that the government - you know, that any government entities, especially sort of, you know, not in California and New York - are sort of dipping their toes in the water here and trying this out. I'd be interested to check them out and try them. But I think it's kind of nice to see what some of these applications are. There's a great opportunity here for education through the Echo devices - you know, learning how to...learning things about your state, learning things about your government. And you know again the fishing is more hobby related but I think it's great. Honestly if the government gets some practice with some lighter apps like this and then can actually have some real true informational stuff. I mean if you look at the way...to me, we used to go to libraries, right? If we had to look something up. And now we'll sort of go to Google and type it in on our phone. But it's great if people can just ask the question aloud. It's kind of our...you know, always my last resort. If I can't get something done on the computer and...reach something via email or via web page, then I always pick up the phone and call because I know that voice is the easiest way to get something done. And I think that's great that, you know, even in government there seem to be some ways of incorporating voice and saying OK someone wants to get something done quickly easily efficiently, which is not usually what we think of with government, then they'd actually use voice to try and find that information. So I actually think it's great.
Brian Roemmele: [01:02:13] I fully agree with Lisa. I think the shortest distance between you and what you want is your voice and this is ironically echoing the things I've seen in development. Let's just say there are a number of states very seriously considering converting everything that's available on their website into a voice type of interaction and it's going to be actually some of them are going to be doing some incredibly new more powerful things to interact with representatives and interact with various community organizations, things of that nature, within their you know their local counties. There is also an initiative at the federal government level. There have been a number of federal contracts that have gone out for building voice-first interfaces on a federal level. This has received a lot of bipartisan support which is good to see at this stage of life to get this to move forward and I believe that we'll start seeing the results of that by as early as this year. But most definitely next year we're going to be able to get access to a lot of federal services literally responding to forms making requests all the way to Veterans Affairs and things of that level to information about traveling abroad. Things of that nature. I believe that that is going to be extremely powerful. It's going to be a revolution. And I don't see that stopping. And it's going to be similar to what we saw what took place with Web sites right. First web sites were almost wordless or business cards or brochures until they became extremely extremely useful. The point where we're filling out forms and I hate using this terminology but visually you understand what I'm saying. Where we're completing the form just by talking to our device like as if somebody was interviewing us and we're giving up that information in a very painless sort of way to get to what we need to get done especially for people in need who need services. 
That sounds crazy but there are some people who don't have access to even an internet connection. Fine. I get that. There are some libraries that are already booking Echo devices into soundproof rooms where people can interact with these systems. So we're going to start seeing that. And here's the other problem: literacy. There are some people who just simply know how to speak but they can't read or write very well. And this is lowering the bar for access, especially for various government programs that can be extremely helpful. So I think it's a good sign.
Dr. Ahmed Bouzid: [01:04:50] Yeah if I can just piggyback on that...there is a whole segment out there that can make the most of the best that voice can offer, which is to act instantaneously upon hearing an action call. So for instance a lot of activist organizations out there who are focused on a political cause or on a social cause, who for example are watching CNN or whatever it may be and they see a bit of news they don't like and they are affiliated with an action group and they want to react immediately. Alright? So instead of having to stop and go for the laptop and do x, y, and z, they can just say you know Alexa ask so-and-so for the latest news and they get the action calls and it can say "call my senator" and it connects them to the senator. So we're working on a couple of skills like that, where being able to react immediately in the heat of the moment in the flow of your engagement with content and being able to within a minute leave a voicemail to your senator or talk to staff or engage with somebody who is helping you mobilize. I think there's a whole world there that is opening up with this interface.
Bradley Metrock: [01:06:16] That is great and my personal opinion on this is this is, as I mentioned, far from the last time we'll be discussing this on this show. I thought the point about accessibility is really key. This is going to open doors to a lot of different types of citizens to access information and participate in meaningful ways. But also, you know, not to get political, but to me this is no surprise that this is going on. I don't know what Utah is but Mississippi being a red state, I mean, I think we'll see this in red and blue states all over the place. There's a lot of pressure on government to perform and this sort of technology is just what the doctor ordered. I certainly hope, I would love to do a story like this every week from here on out, that government is taking a more voice-first mentality. So I agree with y'all - I think the analysis is great and hopefully we see more of this.
Bradley Metrock: [01:07:12] So that is This Week In Voice. Lisa, Brian, Ahmed, thank you very very much for joining us and sharing your time with us.
Lisa Falkson: [01:07:21] Thank you for having us.
Brian Roemmele: [01:07:22] Thank you so much.
Bradley Metrock: [01:07:23] Sure. So I greatly appreciate all of your contributions. And for This Week In Voice, Episode 2, July 13th, 2017...thank you for listening. And until next time.