Top news stories for Season 2, Episode 13 (April 26, 2018):
1) CNBC: Amazon targets kids with a candy-colored Echo and a version of Alexa that rewards politeness
2) Harvard Business Review: Marketing, In The Age Of Alexa
3) Pacific Content: Google has a new podcasting strategy that could "double audiences worldwide"
4) The battle for connected cars heats up
4a) Voicebot.AI Story of the Week: Amazon Echo May Be Coming To A Car Near You
5) Cue the circus music: Facebook pushes back smart speakers amidst "turmoil"
Plus...stay tuned past the end music for another episode of Homie & Lexy!
This Week In Voice available via:
Google Play Music
YouTube (+ closed captioning)
Panel for Season 2, Episode 13 (April 26, 2018):
Theodora Lau is founder of Unconventional Ventures.
Bradley Metrock: [00:00:11] Hi and welcome back to This Week In Voice, Season 2, Episode 13. Today is Thursday, April 26, 2018. My name is Bradley Metrock, I'm CEO of a company called Score Publishing based here in Nashville, Tennessee. Our sponsor for This Week In Voice, as well as the VoiceFirst Roundtable, is St. Louis-based VoiceXP. Like I did last week, I want to take a moment to thank VoiceXP for the work that they're doing in the field. I know that in a couple of weeks they're going to graduate from the accelerator that they're in in St. Louis, which is exciting, but I also want to give a special shout out to Bonnie Snyder who, along with Katie McMahon, who is one of our guests today, is going to be on the WITI Summit Women of Voice Technology Panel that will be taking place later this year out in San Jose, California. That's pretty exciting, what VoiceXP is doing is exciting. If you need someone to create an Alexa skill for you, look them up at www.voicexp.com. You'll be glad that you did. We've got a phenomenal panel today. This is kind of a treat, a real all-star group. First up is Theo Lau. Theo, how are you?
Theo Lau: [00:01:26] I'm good, thank you and thanks for having me.
Bradley Metrock: [00:01:29] Theo is the founder of Unconventional Ventures. Theo, tell us what that is.
Theo Lau: [00:01:37] It is by all means unconventional in a way. The reason why I picked that name is because of the demographics that I'm looking at and the cause I'm looking at, is somewhat different than typically when you think about innovation and technology.
Theo Lau: [00:01:52] I'm very much interested in leveraging technologies to improve the well-being of the older demographic, and particularly interested in female founders and under-represented entrepreneurs.
Bradley Metrock: [00:02:05] Thank you for joining us today and also thank you for allowing me to be part of the Finovate Twitter Chat that just took place. If you're on Twitter, look for #FinovateChat and you'll be able to see this great Twitter chat that Theo hosted earlier today. Theo, thank you for joining us.
Theo Lau: [00:02:23] Thank you for having me again.
Bradley Metrock: [00:02:24] Next up is Katie McMahon. Katie, say hello.
Katie McMahon: [00:02:26] Hi there everyone.
Bradley Metrock: [00:02:28] Katie, thank you for joining us. Katie is Vice President and General Manager of SoundHound, Inc. Katie, tell us about SoundHound and all the great stuff you are doing.
Katie McMahon: [00:02:38] Thank you Bradley and particularly to this whole VoiceFirst community. It's exciting to be here and for our happy understanding of what our Houndify voice AI platform has developed and productized and gotten live. Just real quickly, we took 10 years in stealth mode to develop several core technologies that in the end produced a step change in how we can enable a voice interface that is natural, conversational and truly much closer to how the human mind works versus the current standard of entity detection type frameworks. So, I'm glad to be here and look forward to our conversation.
Bradley Metrock: [00:03:18] If you ever have a chance to hear Katie speak, do not miss it. She spoke at the Alexa Conference last year. She speaks frequently on behalf of SoundHound, Inc. and it's just phenomenal. And what SoundHound is doing is phenomenal too. So thank you for joining us Katie.
Katie McMahon: [00:03:35] Thank you for having me.
Bradley Metrock: [00:03:37] Next up we have the one and only VoiceFirst Oracle, Brian Roemmele. Brian, say hello.
Brian Roemmele: [00:03:43] Hello and really excited to be here. Both Katie and Theo are inspirations to me.
Bradley Metrock: [00:03:48] Brian what are you working on right now? Share with us a window into your world and what you've got going on right now.
Brian Roemmele: [00:03:54] I'm working on a lot of sort of stealthy type of projects. What I can talk about is my general frustration with the way some of the advancements are taking place within the VoiceFirst community. I have nothing against what is in the market right now, but I feel we're heading towards a stagnation point in the next 12 months and I'm way in front of that stagnation point. A lot of people are sort of like, hey, it's all taking off, and I'm like, well, you don't see what's coming up ahead. Part of being a pioneer is being up in front of it and seeing where the issues are. A year ago we talked about the privacy issues that Facebook was running into. We haven't begun to really see the privacy issues that we're going to be facing in VoiceFirst and I'm trying to work that out, working with some very large marquee clients. It's taking a tremendous amount of my time. Doing it myself, I think I'm going to have to start finding people to work with. That's all I can talk about at this point.
Bradley Metrock: [00:04:57] That's why we call you the VoiceFirst oracle. We don't expect you to share everything, only when the time is right.
Bradley Metrock: [00:05:02] Thank you, all three of you, for joining us on this all-star panel. With that, we will get to the news. Story number one is from CNBC and this headline, I don't like the way this headline is written, but we will go with it anyway: Amazon targets kids with a candy-colored Echo and a version of Alexa which rewards politeness. The headline makes it sound like Amazon's a stranger offering candy to kids to get them in the car or something.
Brian Roemmele: [00:05:32] I'm hearing an ice cream truck coming along with the kids.
Bradley Metrock: [00:05:35] Yeah, they could have written the headline a little bit differently, but the story is a big one, this children-oriented Echo. Theo, I want to start with you. What do you think about this concept of what Amazon has rolled out and share with me how this news and this approach by Amazon sort of strikes you.
Theo Lau: [00:06:16] It's actually fascinating if you think about it. When you guys were talking about ice cream truck and candy, there's this little story when I was growing up in Hong Kong that strangers would lure you with little gold fish. It's pretty, it's colorful and it's captivating. By and large that's what Amazon is doing. I have two kids and when I saw this news I actually thought it's kind of cool. We have the Apple HomePod and we have Google Home, but we struck out with it a little bit. In the old days when we have programming, it's a 1 line, 2 line codes but it's grating when you're conversational you're still polite even though when you're programming and typing is abrupt. With all of these virtual assistants, you're barking a command into a device. I do catch myself thinking, oh my goodness, should we say please, should we be more polite when we're talking to the little box that's sitting on the counter-top? I think that's nice. I love the fact that it addresses some of the parental concerns, like you cannot be ordering things, you can block it with different access hours and you customize the content. So that in and of itself is Amazon trying to port the whole experience that kids get on iPads and their other devices nowadays to the voice realm.
Theo Lau: [00:07:27] Now with that being said though, the flip side of it, where the cynical side of me is thinking, wow, they are actually pretty smart, because you want to get them young when they get used to it, when they get used to interacting with the Echo because the parents feel it's safe. Well, guess what happens when they get older: they basically dominate the market. They're trying to get into all of the different corners of the house. So why not target additional demographics?
Katie McMahon: [00:07:56] First, I want to zoom out and acknowledge that the youngsters today, and I categorize that between those literally who are in gestation through age 5, 6, 7, they will become known as the VoiceFirst generation. I kind of sort of tried to coin them the GenV group. We got our Millennials, we got our GenZ and now we've got this Voice Native coming up because they will have interfaced to things of computing power with their backs to it. Unlike me, iPad generation which got swept up sort of in the touch, type and swipe decade.
Katie McMahon: [00:08:38] So on the high level, youngsters interfacing with the IoT is a fascinating realm in which I think we have yet to see deep thinking across multiple disciplines. I will forever give Amazon the credit for in essence having their Steve Jobs moment. When Jobs stood on stage and unveiled the iPhone, that became our iconic marker of the touch, type and swipe era. The Echo launched quite quietly, becoming an overnight success over the past two years really in gaining the momentum, and that cylinder cone will be the iconic device that represented the beginning of VoiceFirst. So, for the little people who got introduced to their kitchen counters and speaking, who were asking to play a song, etc., there's the novelty factor. How do we think very deeply in regards to what this means for their development? Literally, their neurological development. The realm of children's education with voice interfacing is huge. I think there's a lot of killer apps there. I know Brian has these big thoughts along those lines as well. What Amazon does right here I view as very tactical. As Theo said, she used an interesting word by mentioning port. So it's porting an experience of sorts, but my question will be how deeply and thoroughly thought through is this experience?
Katie McMahon: [00:10:14] Again, that points to the burgeoning field of VUI and the voice user interface designers, for which there is going to be a floodgate of need for really great audio designers, those who think first in the need of sound floating around us and how do we engage. I would want us to take a deep pause and make sure that this Gen V crowd growing up speaks like you and I do, which means it's context aware, continuous, conversational. Heaven forbid they get stunted into learning speech by barking at a device in entity detection or robotic or Tarzan speak in order to make this thing work.
Brian Roemmele: [00:10:58] I believe very, very deeply in the way nature has built our communication modalities. I mean, you're talking about evolution for millions of years to develop our ability to communicate by speech. In the last 70 or 80 years, we somehow believe that typing and that form of communication, which is very abrupt, very non-continuous, non-dialogue, really sometimes very non-contextual type of communication, is the future, and that is absolutely wrong. Humanity is actually not going to survive if that's how we communicate. We're not going to peer bond, we're not going to reproduce. That's not going to happen. So over time what's going to happen is people are going to get frustrated with these modes of communication.
Brian Roemmele: [00:11:47] The technology as it exists today is available. What we're seeing going on with Hound and some other companies that are in stealth mode, where you can have what I'm doing in my garage. I mean, I'm just a guy with a piggy bank in my garage with Raspberry Pis and I have continuity conversations for 35, 40 minutes where it's constantly kind of laying the railroad track as it's going down the valley, if you will, in the conversation. So it's not a technology challenge if I can do this stuff with my ugly coding. It's more the willpower and the people who are commanding the direction of this technology.
Brian Roemmele: [00:12:25] Again, it's not a slight on these technology companies that are doing it, I think they're doing wonderfully. When Steve Jobs, like Katie said, held up an iPhone there was also a demarcation point in a number of ways. Not only were artists brought in by Steve with the original user interface with a mouse and desktop metaphor, artists of all types who are creating new forms of software that you would not have seen come out from somebody who just was a CS student. I have a deep technology background and I'm not putting that down, but I also really respect the creativity that's going to be required to make these dialogues and conversations and continuity reactions really rich and robust.
Brian Roemmele: [00:13:14] Now let's get down to children. Children are sponges. All of us have children here on this show, and when you observe how they learn, they're learning by their environment. Like Katie said, barking into a device, because it's not a person, it's a machine, that's a rote behavior. I happen to like the work of Bandler and Grinder. They developed something called NLP, Neuro-Linguistic Programming. When you understand the technology behind that, and some people say it's discredited, I see it work in real life. We learn by a certain behavioral pattern within our mind, especially children. It actually does not take that much to tilt a child in another direction. It is sort of a state. When you are yelling or you're trying to put something out there abruptly in very short, concise words, it's abrasive in every culture. You wind up saying, why is this person talking this way? Well, it carries over into conversation.
Brian Roemmele: [00:14:24] Having voice devices in my house now for over a decade and Alexa devices since 2014, children always say please and thank you. Whether or not there is a prompt in there for it, it's just a matter of NLP, it's a matter of just being a parent and trying to enforce that. You can't always enforce that when they're around kids that don't have that belief system or don't have that reinforcement. It is amazing that Amazon's walking down this path. The thing about it, it's just scratching the surface. It's not about the rich content. It's about this becoming a lifelong buddy to this individual. We're seeing the rise of the voice generation and they are going to see these assistants as something inseparable from them at some point in time. We're just getting to the precipice of that with this continuity and the work that some of these companies are doing where you're actually having conversation, where you're actually having something rich and rewarding, where you're not just getting question and answer, Q and A back and forth.
Theo Lau: [00:15:27] I think what is fascinating with all of this voice technology is also it provides another avenue for kids that are not as literate. They don't quite know how to get their way around computers and screens and typing. They might not even know how to read. Yet, this provides a way for them to learn enough to get knowledge and that's where it becomes fascinating. It opens up the possibility for technology to be inclusive.
Brian Roemmele: [00:15:57] This is the revolution. This is what it's about and it's heartbreaking to see the direction of education where we're not even doing cursive writing in public schools in California anymore, which is kind of mind-blowing to me. A lot of people say, well, it doesn't matter anymore. There are psychological reasons that we did certain things in the educational system and a lot of it is being brushed aside. Literacy and access to technology and the ability to get to points of information that you actually couldn't have gotten in a Google search. I've watched my children interact. Even with the limited tools that are available on the voice technology that we have today, let's just say Echo's sitting in the kitchen, and they drill down into, you know, a scientific question about why is the sun this color or why is the sky blue. I've seen these conversations with my children and it's phenomenal. These are things they would not have typed in a long chain and I've actually experimented, not just with my children, but I've done a couple of studies for a few clients where we worked with some children to try to see how far there is an edge in asking questions. We have not found the bottom. Most children will continue to talk and ask questions until they reach the limit where they kind of feel that, but they don't stop. They kind of go on a tangent, but they don't do that in Google.
Brian Roemmele: [00:17:23] So we're actually seeing another modality. Do all these children have the ability to type all these words out one character at a time? Are they literate enough? Again, it's coming back to where they have to read it and it's not summarized in the proper way. The beauty of the technology I'm talking about is you're not just using your voice, that's obviously apparent. What I'm talking about is the AI and the summarization contextual to you. How old are you, what are you really looking for, do you need facts? Most people don't need facts. They need a feel of information, and a lot of people don't like that, but examine your own life. Do you really need to know exactly how many minutes something is going to take down to the millisecond? No, you want to feel, is it going to be a half an hour or 45 minutes? That's how humans really interact. We only got the facts because computers need real hard numbers and we're still living in that realm. In the AI world that we're going to be going into that's voice mediated, we're going to be getting more contextual feels about information and that's where children really thrive, because they dip their hands into knowledge and they just look at it and say, OK, I love this, and they dive deep or they skip and they move on to something else. That to me is what true education is about.
Bradley Metrock: [00:18:40] Brian, there is a reason we brought you on the show. Katie, I'm going to shift back to you in just a second, but I want to rope in the second story here because this could be very much connected to what we're talking about.
Bradley Metrock: [00:18:56] The second story is from Harvard Business Review, it's called Marketing in the Age of Alexa. I found this story to be one of the most eye-opening ones I've read since we've started this show. Katie, I want to go back to you and keep this discussion rolling along the lines of what Brian was talking about, how kids will grow up becoming accustomed to having, I guess what you would call a relationship with a virtual assistant, where the assistant knows your context and it knows to a very deep extent things about you. This article walks through that. Share with me your thoughts on that story from the Harvard Business Review.
Katie McMahon: [00:19:39] It sets up a context of Eve and your assistant that follows you and in some ways is out ahead of your next needs. So, I think we all can envision a world whereby we have a personalized assistant to some degree, whether that is primarily on your current device, the phone, but then can transition into the auto and can move about to your home. Coming back to the data sets that underlie all of this, there are ultimately inherent biases and I think often about what would that mean when the assistant has been with you, does that mean the end of spontaneity? Does that mean the end of random nerdy moments of sorts because of the biases of the data, let alone the personalized, it's a flywheel of sorts. So that's just one nugget that we can chew on, and very specifically about marketing in the age of Alexa, this is where we're just at the brink of the creatives coming into the space.
Katie McMahon: [00:20:52] Two years ago when we unveiled a mobile first platform and went up and down Madison Avenue, saying like hey guys this is going to be amazing. If you want to represent your brands and take ownership have your own wake word. Let's call your thing, your thing, not use Amazon's wake word for the richness of your own experience. You know your users, you're best positioned for that stuff. That message fell on completely deaf ears in that they didn't know yet or they hadn't seen or gotten it.
Katie McMahon: [00:21:23] Fast forward to today, there are now dedicated conferences. NBC Universal put on a big event really awakening, if you will, the creative and the advertising community to what's coming down the pike. So, I say we're still very early in raw marketing via the mechanism of voice interface. I feel very strongly that it will require all of the tools to enable the end service owner, that product, that experience owner, to own their own customer, own their own data, own their brand words and feel. That's non-trivial on voice, whereas back in mobile when the SDK on iOS got released the floodgates of creativity opened wider than anyone could have foreseen. Within a month one developer in a garage in Malmo made a game that took off and actually became a millionaire. Then very quickly the majors, the ESPNs, the EAs, the MLBs, realized, wow, we need this talent and we need to be able to produce something extraordinary. They had all the access via the distribution of app stores, but they still owned it all because that could be done on those platforms where you are leveraging the accelerometer, the microphone, the graphical interface.
Katie McMahon: [00:22:53] Voice is an entirely different bucket of tools and you can't really spend ten years finding the chief scientist that knows acoustic modeling and a computational scientist who can then help marry between a data set of certain languages and turn it into ASR and also add NLU. It's so much more complicated in that the tools for a developer, let alone getting the creatives into the field. You haven't truly been there and now that there is women in this field we're certainly getting more slam brands who are waking up and saying, wow, you can allow us to own our wake word as well as our own, really the TTS, the text to speech, that's an auditorial experience. We know what Siri sounds like, we know what Alexa sounds like, and what about you, Brand X that has 80 million users on either mobile or in-car, do you want to sound like those iconic voices of major companies about which you might feel slightly nervous? So really, how do we make other people's visions come true? I think that's where marketing and those thinkers of the creative fields are about to have a heyday.
Theo Lau: [00:24:17] I love your last line, how do we make other people's marketing dreams come true? In all honesty, that's where the future needs to be. It needs to be about connecting people, connecting family. It's about effortless, it's about walking into an environment where we're seeing that things are done. You don't have to think about it, that will ultimately be the dream. It's almost like a scene from a Disney movie that came out a while back, WALL-E, where you have the robots running around, but take it one step further. Now you have little Echo devices or little voice devices running around. Why do I have to think about doing something when you should already know me as a consumer? You know my preferences, you know what I do every day on a day-to-day basis, you know where my appointments are, you know where I'm eating, you know what I need to buy from a groceries perspective. You know when I need to pay my bills. A lot of these things are just data points. With technology, with computational power, with AI, all of this can be done effortlessly, theoretically speaking, in the background for us. I'm thinking that's what it should be about. We shouldn't have to think about banking because it's something that happens in the background. We shouldn't have to think about a lot of these things that seemingly are much easier to do, but are still an effort. So that will be where I think and I hope we will be heading in the future.
Brian Roemmele: [00:25:52] Wonderful answers here too, and I go back to this. The most wonderful moments in most human beings lives are serendipitous and they're novelties. They were not planned, they were not programmed. They were not things that we thought we were getting to our destination about. It was a journey on that destination and I invite anybody listening to look at their own lives and say, were those wonderful moments were they really going to Hawaii? Maybe, but was it the trip? Maybe along the way on the trail you met somebody or maybe at a restaurant. What I'm trying to say is when you are programming an assistant, I'm trying to show you how hard it is for me with the data science world that we're in right now versus where this has got to go. Let's call it the graphics artist world of voice. Serendipity can be programmed and it can be programmed based on higher and higher levels of context about that person. Katie brought that up and it's beautiful what you said, I'm afraid that sort of novelty won't show up because everything is going to kind of just happen and these biases and things of that level. When we come back to advertising and marketing which is ultimately the monetization unit in voice is going to be commerce, period end of story. There's not going to be advertising and marketing in the way we think. We're going to have to go through the training wheels era, which is going to be the next three to five years, where everybody is going to test all these different things.
Brian Roemmele: [00:27:21] Unfortunately, I can tell you where it's going to wind up and that's going to be that people are just going to push back. The Facebook effect that we've seen the last few months maybe is precipitating that a little faster because people are going to start having the dialogue that, Bradley, you and I have talked about forever, about what privacy really is. I say that is the defining sort of discussion of this generation: what do you really do with privacy? Is your data, is your personality, is your context going to be held in the cloud? Is it going to be on some universal device called Siri, Alexa, Cortana, or something like that, or is it going to be more local to you?
Brian Roemmele: [00:27:56] So, novelty and serendipity must be programmed into the system. How do you do that? Hire me and I'll show you how to do it because it is not all that difficult. I'll tell you where you're not going to get it. You're not going to get it in traditional machine learning. You're going to get it by going back to the late 1700s to maybe the early 1900s and understanding some of the great philosophers, the great psychologists, understanding Carl Jung, even Freud to a certain level, and understanding Myers-Briggs. Every time I bring it up it's still new to people. It's like, if you're building a personality within a device then you need to really understand some guiding principles of how humans interact. Unfortunately, people programming it may have the least interactions with human beings. They're in a cubicle programming most of the time and they're saying, hey boss, I'm going to program this really great interaction. Is it really great? Maybe it is for that individual, but there is a variety of human beings that walk on the earth and a lot of them are not technologists and they don't talk in very abrupt sort of manners.
Brian Roemmele: [00:28:58] Now, getting into marketing. The opportunity for commerce inside, and that's really what the Harvard Business Review is talking about. I might say I might have been at least indirectly interviewed for this article, let's put it that way. The opportunity for marketing and the opportunity, like Katie said, for a brand to own what their logo looks like in a VoiceFirst world. You go to a corporation. I'm doing this pretty much every moment of the day. I'm speaking with large corporations and, like Katie said, two or three years ago it was, what are you talking about with voice? Now, they're freaking out and the problem is there's very few people to talk to. It's either let me develop you a skill for your business, which is great, which is like putting your business card on the Internet. In the year 2000, ok big business, what are you going to do? I'll put my business card up there. Obviously, that was not the answer. That was, you know, a web developer, a few lines of HTML code.
Brian Roemmele: [00:29:57] That's kind of where we are right now. I'm not putting any of that down. That is a necessary step, but if you don't have a plan, if you are a major brand or if you're working at one of these large companies who are putting out voice products and you're not thinking about this on a holistic level, why is a company coming to this? Is it to reach their customers? Yeah, then what? Well, I'll make it fun and entertaining. Yeah, then what? All of those sorts of answers come to a dead end and then you reach stagnation level, and that's kind of what I alluded to early on.
Brian Roemmele: [00:30:31] The next step for a brand is to take a deeper look into it and take a lot of steps back and say, we spent how many millions or billions of dollars to localfy and brand our company. We're going to use the default voice, the default architecture, the default personality type that comes along with these systems. We get to use laughs every now and then or we can use irony and all these really crude tools. We can make it sort of snappy sounding and we can kind of make it sound like they're a little ironic or something. None of that works unless you go back to the human psychology that dictates what those personality types are.
Brian Roemmele: [00:31:14] Here's what happens in the VoiceFirst world: we don't see your logo. Sure, we might have mixed modalities and we might see it, but we don't see your logo. We might not necessarily see your product, and here's the killer. We may not brand specify. We may say paper towels, we may not say Bounty. We may say get me a hamburger and we may not say a specific organic, non-grain-fed range beef, whatever. Now where does that come in? That comes in to the real power of this and that's the commerce engine. That is, if the context on this individual is high enough, there is serendipity to say, hey, there's a new restaurant that has organic, grain-fed burgers, or Whole Foods now can deliver to you via Amazon's drone this organic hamburger, grain-fed or whatever you want, but I'd say grass-fed is probably better, or you can get whatever you'd normally like.
Brian Roemmele: [00:32:16] That's the marketing we're talking about. It's not, well, this person bought this once before, maybe they'll buy that. That's a very crude attempt. How this is working is more on the personalization side, and your agent, your local assistant, is going out finding this stuff. It's not being pushed at them. This is that slight shift that most people are not going to recognize, but it's going to probably be the shift we need for the privacy issue. Advertising will become pull and not push. We will demand certain types of things through our agents rather than having it pushed at us. My answer to this is that biases are going to be there, but you're going to also have serendipity and novelty coming into it. That will make it ever more, unfortunately and maybe fortunately, addictive for the user because you don't want to leave that space once you get that capability.
Bradley Metrock: [00:33:12] That's a phenomenal conversation. I'm going to leave that right where it is. We're going to move on to story number three. This is interesting. It's a five-part story, and we've linked to the first part in the news of the week. Google has a new podcasting strategy that could double audiences worldwide. This is kind of interesting. Brian, I want to start with you. We've talked before over the course of last year, over different episodes of this show, about Amazon versus Google and the back and forth. I want to get your thoughts on this story, but I also want to get your thoughts in general about where Google sits right now.
Bradley Metrock: [00:33:53] I'm going to ask the rest of the panel this question too. Give me your thoughts on the story but also give me your thoughts on where Google sits right now competitively in the voice assistant, smart speaker space relative to Amazon.
Brian Roemmele: [00:34:09] That's a great question, Bradley. Podcasts, or what were podcasts and now are going to be essentially VoiceFirst casts, if you really want to look at how these things are going to be, are going to become an incredible opportunity. If we thought the podcast and the desktop publishing era and the blogging era was a form of expression, we're going to see that explode. With some intelligence, the system will be able to find very unique content over time that will be brought to you that really makes sense. There's going to be near field and far field sort of interactions. At this point it's more or less AirPods that are doing near field, and we don't want to go down the Apple dead end that they are in with voice right now. A lot of the podcast consumption is going to be near field, it's going to be when people are on their way to do something. One of the subject matters we can talk about later is in-car, in the automobile. There's a tremendous opportunity to bring you summarizations of podcasts, full podcasts and things of that nature. How is Google doing? I think they're moving very quickly. I think they're moving faster than most people recognize, but unfortunately I don't think Google's approaching the market nearly as efficiently as they could.
Brian Roemmele: [00:35:34] Certainly, not as bad as Apple is and we could talk hours, and we have. The very first conversation I had with you was about Apple's failure to recognize that voice is its own thing. They had so many apologists out there saying, oh Apple, attaboy, you don't need to get involved, and that everybody is going to gesture forever. It's like, yeah, everybody's going to be on a teletype screen forever too. We have to start seeing that it's not just about the horsepower, what I call the electricity of AI, and that is recognition and all that. That's going to become pretty cheap and fairly available. It's going to be what you're doing with that data, how you're protecting the user's privacy and how you're interacting with that user. That gives Apple an edge, gives them an opportunity to sort of leapfrog. Google has an opportunity to leapfrog Amazon right at this moment, but I don't think they have the strategy to do that. They're certainly hiring some amazing people and I'm forming a prayer circle that they utilize their talent in the best possible way.
Theo Lau: [00:36:35] That last comment was funny, Brian. What intrigues me the most about the potential of it, and you know we talked a little bit about it with the last news story, is the personalization of it. I listen to podcasts mostly when I'm in transit, when I'm in a car, when I'm moving around. I would love to be able to have a seamless experience. It doesn't matter if I'm in the car and I'm listening to it on the in-car audio. Then I move on to the Metro or migrate to my phone, and then I get home and it will continue to play back on a device at home. Why not? It would be great. All of these devices should know me. They should know where I am. I would love to have a seamless experience across devices, it doesn't matter what device it is. It should not be limited to where you are. That would be what I would look for. Then something else, Brian, that you talked about that intrigued me. What if we have one smart AI to control it all, like in Lord Of The Rings, in one blink of an eye? What if there's one smart assistant, or they can just talk to each other, they can sort things out? Then they can just make our lives easier. Why does it have to be up to the consumer to figure out what technology am I using, what device am I interacting with?
Brian Roemmele: [00:37:56] I fully agree, but the thing is, I don't mind the AI itself. I mind where the contextual data is going to be stored for the integrity of the human beings on this planet. I don't want to see everybody's content stored in the cloud available for somebody to peruse and to do a pay per click advertising routine on it.
Katie McMahon: [00:38:18] Remember the amount of data it takes to ultimately power seamless, fluid conversations, where you could be asking, what is the nearest restaurant that has outdoor seating and is open right now? Oh, it's in San Diego. Well, what's the weather like there? That ability to have that local search, and then the weather clicking in, and every other set of what we call domains, that is an architectural feat. Currently, the big guys really have their closed ecosystems. We've taken a position of, hey partner, we're going to work really hard and put up the Yelp data set, for example, so that you can interface naturally. You can be really refining in your request, that you want to go for an Asian lunch, but not Chinese or Thai. You can do that, but wouldn't it be nice if the next developer that has some cool idea doesn't have to go to Yelp and spend six months in a BD cycle to get through to that? So, wouldn't it be nice if there's a repository where there are all these sets of domains that actually are interoperable and they work together? If that data set owner chooses not to make it extensible, they can keep it private. That's sort of our concept behind the collective AI. The acceleration point is that new developers coming onboard, ready to develop for their robot, or their drone, or their refrigerator, or coffee machine, have already unlocked hundreds of domain sets versus just starting with one or two. The data point there is to look back to what Siri launched with. It was a handful, maybe six or seven domains, and by that we mean weather, sports, ability to call contacts. That's part and parcel today as a starting point. Why do they only have really 25, 26 domains today, and then ship a product that doesn't even match that on-iPhone experience?
Katie McMahon: [00:40:22] I think the take home message is, the voice expectation. We've all known the future when you envision far out we'll be talking to these robots. That we're not going to target speak and we're not going to call each thing Alexa. So even the greatest product marketing genius horsepower of the current technology world, failed because you couldn't just product market and say it's got acoustics and the music experience is amazing. The bar is going to be set in a voice interfacing level and once it's set anything that is less than that bar feels rusty, creaky, old and really should just be deprecated. Like boom, that's my point.
Brian Roemmele: [00:41:03] Katie, I've got to add on to that and it's absolutely brilliant. I call this neurons because the human mind doesn't section off all the different knowledge sets it has. There are neuronal connections between every memory, everything you've ever learned. When we're building these new skills of the future, they're going to become interdependent upon each other. That's exactly what the whole Houndify concept is. If we don't work in that direction and understand that there are interdependencies, monetization is going to fall apart. The biggest challenge is how do you monetize that? What if there are 97 interdependencies, and if one pulls out, all of the other interdependencies fall apart? Who's going to make that all work? Who's going to compensate all those developers?
Brian Roemmele: [00:41:51] I've spent the better part of 35 years thinking about that and everybody's moving in a direction where they're literally in the opposite direction to be able to make that developer ecosystem, including Apple. Apple had the opportunity to do that with the technology they acquired. The bottom line is, that's the expectation level and you have this amazing technology company that could have done something great and even they missed the mark. I'm telling you what it comes down to is leadership. It comes down to vision, and it comes down to people not wanting to take a risk.
Bradley Metrock: [00:42:25] Katie, I don't know how Apple has not acquired SoundHound, Inc. They would be much further ahead of the game if they did. Brian, I'm sure you would agree. I don't understand it.
Brian Roemmele: [00:42:33] It's the most disgusting thing on the planet, Apple not acquiring incredible technology. What is wrong, I don't know.
Bradley Metrock: [00:42:42] We'll save that, but a fantastic discussion from all three of you and of course awesome points. I'm going to call an audible here. We've got story number four, which is about connected cars. I want to get Katie to sound off on that. It's something that SoundHound, Inc. has been part of. Then I'll come back to the panel for commentary on story number five.
Bradley Metrock: [00:43:03] Story number four, the battle for connected cars is heating up. There's two parts to this. One is the Voicebot.AI Story of the Week, which is that Amazon Echo may be coming to a car near you. Story four B is related except it's elsewhere in the world. European automakers are adding Alibaba voice assistants to cars. Katie, I want to get your thoughts on this since we're running out of time, but share with us your thoughts on these stories sort of in aggregate. What is it that the layperson like me, or someone listening to the show, needs to know about connected cars? What do we need to take away with what's going on with connected cars? Why should we be excited about connected cars?
Katie McMahon: [00:43:47] Yeah, it's fantastic. The connected car revolution is going to enable mass market engagement with the cloud and be able to stream in real time, be able to access information, news and navigation very seamlessly. So, that's just basic step 1 of what connected cars do for you. What does this mean for the industry? This is an industry that has technically had some degree of quote "voice" in their vehicles, but often to a laughing stock of an experience. It didn't really work or it was very segmented and in a very specific use case. In the auto industry there is such a resurgence. I think five years ago if somebody said I'm going to go work for auto company X, there'd be a lot of smug looks at somebody, particularly in Silicon Valley, they'd say, really? Today, it's one of the hottest fields to be moving into because the connected car represents services. You've got chief digital officers wielding a lot of say now because again, it comes back to experience. So what does voice mean very near term for autos? There's going to be a continuation of, let's tether into my experience of CarPlay or the Android Auto app. I'll use my device to quote "be my connection". So that's our little half step towards a fully connected world.
Katie McMahon: [00:45:16] The next small step is those automakers might still feel they have to leverage the big giants. I can tell you from our conversations, and we work with nearly every auto, the one that we've been able to publicly announce was Hyundai, which at CES showcased their future vision of the houndified, in-car cockpit experience whereby you enter the car and it understands your schedule. You can seamlessly navigate by saying, take me to the nearest coffee shop, except Starbucks. Technical note: handling exclusions is very challenging. If you asked Siri what's the nearest restaurant except Chinese, it's going to be exactly what you didn't want. That goes off of the mechanism of NTD detection of an ASR step, then an NLU, versus a combined, much faster, robust speech technology. That experience of the future that the autos are looking at is why they're very attracted to, how do I own my own literal voice, how do I have the connection to the car. Remember, there's command and control, put down the window, turn on the lights. These sorts of things require deep embedding and that's an area where we work really deeply with the car to, and in some cases deliver, an embedded library that is able to do things offline. It's really critical that if you're stuck in a tunnel, you want to be able to say put down the window in the lower rear right car seat. Boom, you can have all that done. Auto is the tip of the spear for voice because it is safety first and foremost.
Katie McMahon: [00:46:56] My learning curve on autos is I was so struck when I read that the designers of autos always think first they're designing something that can kill, they're designing something that had that responsibility. It can take out lives. So when you think back to that premise of it's not always just the sexy sleek curves, extraordinary or functionality of an SUV, it's better than the design element. I think it has such a strong history across the automotive community that maybe that's also why they are likewise a step ahead. They need it for voice first, hands-free and then they know it will be a massive differentiation.
Bradley Metrock: [00:47:37] That's phenomenal, thank you for that insight and I greatly appreciate that. We will move on to story number five. Any time the words, cue the circus music, precede an article from This Week In Voice, you have done something wrong. Facebook pushes back smart speakers amidst turmoil and, Theo, I want to start with you and then I'm going to ask Brian and then ask Katie to conclude for us. I'm going to ask you a simple question, should Facebook be trying to create a smart speaker?
Theo Lau: [00:48:14] No, when I saw that I was like, why are they doing it? I don't understand it. They have enough going on already. If you think about Facebook, and it's funny that one of the biggest demographics of Facebook users is actually people that are older than 45. We all love to get on Facebook to look at and to post kids’ pictures, grandkids’ pictures, and to talk about our children and what not and yes, political opinion now. If you think about that, a lot of this is visual. It's really not so much from a voice perspective, or maybe I'm not visionary enough, and not to mention all of the data and privacy concerns that are going on there.
Bradley Metrock: [00:49:15] Brian, should Facebook be creating a smart speaker?
Brian Roemmele: [00:49:18] Yes and no, and again I don't want to beat the drum of leadership. You don't take somebody who has very little talent in understanding the future of this technology and say move fast, break things and figure this out. Especially in the backdrop of people waking up, not that it hasn't been there, but waking up to not just what is going on at Facebook. I think the big wake-up call is when it's Google's turn. It's going to happen, and we talked about this a year ago: when people start recognizing just how much data they've given up to get something for free, the pendulum is going to swing. I believe in productivity. I believe that a declaration of privacy must be declared by these companies today. Ask me, I'll do it for free. You need to do this. If you don't do it, it will be done for you in a very painful manner with congressional hearings and senatorial campaigns running on the back of it. In this year, we're going to see the other shoe drop and that's going to be that a lot more data is out there than people recognize and it's being used in ways that they would never have thought possible and people are freaking out. In that backdrop we're putting another system in somebody's home that's listening to them all the time, keeping high contextual data, and we're kind of willy nilly throwing it out there as just a portal into the Facebook experience. If you do that, don't waste your time.
Brian Roemmele: [00:50:50] If you want to raise the bar, if you want to raise the standard, maybe reach out into the world of people who can't pass the Google test and go into that world of people who have actually thought about this. I'm not just talking about myself. I always am, but I'm talking about a lot of people out there that thought about how these things are going to transform life. With Facebook data, they know your family, they know your family structure, and that contextual data is extremely important. They know your friends, what you like, what you dislike. A lot of information that you could not even get out of a Google relationship. So it is potentially extremely useful for the contextual interactions I talked about earlier in the show. If it's not done the right way it will be creepy, it will be ultimately rejected, and I think it was a good move for them to postpone it.
Bradley Metrock: [00:51:40] Yeah that's great, but I don't think the words Facebook and raise the bar belong in the same sentence but we'll see what they manage to do.
Brian Roemmele: [00:51:48] Again, I'm hoping for the best here. I mean I'm hoping these technology companies start waking up and some of the arrogance is kind of wiped off but we'll see. You know I'm puzzled.
Bradley Metrock: [00:51:59] Katie, should Facebook be creating a smart speaker?
Katie McMahon: [00:52:02] Facebook has an urgency right now to educate every user that they are the content of their service. So while we in the industry have understood that from the very beginning, it's now clear that the vast majority of people just actually weren't really cognizant of what lies beneath.
Katie McMahon: [00:52:26] Number one, they should be coming out with a two-bullet-point terms of service. You are the content, nothing will be private, and/or bifurcate and have a subscription service which does not do advertising. Or change the whole model whereby they say we will send you a check every month. It's very clear you're part of the content that makes us sell so well, therefore we'll send you a check every month. The urgency around Facebook's role and responsibility is huge, and building out hardware and software for voice interface should be a project on that campus. They would be remiss if they weren't working on it, or they weren't ready to go. Obviously, we know this is their decision and right to hold it back in the here and now. From the position of technology innovation being where the puck is going, of course they should be thinking through what does our business mean in a voice interface world. They would be remiss if they weren't, but the urgency in front of them is where the focus needs to be on solving that.
Bradley Metrock: [00:53:42] Excellent and great commentary.
Brian Roemmele: [00:53:44] I'm going to ask you, Bradley, how do you feel about this idea?
Bradley Metrock: [00:53:49] About Facebook creating a smart speaker? If you want my opinion, of course they shouldn't be doing that. I agree with Katie there, in the nuance of what she was saying. I have to be careful what I'm saying. I have to be careful what metaphor I use here. I mean, saying that Facebook should build a smart speaker is like saying my 6 year old son needs to be worrying about where he's going to go to college. We're many, many standard deviations away from that. Many increments of time away from that, and there are much more urgent things that have to happen first before we ever get to that point. Facebook's in trouble. People like to look at their financials and things, but people lose sight of the fact that those are often lagging indicators of what's really going on. Apple being a great example, as well, of sort of the decay and the rot from inside.
Bradley Metrock: If Facebook were around 10 years ago, they'd be gone because they would have gone through something like what MySpace went through where literally everybody just left overnight just as quickly as they arrived. Then you had to have something left that's a shell of its former self. Facebook did something smart. They went out and made some key acquisitions. They bought Instagram for one of the deals that people think is one of the best tech deals in history. They bought Oculus, they've got WhatsApp and those are the things that are keeping them sort of stable even while people are revolting in the way that seems to be this cycle of social media where we embrace it and then all of a sudden we wake up and we're like what did we do, what did we agree to. Then everybody leaves us and so these things that they bought are keeping people in place so that the same MySpace exit is not possible right now. Yeah, they've got bigger fish to fry than creating something that's going to be worse than Amazon and Google.
Brian Roemmele: [00:56:07] One more question on this since we're on the Facebook thing and it's always been on my mind. The moment you become an editor, the moment you say that I can edit certain information, I could take down information, you're now on a slippery slope on the world stage. That means that any tin plated dictator can call you up and say I don't like what some of my citizens are saying, take it down or we're going to shut down your service in our country. Doesn't the panel think that at some point these social media platforms, if they continue to get proactive in editing whatever is deemed as fake in the current vernacular, are going down that slope? Because you know in a republic democracy there's free speech. Some people may call something fake. It was fake news to say to doctors that there were microbes on their fingers when they were performing surgery in the 1800s. It was fake news, but nobody could prove it. If you're gonna go touch an infection and then deliver a baby, how dare you tell me to wash my hands. What I'm saying is, obviously, there are other things politically based that are obviously designed to intrigue.
Brian Roemmele: [00:57:14] Now we're hiring more and more and more and more editors. This is also going to play into our VoiceFirst world. What sort of information freedom is going to exist in the wake of all this? How is that going to mediate our future? When we're discussing with our personal assistant, are we getting pristine information, is it really ours, or is it somebody who's anesthetizing it and sterilizing it according to what part of the world you're in? Does that make sense?
Bradley Metrock: [00:57:42] Yeah, and I think Facebook did the world a favor. They have woken up a whole lot more people. When MySpace went away you didn't have senior citizens using MySpace. You had a much more limited group of people. Now, Facebook and this whole thing with them and the ebb and flow of social media, it affects everybody and so everybody is caught up in this privacy conversation. I tell you what those congressional hearings where Zuckerberg was in front of all those people answering questions, that's going to go down in history because that is such an incredible example of a lot of the problems that are at work with this. You know my opinion is that Facebook is going to continue to erode until they have to make fundamental changes in the structure of the company. I think what will happen, and it's in line with what Theo, Katie and you Brian have said, is that data has got to be managed properly. I think one of the takeaways from the Facebook era eventually will be that the way, the genesis of the company, is extremely important. We can't have a social media empire that becomes this permanent fixture in our constellation that was created by a college student looking to hook up with women. Rather, it's got to have a different DNA structure from the start and hopefully that's the takeaway from this. If it's not, then we really wasted a lot of time here because we'll just relive the same thing again.
Theo Lau: [00:59:18] I think voice is fascinating, voice technology. What I like about it is that it's something that can transcend demographics, that can transcend cultures, it can transcend a lot of things, which I think is why Facebook at its premise is about connecting people and about connecting culture. That's why it was successful. That's why we love it. That's why grandmas love it. With that being said, on privacy and data, a part of me also wonders, from a consumer's perspective, what is their trade-off? How much are they willing to give up in return for personalization, in return for convenience? I don't think that line is quite clear yet, which is why we are where we are right now. We've never lived through an era of massive economy of scale of the giants. I do wonder if what is happening with Facebook actually puts a crack in those economies of scale that enables the rays of light for potentially hugely consequential startups to break through and become the companies of consequence for the next paradigm of human interfacing with computing power.
Bradley Metrock: [01:00:43] The good news is that it's the three of y'all who are going to be so integrally involved in making sure that the world gets to where it needs to be with voice technology. Katie, Theo, Brian, thank you all very much for joining me today.
Brian Roemmele: [01:00:57] I was honored to be here today.
Bradley Metrock: [01:01:01] For This Week In Voice, Season 2, Episode 13, thank you for listening and until next time.