Mickey Mouse is easily the most recognizable cartoon character of all time. Since his creation, Mickey has starred in thousands of movies, shorts and cartoons in a hundred different roles and incarnations.
Through it all, audiences flocked to see him, drawn in by animation and production quality far beyond other cartoons of his day, but also by his bright and vivid personality.
With Siri, its voice activated assistant for the iPhone 4S, Apple is moving beyond simple voice control and into creating a personality, a way for people to interact with their devices on a level beyond poke and swipe.
In doing so, it has presented itself with a substantial task. Moving from text or touch input to voice poses incredible technical challenges. Accurately interpreting the spoken word and translating it into text has eluded many of the smartest minds in computer science for years, only recently becoming viable.
But far more interesting than the hurdles in accurate speech-to-text is how Apple is attempting to change the way that we feel about our devices. For the first time, our phones are emoting to us, and we’re being asked to do the same to them.
What is Siri?
In my review of the iPhone 4S I covered Siri's basics and its creation. If you've read that, you can move on; otherwise, some scene setting is in order.
Apple's Siri assistant is a new feature of the iPhone 4S that replaces the never-great Voice Control. To activate it, you hold down the home button until the Siri icon appears, then speak.
You use natural speech to communicate with it, as you would with a personal assistant. If you would like Siri to make an appointment for you, for instance, you might say “Siri, please make me an appointment for Tuesday at 3 o’clock with Tom Wilson.” Siri will set up a new event on your calendar for that date and time and link it with a contact in your address book that matches that name.
This kind of contextual logic is exactly what has been missing from voice control on phones: the ability to speak as you would to a human and not have to reconfigure your speech patterns and thought processes to speak to a computer.
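To make the idea concrete, here is a toy sketch of turning a spoken request into the structured fields a calendar app needs. This is purely illustrative and assumes nothing about Apple's actual implementation, which relies on statistical language models rather than pattern matching; the goal is just to show what "free-form speech in, actionable slots out" means.

```python
import re

def parse_appointment(utterance: str):
    """Extract day, hour, and contact from an appointment request.

    A hypothetical slot-filling sketch. Real assistants like Siri use
    trained language models, not regular expressions, but the output --
    structured fields an app can act on -- is the same idea.
    """
    pattern = re.compile(
        r"appointment for (?P<day>\w+) at (?P<hour>\d{1,2})\s*o'clock"
        r" with (?P<contact>[\w ]+)",
        re.IGNORECASE,
    )
    match = pattern.search(utterance)
    if match is None:
        return None  # the request didn't fit this intent
    return {
        "day": match.group("day").capitalize(),
        "hour": int(match.group("hour")),
        "contact": match.group("contact").strip(),
    }

request = "Siri, please make me an appointment for Tuesday at 3 o'clock with Tom Wilson."
print(parse_appointment(request))
# → {'day': 'Tuesday', 'hour': 3, 'contact': 'Tom Wilson'}
```

The hard part Siri solves is that real users never phrase the request the same way twice, which is exactly why the rigid pattern above would fail in practice and a learned model is required.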
Apple didn’t create Siri or the technology that helps it be so scarily accurate. Siri was actually an app on the App Store that Apple acquired in 2010 and the voice interpretation technology is powered by Nuance, although you won’t find any references to it on Apple’s websites.
The Nuance deals have been in place for some time, although Siri was kept mostly a secret until a few weeks before the iPhone 4S launch. The deal to purchase Siri went through in 2010 and since then, Apple has been tweaking and tuning Siri to work exactly as it wants it to.
Siri has its roots in a project called Cognitive Assistant that Learns and Organizes, which was funded in part by DARPA. Members of the project as well as employees of SRI International, who coordinated the project, split off and formed Siri shortly thereafter. SRI’s Norman Winarsky called Siri a “world changing event” in an interview with Seth Weintraub of 9to5Mac. “Right now a few people dabble in partial AI enabled apps like Google Voice Actions, Vlingo or Nuance Go,” said Winarsky. “Siri was many iterations ahead of these technologies, or at least it was two years ago. This is real AI with real market use…Apple will enable millions upon millions of people to interact with machines with natural language,” adding, “we’re talking another technology revolution. A new computing paradigm shift.”
Whether Winarsky's predictions will come true remains to be seen, but it is clear that Siri works on a level far beyond that of most voice control apps or services.
Part of Siri’s usefulness on the iPhone is the fact that, unlike most native apps including the original Siri app, it has unfettered access to the system apps and OS. This means that it can leverage your contacts, calendars, text message system and Apple’s new Reminders app in ways that would never be allowed by a third-party app.
But simply processing requests logically and accurately isn't Siri's real strength; the key is the way it communicates with users.
Mickey Mouse and Donald Duck
Walt Disney, either on a train ride from the East Coast—or shortly after appropriating a rough character idea from collaborator Ub Iwerks, depending on who you ask—created Mickey Mouse, a plucky rodent with spindly limbs and a bit of a mean streak. In his initial incarnation, Mickey thought nothing of abusing a few animals and taking revenge on his enemies.
As time moved on, Mickey’s rough edges were softened, along with his nose and ears, but what remained was his can-do attitude, a pervasive sense of optimism that hard work would win out in the end. This resonated with depression-era audiences looking for an escape and also for hope that things would get better if they worked hard enough.
It wasn’t always this way. In earlier shorts produced by Walt’s studio Mickey was a main character, but he was also simply a tool to be used for a laugh. There was very little sense of self-awareness or consistency on display and many of his appearances were just strings of physical gag humor.
Walt continued to refine the character in his famous story sessions, where he would act out the parts of all of the cartoon’s cast to his animators and help to define what those characters would or wouldn’t do according to their nature. As attention was paid to Mickey, he grew to be more of a wholesome front man for everything Disney, and in doing so lost some of his mischievous spark.
This softening of the character eventually forced Disney to introduce Donald Duck as an irritable, quasi-malicious foil to Mickey's good-natured charm.
Donald eventually grew more popular even than Mickey Mouse, likely due to the fact that he was more relatable.
This brings us to Siri, which has much of the same smart-alecky nature of Donald, or early Mickey, that endeared them to audiences. But was this personality simply a programmer's easter egg? Or was it something more calculated, and perhaps more central to the initial success of Siri than you might think?
The cult of personality
Apple has long been associated with a personality. The public face of that personality up until his death was Steve Jobs. But beyond just Jobs and his forceful presence, Apple has also long had an association with being the preferred computer of artists and expressive people. The IBM PCs were for the straight-laced business people and Macs were for the free spirited thinkers and creators.
This persona spoke to people and drove Mac sales long before Apple was beating everyone on features and price, the way it does with the MacBook Air today.
If Apple pulled off a trick with Siri, it is the way it imbued the assistant with personality. The slightly snarky, but friendly and helpful, tone it takes practically invites you to use it: at first just because you want to hear what it will say, but eventually because it is genuinely helpful.
If you could have imbued a Mac with a logically responding speaking personality in 1984, it may have sounded a lot more like Siri than just a machine that could speak text.
The most telling thing about Siri, and why Apple most likely didn't just call it 'assistant', is that you find yourself calling it by name, whether you realize it or not. It isn't necessary, but addressing it that way ends up feeling more natural.
Siri isn’t just a feature, it’s a personality that draws you in. Apple found a way to take its cult of personality and distill it into a product born out of acquisition and refinement, a process that Apple does so well. In the process, it imbued it with a sense of humanity that resonates.
The future of Siri
As Apple sells millions of iPhone 4S devices over the holiday shopping season, Siri will get a workout. In this initial phase, Apple will be gathering information about how people use it and it will use that data to refine how Siri interacts in order to make it more natural and more effective (we won’t get into the other ways that Apple might use that data here, but you can bet they have other plans too).
As Siri gets more efficient, Apple will expand its capabilities, but probably not by offering access to developers with an API, at least not any time soon. Instead it is more likely that we will see Apple announcing deals with more information providers that will give Siri additional pools of data from which to give answers. Imagine a deal with ESPN for sports scoring or fantasy points, or a tie in with Fandango so that Siri can read you off the movie times at a requested theater.
Beyond more licensing deals for its brain, Apple may begin adding hooks it can use to pull information stored inside the iCloud data uploaded by an app, grabbing your list of action items from OmniFocus or reading recent messages from an IM client. This would let you pull information from inside an app easily without having to clutter up your springboard or Notification Center with ugly and jarring widgets.
Beyond that, I believe we will see more direct app integration, most likely with the use of an app's name (look for the addition of a phonetic name field in app submissions) followed by a command. Perhaps, "ESPN Fantasy Football, what is my fantasy score?" or "OpenTable, book me a table at Chez Quis."
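An app-name-plus-command scheme like the one speculated above could be sketched as a simple registry that maps a spoken app name to a handler. Everything here is hypothetical (the app names, the handlers, and the comma-based split all stand in for whatever phonetic matching Apple might actually build):

```python
def route_command(utterance: str, registry: dict):
    """Route an 'AppName, command' utterance to a registered handler.

    Illustrative only: a real assistant would match the spoken app name
    phonetically; here we just split on the first comma.
    """
    app_name, _, command = utterance.partition(",")
    handler = registry.get(app_name.strip().lower())
    if handler is None:
        return "Sorry, I don't know that app."
    return handler(command.strip())

# Hypothetical handlers standing in for third-party apps.
registry = {
    "opentable": lambda cmd: f"OpenTable handling: {cmd}",
    "espn fantasy football": lambda cmd: f"ESPN handling: {cmd}",
}

print(route_command("OpenTable, book me a table at Chez Quis.", registry))
# → OpenTable handling: book me a table at Chez Quis.
```

The phonetic name field speculated above would exist precisely so this kind of lookup could work on sound rather than spelling.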
Whatever path Siri takes in the future, whether Apple keeps it close to the vest or allows developers access to its features sooner rather than later, you can bet that one thing will remain constant: Siri will get better and better at interacting with us on a human, rather than a machine, level.
Siri may never achieve the notoriety of a character like Mickey or Donald, especially as it has no visual component. But soon we won't think it strange that our phones and computers have live-in personalities, and that may very well change how we interact with all machines everywhere.