When we think of Alexa having a personality, we describe it as her responding to certain things in a certain way – what she does or doesn’t do based on a question or request.
That’s because Alexa doesn’t really have a personality. Almost every single AI or voice-based product doesn’t have one. When The Food Network discusses how Rachael Ray’s “personality” will come through the voice of Alexa, what they mean is that they’re adding expressions in her voice.
Expressions are what we’re substituting for personality in robotics and AI – a series of movements, sounds, and displays that a device can carry out to convey its actions, intentions, functions, and “emotions,” though that last one is a stretch for most of robotics today.
AI development has found a way to trick the user into perceiving personality where it doesn’t exist, adding layers of personal touches without the intricacies that make up our (oftentimes imperfect) personalities. Personality isn’t just a collection of things that you can theoretically do in a situation; it develops from the inputs and outputs of many situations, shaped by historical and emotional context.
That’s why Aibo’s personalities are so strange: the very idea of being able to cleanly categorize a personality and show it to you on a screen is only a slightly more advanced version of giving your GPS a different voice. If you’re a clingy person, would you act exactly the same way with exactly the same person every time? I doubt it.
What personality looks like
In developing true robot “personality,” you’re effectively dealing with a decision process that maps inputs to outputs, both in the moment and over time. Given an input, which output should the robot carry out? What does the input mean, and based on what other inputs the robot has, what action should it carry out as a result?
For example, if the robot hears its name said in an angry voice, its natural reaction could be fear, or an attempt to calm the person in question. Given a series of inputs and events, how does the robot predict people will react to an action it takes? The robot is seeking a favorable outcome – one where the person in question is happy.
The robot’s personality develops over a series of these reactions. If three out of four people the robot perceives as angry are cheered up when it makes a joke, then perhaps it will develop a jokier personality to deal with anger. A robot that never sees a favorable reaction when it responds to anger may develop a meeker personality, treating potential interactions as likely to turn out negative.
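To make that idea concrete, here is a minimal Python sketch of a robot tallying which responses to a perceived mood get a favorable reaction. Everything in it is an assumption made for illustration: the `ReactionLearner` class, the mood and action labels, and the smoothed success rate are hypothetical, not how Aibo, Alexa, or any real product works.

```python
from collections import defaultdict


class ReactionLearner:
    """Hypothetical tally of how often each action cheered someone up."""

    def __init__(self, actions):
        self.actions = list(actions)
        # (mood, action) -> [favorable reactions, total attempts]
        self.stats = defaultdict(lambda: [0, 0])

    def record(self, mood, action, favorable):
        """Store the outcome of trying `action` on someone in `mood`."""
        entry = self.stats[(mood, action)]
        if favorable:
            entry[0] += 1
        entry[1] += 1

    def success_rate(self, mood, action):
        """Smoothed rate, so an untried action starts at 0.5."""
        favorable, total = self.stats[(mood, action)]
        return (favorable + 1) / (total + 2)

    def choose(self, mood):
        """Pick the action with the best rate; ties go to the first listed."""
        return max(self.actions, key=lambda a: self.success_rate(mood, a))


# Three out of four angry people were cheered up by a joke.
jokey = ReactionLearner(["joke", "calm", "retreat"])
for favorable in (True, True, True, False):
    jokey.record("angry", "joke", favorable)

# This robot never saw a favorable reaction when it engaged with anger.
meek = ReactionLearner(["joke", "calm", "retreat"])
for action in ("joke", "calm", "joke", "calm"):
    meek.record("angry", action, False)

print(jokey.choose("angry"), meek.choose("angry"))
```

The two robots run the same code but end up with different habits purely because of the reactions they have seen, which is the point: the "personality" is the accumulated history, not a preset label.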
The simplification of personality is why you’ll have scenarios with Alexa or Siri where they chime in at the wrong time. They don’t really have personalities – they may have personality-adjacent expressions, but they lack the nuance and processing of a personality to judge tone or emotion, or to learn over time.
In many ways, that’s totally fine for a voice assistant – they’re simple input/output command-based systems that you don’t seek companionship with. They are also largely static in how they process feedback from a user, rarely if ever improving interactions based on whether they did the thing they were meant to do.
To put it simply, robots don’t learn, thus they don’t feel real or lifelike – they lack the ability to synthesize information and then act upon it in the way a human would.
So what can we do?
In the era of AI-facilitated perceptions, the number of inputs a robot can process has drastically increased, meaning that you need to create much larger rulesets either in advance or on the fly based on the inputs you receive. This is an incredible technical challenge for even simple problems like accents, multiple devices, the time of day you may be asking, and so on.
This is where the expansion of robotics and AI has to move into specialist devices: companions, assistants, friends, workmates, and so on. That lets us build just enough personality into the devices that need it. Your Alexa could interpret emotional context and react as necessary (e.g. by not giving you an output that would annoy you, or by learning what you mean beyond your words) without developing a playful or disruptive personality that would, in time, become incredibly annoying.
Conversely, if we have a companion AI, we want one that will also have developed sensory elements to help it understand what the user requires, create rules as necessary, and choose an output that will make the user feel the way the companion wants them to.
That thing the AI is looking for is the key. In the case of Alexa, you’re asking her to perform one thing – play a song, turn off a light, etc. In the case of a companion, you’re not necessarily asking the AI to do something, but the AI is developing a ruleset that reacts and anticipates.
If you come home one day and look sad, the robot may seek to cheer you up. If you come home every day, at the same time, looking sad, the robot will learn a routine – it may learn that the thing you need at that time is, indeed, a particular song. It may learn that the best thing it can do is leave you alone – and that may become part of what it does for everyone.
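One rough way to picture that kind of routine-learning is the Python sketch below. It is only an illustration under assumed names: the `RoutineLearner` class, the hour and mood labels, and the three-observation threshold are all hypothetical choices, not a description of any shipping companion robot.

```python
from collections import Counter


class RoutineLearner:
    """Hypothetical learner that anticipates a need once a pattern repeats."""

    def __init__(self, min_observations=3):
        self.min_observations = min_observations
        self.seen = Counter()    # (hour, mood) -> times observed
        self.worked = Counter()  # (hour, mood, response) -> favorable outcomes

    def observe(self, hour, mood, response, favorable):
        """Log what the robot saw, what it tried, and whether it helped."""
        self.seen[(hour, mood)] += 1
        if favorable:
            self.worked[(hour, mood, response)] += 1

    def anticipate(self, hour, mood):
        """Return the response that has helped most, or None if no routine yet."""
        if self.seen[(hour, mood)] < self.min_observations:
            return None
        candidates = {
            response: count
            for (h, m, response), count in self.worked.items()
            if h == hour and m == mood
        }
        if not candidates:
            return None
        return max(candidates, key=candidates.get)


routine = RoutineLearner()
routine.observe(18, "sad", "play_song", True)
routine.observe(18, "sad", "play_song", True)
too_early = routine.anticipate(18, "sad")  # only two sightings so far
routine.observe(18, "sad", "leave_alone", False)
learned = routine.anticipate(18, "sad")
print(too_early, learned)
```

The threshold is what separates a one-off bad day from a routine: the robot only starts anticipating once it has seen the same situation enough times.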
As developers and innovators we’re used to trying to find a way of categorizing anything and everything. The truth is that a personality is not fully categorizable. It’s nuanced, it’s empathetic and it’s ever-changing, just like we are. Understanding that is how we truly create robotic personalities – ones that live and grow like we do.
Published April 22, 2019 — 11:00 UTC