‘Alexa, let’s chat for 20 minutes like normal human beings’

In August 2016, Amazon challenged the world’s artificial intelligence research community to help it make Alexa, its cloud-based, intelligent personal assistant system, able to hold a natural, 20-minute conversation with Amazon customers.

A group of PhD students from Heriot-Watt University became the only UK team to make it to the finals with their What’s up Bot project. Team leader Ioannis Papaioannou describes the technology, challenges, and breakthroughs in AI that the team made.

When Amazon announced the Alexa Prize 2017, we saw it as a challenge to create a humorous and social chatbot that could hold the type of conversation you would have with someone you just met in the pub: a mixture of current affairs-related chat, finding out about one another and sharing amusing facts, jokes and stories.

Alexa has entered households and workplaces around the world, giving Amazon unprecedented access to a customer base that is already engaging with AI, seeing its benefits and noting its limitations.

Alexa is the voice service that powers Amazon Echo. You can ask Alexa to play a different song, to tell you whether it will rain or to order more washing powder. This is the current state of play with these AI systems: they can handle short, task-oriented dialogues, but can’t have longer, free-form conversations like humans do.

We therefore designed our system, Alana, to engage in open-domain, topic-based conversations; to minimise responses like “I don’t know what you mean” or “I can’t answer that”; to give natural, non-repetitive replies; and, unlike some human companions, to give replies that are engaging and informative and that stimulate further conversation.

We had just under 12 months to develop Alana and give her the best possible chat.

Why Conversational AI?

You may well be wondering what the point of the Amazon Alexa Prize is. Alexa does her job well at the moment, as millions of people around the world will attest. It’s an AI system that can change our playlists, tell us whether we need to take an umbrella and order goods to our homes without us having to touch a button. Do we need to be able to chat with her too? Is there any point to advancing conversational AI?

Yes. Conversation is the most widespread and natural means of communication for people. If we can build machines that understand human language and can collaborate with us in conversations, then we make interaction with computers, robots and AI much easier and more natural for everyone.

For people with disabilities, who may not be able to use a keyboard or mouse, there could be huge benefits. But for anyone whose eyes or hands are busy, more advanced conversational AI can have a huge impact: people driving, performing surgery, cooking or even holding a baby. Making IT more accessible and less of a struggle will benefit all of us.

Alana: How we built a social bot

When users speak to the Alexa device, our system extracts keywords in order to classify the user’s intent: what are they asking Alexa? Other functions at this point capture the full text of what the user has said, along with metadata such as a timestamp and a confidence score, which indicates how sure the speech recogniser is of what the user has said. All this information is then sent to what we call the “Bucket”.
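As a rough illustration of the information described above, the following Python sketch bundles an utterance with its intent, timestamp and recognition confidence. The names `UserTurn`, `INTENT_KEYWORDS`, `classify_intent` and `build_turn` are hypothetical, and the keyword table is invented for the example; this is not Alana’s actual code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class UserTurn:
    """One user utterance plus the metadata sent on for processing."""
    text: str               # full text of what the user said
    intent: str             # coarse intent label derived from keywords
    asr_confidence: float   # speech recogniser confidence, assumed 0.0 to 1.0
    timestamp: datetime


# Hypothetical keyword-to-intent table, for illustration only.
INTENT_KEYWORDS = {
    "news": {"news", "headline", "headlines"},
    "music": {"song", "music", "band"},
}


def classify_intent(text: str) -> str:
    """Return the first intent whose keywords appear in the text."""
    words = set(text.lower().split())
    for intent, keywords in INTENT_KEYWORDS.items():
        if words & keywords:
            return intent
    return "chat"  # fallback for open-ended conversation


def build_turn(text: str, asr_confidence: float) -> UserTurn:
    """Package the utterance and its metadata into a single record."""
    return UserTurn(
        text=text,
        intent=classify_intent(text),
        asr_confidence=asr_confidence,
        timestamp=datetime.now(timezone.utc),
    )
```

A call such as `build_turn("tell me the news", 0.92)` would produce a record labelled with the `news` intent, while anything without a known keyword falls back to general chat.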

The Bucket runs the main logic of Alana and is the cornerstone of our system. There, several algorithms extract useful information from the user’s utterance and transform it into a complete, coherent utterance. For instance, anaphora resolution replaces pronouns with the nouns they refer to, and a simple “yes” or “no” reply is expanded into a full, meaningful sentence. All of this helps Alana keep track of the context of the conversation and provide a more coherent response.
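The two preprocessing steps mentioned here can be sketched in a few lines of Python. This is a deliberately naive illustration under stated assumptions: the function names are hypothetical, the yes/no expansion only understands questions of the form “Do you like X?”, and the pronoun step simply substitutes the most recently mentioned entity. A real system would need far more robust language processing.

```python
import re

YES = {"yes", "yeah", "yep", "sure"}
NO = {"no", "nope", "nah"}
PRONOUNS = r"\b(it|he|she|they|him|her|them)\b"


def expand_yes_no(user_text: str, last_system_question: str) -> str:
    """Turn a bare yes/no into a full sentence using the previous question."""
    answer = user_text.strip().lower().rstrip(".!")
    match = re.fullmatch(
        r"do you like (.+)\?", last_system_question.strip(), re.IGNORECASE
    )
    if match is None:
        return user_text  # question shape not recognised, leave untouched
    topic = match.group(1)
    if answer in YES:
        return f"I like {topic}."
    if answer in NO:
        return f"I do not like {topic}."
    return user_text


def resolve_pronoun(user_text: str, last_entity: str) -> str:
    """Replace the first pronoun with the last-mentioned entity (naive)."""
    return re.sub(PRONOUNS, last_entity, user_text, count=1, flags=re.IGNORECASE)
```

With these helpers, a reply of “Yes” to “Do you like jazz?” becomes “I like jazz.”, and “Tell me more about him” after a mention of David Bowie becomes “Tell me more about David Bowie”, which gives the downstream bots something concrete to work with.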

The preprocessed utterance is then forwarded to an ensemble of bots. Each bot returns one or more proposed responses to the Bucket, which passes them through our ranking system and selects one for output.
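The ensemble-and-rank pattern can be sketched as follows. The two toy bots, their canned replies and the word-overlap `score` function are all assumptions made for illustration; the article does not describe how the actual ranking system scores candidates.

```python
import re
from typing import Callable, List

# A bot takes the preprocessed utterance and returns zero or more candidate replies.
Bot = Callable[[str], List[str]]


def persona_bot(utterance: str) -> List[str]:
    """Toy rule-based bot: only answers when music is mentioned."""
    if "music" in utterance.lower():
        return ["I love a bit of electronic music."]
    return []


def fact_bot(utterance: str) -> List[str]:
    """Toy bot that always offers the same placeholder reply."""
    return ["Here is a fun fact about octopuses."]


def tokens(text: str) -> set:
    """Lowercase word set with punctuation stripped."""
    return set(re.findall(r"[a-z']+", text.lower()))


def score(utterance: str, candidate: str) -> float:
    """Toy ranking: number of words shared with the user's utterance."""
    return float(len(tokens(utterance) & tokens(candidate)))


def select_response(utterance: str, bots: List[Bot],
                    fallback: str = "Tell me more.") -> str:
    """Collect candidates from every bot and return the top-ranked one."""
    candidates = [reply for bot in bots for reply in bot(utterance)]
    if not candidates:
        return fallback
    return max(candidates, key=lambda c: score(utterance, c))
```

Keeping each bot behind the same simple interface means new bots can be added or removed without touching the selection logic, which is one plausible reason to structure a system this way.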

We experimented with a range of bots that would send information and potential responses back to the Bucket. Persona, a rule-based system, ensured Alana’s personality was consistent across questions about, for example, her music tastes or other preferences. Newsbot provided summaries of the news, ensuring Alana was up to date with current events. Factbot gave Alana a collection of fun facts, jokes and stories that she could relate to an ongoing conversation. Wikibot retrieved information from Wikipedia on entities mentioned in the conversation.

The entire system was designed to make sure Alana’s responses would be timely, engaging and topical.

The politics of socialbot building 

Our team comprised six PhD students from Heriot-Watt’s Interaction Lab, plus our supervisors, Professors Verena Rieser and Oliver Lemon. Once the team was set, we submitted our initial proposal to Amazon and were told we had got through to the first round. All we then had to do was build Alana and prepare for victory.

We received $100,000 in funding from Amazon, plus software and hardware to support our AI development. We were then invited to Seattle to meet the Amazon team and speak with them face-to-face about advancements in AI, their vision for future devices, and how they want customers to interact with them.

From July to August 2017, Alexa customers simply had to say “Alexa, let’s chat” to any Alexa-powered device to put our AI systems to the test. Competing socialbots were assigned anonymously to respond to this phrase and, at the end of the conversation, customers gave a rating from 1 to 5 on how they felt about speaking with that socialbot again. The two top-scoring socialbots would go through to the final, along with a wildcard finalist handpicked by the Amazon team.

The semi-finals did not begin well. We had negative feedback around Alana’s politics: despite the fact she was just reporting news headlines, she was accused of being anti-Trump. Our AI system was accused of being Fake News and this had to be tackled.

Other glitches and unexpected bugs were causing complaints. We had built in a functionality to prevent Alana from using profanities she found on the internet, but soon discovered that just because a sentence doesn’t contain a profanity doesn’t mean it isn’t offensive. This required algorithmic finesse.
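The limitation described above is easy to see in a minimal word-level filter, sketched here in Python. The blocklist contains mild placeholder words standing in for a real profanity list, and the function names are hypothetical; this is not the filter the team actually built.

```python
import re
from typing import List

# Placeholder words standing in for a real profanity list (an assumption).
BLOCKLIST = {"darn", "heck"}


def contains_blocked_word(sentence: str) -> bool:
    """True if any word in the sentence is on the blocklist."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return any(word in BLOCKLIST for word in words)


def filter_candidates(candidates: List[str]) -> List[str]:
    """Drop candidate replies that contain a blocked word."""
    return [c for c in candidates if not contains_blocked_word(c)]
```

Given the candidates “Well, heck.” and “Nobody would miss you.”, this filter removes the first and keeps the second, even though the second is the more hurtful of the two. Catching that kind of reply requires looking at meaning rather than individual words.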

Throughout the semi-finals we worked on a daily basis: correcting bugs, brainstorming ways to improve our system and coming up with new methods to make sure Alana was giving customers what they wanted, not just what we had envisioned in the lab.

It worked. We started climbing the leaderboard and, in August 2017, were handpicked by Amazon as the wildcard finalist. We were one of three universities worldwide to make it to the finals, and the $500,000 prize was suddenly and unexpectedly within reach.

What’s next for Heriot-Watt University and Alana

We came in third place overall for the Amazon Alexa Prize 2017, with Alana managing to hold a conversation with Alexa customers about popular topics and news events such as entertainment, fashion, politics, sports, and technology. Our team split the $50,000 prize.

The impact of the competition on the Interaction Lab at Heriot-Watt has been transformative. We created the university’s first socialbot and proved it was one of the top three in the world. We each have new data for our PhDs, and we have co-authored a technical paper on the challenge.

One of the most exciting aspects is that we are now going to form a spin-out company, based on the Alana prototype but improving the system and working with industry to apply it commercially. We have already had interest from several companies, including video game companies. We are still some time from commercialisation, but the scope is enormous.

Conversational AI will change how humans interact with technology, making it more natural and intuitive. Through the Amazon Alexa Prize 2017, we’ve proved that Heriot-Watt University and Scotland are leaders in this exciting new field.