An AI agent developed at the Georgia Institute of Technology automatically generates natural-language rationales in real time for its actions, ideally allowing people who aren’t experts in the field to interact with AI tools more confidently.
The project was spearheaded by Upol Ehsan, a PhD candidate in the School of Interactive Computing at Georgia Tech.
“There is almost nothing artificial about artificial intelligence,” Ehsan wrote in a blog post dedicated to the new tech. “It is designed by humans for humans, from training to testing to usage. Therefore, it’s essential that AI systems are human-understandable. Sadly, as AI-powered systems get more complex, their decision-making tends to be more ‘black-boxed’ to the end-user, which inhibits trust.”
Ehsan and his colleagues, alongside researchers from Cornell University and the University of Kentucky, designed an AI agent that could play the classic arcade game Frogger and generate on-screen explanations to justify its actions in the game. The goal of Frogger is to get a cartoon frog home safely without being hit by vehicles or drowned in a river.
A group of study participants watched the AI play the game and were asked to rate and rank three on-screen rationales for each of the AI’s moves. One explanation was written by a human, one was AI-generated and one was generated randomly; all were judged on confidence, human-likeness, adequate justification for the action and understandability.
Ehsan and his team reported that while human-generated responses still took the cake as the most preferred by participants, AI-generated explanations were a close second. AI-generated rationales were ranked higher by participants when they demonstrated recognition of environmental conditions and adaptability and when they communicated awareness of upcoming dangers and planned ahead for them. Responses that were redundant or stated the obvious were ranked lowest.
A follow-up study took humans out of the equation, asking participants to rank only AI-generated responses by preference in a scenario where the AI made a mistake or behaved unexpectedly. Rationales were either concise and targeted or holistic and focused more on the context of the game.
Participants favored answers that were holistic by a 3-to-1 margin, suggesting they appreciated the AI thinking about future steps rather than making decisions in the moment.
“This project provided a foundational understanding of AI agents that can mimic thinking out loud,” Ehsan wrote. “Possible future directions include understanding what happens when humans can contest an AI-generated explanation. Researchers will also look at how agents might respond in different scenarios, such as during an emergency response or when aiding teachers in the classroom.”
Ehsan et al.’s work was presented at the Association for Computing Machinery’s 2019 Conference on Intelligent User Interfaces (IUI 2019).