Posts Tagged ‘multistrategy learning’

An Inside Look at the 2017 Alexa Prize Finals

The Alexa Prize is an annual university competition to advance the state of Conversational AI. Last November, Rohit Prasad, vice president and head scientist, Alexa Machine Learning, and I had the pleasure of announcing the winner of the inaugural competition.

Judging a conversational AI competition is super hard because conversation is inherently subjective; there isn’t a clear right or wrong response at each turn in a dialog, nor a precise definition of what makes a conversation “coherent” or “engaging”. This short film explains how the finals were conducted, and showcases the winning teams from University of Washington, Czech Technical University, and Heriot-Watt University.

See what the best conversational socialbots can do today, and what more needs to be done to solve this very hard problem.


VIEW THE FILM:

youtu.be/WTGuOg7GXYU


SPEAK WITH THE SOCIALBOTS:
Just say “Alexa, let’s chat” to your Alexa-enabled device.

Announcing Winners of 2017 Alexa Prize

Earlier today, Rohit Prasad, vice president and head scientist, Alexa Machine Learning, and I had the pleasure of announcing the winner of the inaugural Alexa Prize competition for university students dedicated to accelerating the field of conversational artificial intelligence (AI).

Congratulations to team Sounding Board, an inspiring group of students from the University of Washington, whose socialbot earned an average score of 3.17 on a 5-point scale from our panel of independent judges and achieved an average conversation duration of 10:22. As the winner of our inaugural competition, team Sounding Board earned our $500,000 first-place prize, which will be shared among the students.

We also had the privilege of honoring and surprising our other finalists on stage. Our runner-up was team Alquist from Czech Technical University in Prague. We presented them with a $100,000 prize for their efforts. We also awarded our third-place winner, team What’s Up Bot from Heriot-Watt University in Edinburgh, Scotland, with a $50,000 prize.

CHAT WITH THE WINNERS:

Just say “Alexa, let’s chat” to any Alexa-enabled device. (If you’re outside the U.S., set your Amazon Preferred Marketplace (PFM) to U.S. or use a U.S.-based Amazon account.)


VIEW THE KEYNOTE:

youtu.be/HXtjdXjpJwI?t=32m18s


VIEW A SHORT FILM SHOWCASING THE WINNING SOCIALBOTS AND HOW THEY WERE SELECTED:

youtu.be/WTGuOg7GXYU


READ MORE:

developer.amazon.com/blogs/alexa/post/1a6a19d8-e45d-4b3b-981d-776a378ba625/university-of-washington-students-win-inaugural-alexa-prize

Conversational AI: The Science behind the Alexa Prize

Conversational agents are exploding in popularity. However, much work remains in the area of social conversation as well as free-form conversation over a broad range of domains and topics. To advance the state of the art in conversational AI, Amazon launched the Alexa Prize, a 2.5-million-dollar university competition where sixteen selected university teams were challenged to build conversational agents, known as “socialbots”, to converse coherently and engagingly with humans on popular topics such as Sports, Politics, Entertainment, Fashion and Technology for 20 minutes.

The Alexa Prize offered the academic community a unique opportunity to perform research with a live system used by millions of users. The competition provided university teams with real user conversational data at scale, along with the user-provided ratings and feedback augmented with annotations by the Alexa team. This enabled teams to effectively iterate and make improvements throughout the competition while being evaluated in real-time through live user interactions.

To build their socialbots, university teams combined state-of-the-art techniques with novel strategies in the areas of Natural Language Understanding, Context Modeling, Dialog Management, Response Generation, and Knowledge Acquisition. To support the teams’ efforts, the Alexa Prize team made significant scientific and engineering investments to build and improve Conversational Speech Recognition, Topic Tracking, Dialog Evaluation, Voice User Experience, and tools for traffic management and scalability.

This paper outlines the advances created by the university teams as well as the Alexa Prize team to achieve the common goal of solving the problem of Conversational AI.

Conversational AI: The Science behind the Alexa Prize

by Ashwin Ram, Rohit Prasad, Chandra Khatri, Anu Venkatesh, Raefer Gabriel, Qing Liu, Jeff Nunn, Behnam Hedayatnia, Ming Cheng, Ashish Nagar, Eric King, Kate Bland, Amanda Wartick, Yi Pan, Han Song, Sk Jayadevan, Gene Hwang, Art Pettigrue

Proceedings of the 2017 Alexa Prize
Invited talk at NIPS-2017 Workshop on Conversational AI
Invited talk at re:Invent 2017 (with Spyros Matsoukas)

READ THE PAPER:

arxiv.org/abs/1801.03604

WATCH THE TALK:

youtu.be/pn5QJQZjGpM

User-Generated AI for Interactive Digital Entertainment

CMU Seminar

User-generated content is everywhere: photos, videos, news, blogs, art, music, and every other type of digital media on the Social Web. Games are no exception. From strategy games to immersive virtual worlds, game players are increasingly engaged in creating and sharing nearly all aspects of the gaming experience: maps, quests, artifacts, avatars, clothing, even games themselves. Yet, there is one aspect of computer games that is not created and shared by game players: the AI. Building sophisticated personalities, behaviors, and strategies requires expertise in both AI and programming, and remains outside the purview of the end user.

To understand why Game AI is hard, we need to understand how it works. AI can take digital entertainment beyond scripted interactions into the arena of truly interactive systems that are responsive, adaptive, and intelligent. I discuss examples of AI techniques for character-level AI (in embedded NPCs, for example) and game-level AI (in the drama manager, for example). These types of AI enhance the player experience in different ways. The techniques are complicated and are usually implemented by expert game designers.

I argue that User-Generated AI is the next big frontier in the rapidly growing Social Gaming area. From Sims to Risk to World of Warcraft, end users want to create, modify, and share not only the appearance but the “minds” of their characters. I present my recent research on intelligent technologies to assist Game AI authors, and show the first Web 2.0 application that allows average users to create AIs and challenge their friends to play them—without programming. I conclude with some thoughts about the future of AI-based Interactive Digital Entertainment.

CMU Robotics & Intelligence Seminar, September 28, 2009
Carnegie Mellon University, Pittsburgh, PA.
MIT Media Lab Colloquium, January 25, 2010
Massachusetts Institute of Technology, Cambridge, MA.
Stanford Media X Philips Seminar, February 1, 2010
Stanford University, Stanford, CA.
Pixar Research Seminar, February 2, 2010

View the talk:
www.sais.se/blog/?p=57

An Ensemble Learning and Problem Solving Architecture for Airspace Management

In this paper we describe the application of a novel learning and problem solving architecture to the domain of airspace management, where multiple requests for the use of airspace need to be reconciled and managed automatically. The key feature of our “Generalized Integrated Learning Architecture” (GILA) is a set of integrated learning and reasoning (ILR) systems coordinated by a central meta-reasoning executive (MRE). Each ILR learns independently from the same training example and contributes to problem-solving in concert with other ILRs as directed by the MRE. Formal evaluations show that our system performs as well as or better than humans after learning from the same training data. Further, GILA outperforms any individual ILR run in isolation, thus demonstrating the power of the ensemble architecture for learning and problem solving.
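As a rough illustration of the ensemble idea, the sketch below has several independent components each propose a solution and a central executive keep the best one. It is a toy under stated assumptions, not the GILA implementation: the names `Proposal`, `make_component`, and `executive`, the component labels, and the single numeric `cost` used for selection are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Proposal:
    source: str    # name of the component that produced the proposal
    solution: str  # candidate solution (a plain string in this toy example)
    cost: float    # hypothetical quality measure; lower is better


# In this sketch a component is just a function from a problem to a proposal.
Component = Callable[[str], Proposal]


def make_component(name: str, cost: float) -> Component:
    """Build a stand-in component that always reports a fixed cost."""
    def propose(problem: str) -> Proposal:
        return Proposal(name, f"{name} plan for {problem}", cost)
    return propose


def executive(problem: str, components: List[Component]) -> Proposal:
    """Collect one proposal per component and keep the cheapest one."""
    proposals = [component(problem) for component in components]
    return min(proposals, key=lambda p: p.cost)


components = [
    make_component("ilr_a", 3.0),
    make_component("ilr_b", 1.5),
    make_component("ilr_c", 2.2),
]
best = executive("airspace conflict", components)
print(best.source, best.cost)  # prints: ilr_b 1.5
```

A real executive would do more than pick a minimum, for example deciding which component to invoke on which subproblem; the point here is only the division of labor between independent proposers and a coordinator.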

Read the paper:

An Ensemble Learning and Problem Solving Architecture for Airspace Management

by XS Zhang et al.

International Conference on Innovative Applications of Artificial Intelligence (IAAI-09), Pasadena, CA, July 2009
www.cc.gatech.edu/faculty/ashwin/papers/er-09-03.pdf

Goal-Driven Learning in the GILA Integrated Intelligence Architecture

Goal-Driven Learning (GDL) focuses on systems that determine by themselves what has to be learned and how to learn it. Typically, GDL systems use meta-reasoning capabilities over a base reasoner to identify learning goals and devise strategies. In this paper we present a novel GDL technique to deal with complex AI systems where the meta-reasoning module has to analyze the reasoning trace of multiple components with potentially different learning paradigms. Our approach works by distributing the generation of learning strategies among the different modules instead of centralizing it in the meta-reasoner. We implemented our technique in the GILA system, which works in the airspace task orders domain, and show an increase in performance.

Read the paper:

Goal-Driven Learning in the GILA Integrated Intelligence Architecture

by Jai Radhakrishnan, Santi Ontañón, Ashwin Ram

International Joint Conference on Artificial Intelligence (IJCAI-09), Pasadena, CA, July 2009
www.cc.gatech.edu/faculty/ashwin/papers/er-09-02.pdf

Learning and Joint Deliberation through Argumentation in Multi-Agent Systems

We present an argumentation framework for learning agents (AMAL) designed for two purposes: (1) for joint deliberation, and (2) for learning from communication. The AMAL framework is completely based on learning from examples: the argument preference relation, the argument generation policy, and the counterargument generation policy are case-based techniques.

For joint deliberation, learning agents share their experience by forming a committee to decide upon some joint decision. We experimentally show that argumentation among committees of agents improves both individual and joint performance. For learning from communication, an agent engages in argumentation with other agents in order to contrast its individual hypotheses and receive counterexamples; the argumentation process improves their learning scope and individual performance.
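For context, the simplest way a committee of agents can reach a joint decision is plain majority voting over their individual predictions. The sketch below shows only that baseline; it is not the AMAL argumentation protocol, and the function name `committee_decision` and the example labels are hypothetical.

```python
from collections import Counter
from typing import List


def committee_decision(predictions: List[str]) -> str:
    """Return the label predicted by the most agents.

    Ties are broken in favor of the label that was seen first.
    """
    if not predictions:
        raise ValueError("a committee needs at least one prediction")
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]


print(committee_decision(["approve", "reject", "approve"]))  # prints: approve
```

Voting treats every agent's prediction as equally credible. An argumentation-based approach instead lets agents challenge each other's predictions before the joint decision is made.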

Read the paper:

Learning and Joint Deliberation through Argumentation in Multi-Agent Systems

by Santi Ontañón and Enric Plaza

in Autonomous Agents and Multi-Agent Systems (AAMAS 2007), pp. 971-978
www.cc.gatech.edu/faculty/ashwin/papers/er-07-19.pdf

Case-Based Learning from Proactive Communication

We present a proactive communication approach that allows CBR agents to gauge the strengths and weaknesses of other CBR agents. The communication protocol allows CBR agents to learn from communicating with other CBR agents in such a way that each agent is able to retain certain cases, provided by other agents, that can improve its individual performance (without needing to disclose all the contents of each case base). The selection and retention of cases is modeled as a case bartering process, where each individual CBR agent autonomously decides which cases to offer for bartering and which offered barters to accept. Experimental evaluations show that the sum of all these individual decisions results in a clear improvement in individual CBR agent performance with only a moderate increase in the size of individual case bases.

Read the paper:

Case-Based Learning from Proactive Communication

by Santi Ontañón and Enric Plaza

International Joint Conference on Artificial Intelligence (IJCAI 2007), pp. 999-1004
www.cc.gatech.edu/faculty/ashwin/papers/er-07-18.pdf

Introspective Multistrategy Learning: On the Construction of Learning Strategies

A central problem in multistrategy learning systems is the selection and sequencing of machine learning algorithms for particular situations. This is typically done by the system designer who analyzes the learning task and implements the appropriate algorithm or sequence of algorithms for that task. We propose a solution to this problem which enables an AI system with a library of machine learning algorithms to select and sequence appropriate algorithms autonomously. Furthermore, instead of relying on the system designer or user to provide a learning goal or target concept to the learning system, our method enables the system to determine its learning goals based on analysis of its successes and failures at the performance task.

The method involves three steps: Given a performance failure, the learner examines a trace of its reasoning prior to the failure to diagnose what went wrong (blame assignment); given the resultant explanation of the reasoning failure, the learner posts explicitly represented learning goals to change its background knowledge (deciding what to learn); and given a set of learning goals, the learner uses nonlinear planning techniques to assemble a sequence of machine learning algorithms, represented as planning operators, to achieve the learning goals (learning-strategy construction). In support of these operations, we define the types of reasoning failures, a taxonomy of failure causes, a second-order formalism to represent reasoning traces, a taxonomy of learning goals that specify desired change to the background knowledge of a system, and a declarative task-formalism representation of learning algorithms.
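The third step, treating learning algorithms as planning operators, can be illustrated with a very small sketch. It assumes each operator declares the facts it needs and the facts it adds, and it uses greedy forward chaining rather than the nonlinear planning the paper describes. The operator names and facts are hypothetical, not taken from Meta-AQUA.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Operator(NamedTuple):
    name: str
    needs: FrozenSet[str]  # facts that must hold before the operator runs
    adds: FrozenSet[str]   # facts that hold after the operator runs


def order_operators(goals: Set[str], ops: List[Operator], start: Set[str]) -> List[str]:
    """Greedy forward chaining: apply runnable operators until the goals hold."""
    state = set(start)
    remaining = list(ops)
    plan: List[str] = []
    while not goals <= state:
        # An operator is runnable if its needs are met and it adds something new.
        runnable = [op for op in remaining if op.needs <= state and not op.adds <= state]
        if not runnable:
            raise ValueError("no ordering achieves the learning goals")
        op = runnable[0]
        state |= op.adds
        plan.append(op.name)
        remaining.remove(op)
    return plan


ops = [
    Operator("index_explanation", frozenset({"concept_generalized"}), frozenset({"explanation_indexed"})),
    Operator("generalize_concept", frozenset(), frozenset({"concept_generalized"})),
]
plan = order_operators({"explanation_indexed"}, ops, set())
print(plan)  # prints: ['generalize_concept', 'index_explanation']
```

Even in this toy, the ordering is forced by the declared dependencies rather than by the order in which the operators happen to be listed. A greedy search like this one may also run operators that are irrelevant to the goals, which a proper planner would avoid.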

We present the Meta-AQUA system, an implemented multistrategy learner that operates in the domain of story understanding. Extensive empirical evaluations of Meta-AQUA show that it performs significantly better in a deliberative, planful mode than in a reflexive mode in which learning goals are ablated and, furthermore, that the arbitrary ordering of learning algorithms can lead to worse performance than no learning at all. We conclude that explicit representation and sequencing of learning goals is necessary for avoiding negative interactions between learning algorithms that can lead to less effective learning.

Read the paper:

Introspective Multistrategy Learning: On the Construction of Learning Strategies

by Mike Cox, Ashwin Ram

Artificial Intelligence, 112:1-55, 1999
www.cc.gatech.edu/faculty/ashwin/papers/er-99-01.pdf

Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces

A key element in the solution of reinforcement learning problems is the value function. The purpose of this function is to measure the long-term utility or value of any given state. The function is important because an agent can use this measure to decide what to do next. A common problem in reinforcement learning when applied to systems having continuous state and action spaces is that the value function must operate with a domain consisting of real-valued variables, which means that it should be able to represent the value of infinitely many state and action pairs. For this reason, function approximators are used to represent the value function when a closed-form solution of the optimal policy is not available.
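For reference, the usual textbook definitions of the state-value and action-value functions under a policy \pi are given below. The notation is an assumption, not taken from the paper: r_{t+1} is the reward received after step t, and \gamma is a discount factor with 0 \le \gamma < 1.

```latex
V^{\pi}(s) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} r_{t+1} \,\middle|\, s_0 = s \right],
\qquad
Q^{\pi}(s,a) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} r_{t+1} \,\middle|\, s_0 = s,\ a_0 = a \right]
```

When s and a are real-valued, Q^{\pi} cannot be stored as a table with one entry per pair, which is why an approximator is needed.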

In this paper, we extend a previously proposed reinforcement learning algorithm so that it can be used with function approximators that generalize the value of individual experiences across both state and action spaces. In particular, we discuss the benefits of using sparse coarse-coded function approximators to represent value functions and describe in detail three implementations: CMAC, instance-based, and case-based. Additionally, we discuss how function approximators having different degrees of resolution in different regions of the state and action spaces may influence the performance and learning efficiency of the agent.
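To make sparse coarse coding concrete, here is a minimal CMAC-style tile-coding sketch over a single scalar input. It is an illustration under simplifying assumptions, not the paper's implementation: the paper works over joint state and action spaces, while this toy uses one dimension with uniform tiles, and the names `active_tiles` and `TileCodedValue` are hypothetical.

```python
from typing import Dict, List, Tuple


def active_tiles(x: float, lo: float, hi: float, n_tiles: int, n_tilings: int) -> List[Tuple[int, int]]:
    """Return one (tiling, tile) index per tiling for a scalar x in [lo, hi]."""
    width = (hi - lo) / n_tiles
    tiles = []
    for t in range(n_tilings):
        offset = t * width / n_tilings   # each tiling is shifted by a fraction of a tile
        idx = int((x - lo + offset) / width)
        idx = min(idx, n_tiles)          # shifted tilings may need one extra tile at the top edge
        tiles.append((t, idx))
    return tiles


class TileCodedValue:
    """Value estimate stored as one weight per tile, summed over the active tiles."""

    def __init__(self, lo: float, hi: float, n_tiles: int = 8, n_tilings: int = 4, step: float = 0.5):
        self.lo, self.hi = lo, hi
        self.n_tiles, self.n_tilings, self.step = n_tiles, n_tilings, step
        self.weights: Dict[Tuple[int, int], float] = {}

    def _tiles(self, x: float) -> List[Tuple[int, int]]:
        return active_tiles(x, self.lo, self.hi, self.n_tiles, self.n_tilings)

    def predict(self, x: float) -> float:
        return sum(self.weights.get(k, 0.0) for k in self._tiles(x))

    def update(self, x: float, target: float) -> None:
        """Move the estimate at x toward target, spreading the change over the active tiles."""
        error = target - self.predict(x)
        for k in self._tiles(x):
            self.weights[k] = self.weights.get(k, 0.0) + self.step / self.n_tilings * error


v = TileCodedValue(0.0, 1.0)
v.update(0.5, 1.0)
print(v.predict(0.5), v.predict(0.6), v.predict(0.9))  # prints: 0.5 0.125 0.0
```

A single update at 0.5 also raises the estimate at 0.6, which shares one tile with it, and leaves 0.9 untouched. That graded sharing between nearby inputs is the generalization the paragraph refers to.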

We propose a simple and modular technique that can be used to implement function approximators with non-uniform degrees of resolution so that they can represent the value function with higher accuracy in important regions of the state and action spaces. We performed extensive experiments in the double integrator and pendulum swing-up systems to demonstrate the proposed ideas.

Read the paper:

Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces

by Juan Santamaria, Rich Sutton, Ashwin Ram

Adaptive Behavior, 6(2):163-217, 1997
www.cc.gatech.edu/faculty/ashwin/papers/er-98-02.pdf