An autonomous agent is an intelligent system that has an ongoing interaction with a dynamic external world. It can perceive and act on the world through a limited set of sensors and effectors. Its most important characteristic is that it is forced to make decisions sequentially, one after another, during its entire “life”. The main objective of this dissertation is to study algorithms by which autonomous agents can learn, using their own experience, to perform sequential decision-making efficiently and autonomously. The dissertation describes a framework for studying autonomous sequential decision-making consisting of three main elements: the agent, the environment, and the task. The agent attempts to control the environment by perceiving it and choosing actions in a sequential fashion. The environment is a dynamic system characterized by a state and its dynamics, a function that describes the evolution of the state given the agent’s actions. A task is a declarative description of the desired behavior the agent should exhibit as it interacts with the environment. The ultimate goal of the agent is to learn a policy, or strategy for selecting actions, that maximizes its expected benefit as defined by the task.
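As a rough illustration of the three elements named above, the following Python sketch runs a policy against a toy environment and scores it with a task function. Every name here (`dynamics`, `task_reward`, `run_episode`, the two policies) is hypothetical, and both the one-dimensional integrator and the discounted sum used as the "benefit" are assumptions made for the example, not the dissertation's formulation.

```python
from typing import Callable

# Illustrative assumption: states and actions are single real numbers.
State = float
Action = float


def dynamics(state: State, action: Action) -> State:
    """Environment: evolution of the state given the agent's action (toy integrator)."""
    return state + 0.1 * action


def task_reward(state: State, action: Action) -> float:
    """Task: scores the desired behavior (stay near 0 while using little effort)."""
    return -(state ** 2) - 0.01 * (action ** 2)


def run_episode(policy: Callable[[State], Action], state: State,
                steps: int, discount: float = 0.95) -> float:
    """Agent: choose actions sequentially and accumulate the discounted benefit."""
    total, weight = 0.0, 1.0
    for _ in range(steps):
        action = policy(state)               # decide from the current state
        total += weight * task_reward(state, action)
        state = dynamics(state, action)      # the environment moves on
        weight *= discount
    return total


def proportional_policy(state: State) -> Action:
    """Push the state back toward zero."""
    return -2.0 * state


def idle_policy(state: State) -> Action:
    """Do nothing."""
    return 0.0
```

Under these assumptions, a policy that steers the state toward zero accumulates a higher benefit than one that never acts, which is the sense in which one policy is "better" for a given task.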
The dissertation focuses on sequential decision-making when the environment is characterized by continuous states and actions, and the agent has imperfect perception, incomplete knowledge, and limited computational resources. The main characteristic of the approach proposed in this dissertation is that the agent uses its previous experiences to improve estimates of the long-term benefit associated with the execution of specific actions. The agent uses these estimates to evaluate how desirable it is to execute alternative actions and to select the one that best balances the short- and long-term consequences, giving special consideration to the expected benefit of actions that accomplish new learning while making progress on the task.
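One way to picture experience-based estimates over continuous states and actions is a memory of past cases queried by similarity. The sketch below is only that: the class `CaseMemory`, the function `select_action`, the Gaussian distance weighting, and the distance-based novelty bonus are all assumptions chosen for illustration, and the dissertation's actual methods may differ.

```python
import math


class CaseMemory:
    """Hypothetical store of (state, action, value) cases with scalar states and actions."""

    def __init__(self, k: int = 3, bandwidth: float = 0.5, max_novelty: float = 1.0):
        self.cases = []                 # list of (state, action, value) tuples
        self.k = k                      # number of nearest cases consulted
        self.bandwidth = bandwidth      # width of the similarity kernel
        self.max_novelty = max_novelty  # cap on the novelty score

    def add(self, state: float, action: float, value: float) -> None:
        self.cases.append((state, action, value))

    def _nearest(self, state: float, action: float):
        """Return up to k (distance, value) pairs, closest first."""
        scored = sorted((math.hypot(s - state, a - action), v)
                        for s, a, v in self.cases)
        return scored[: self.k]

    def estimate(self, state: float, action: float, default: float = 0.0) -> float:
        """Similarity-weighted average of the values stored in nearby cases."""
        near = self._nearest(state, action)
        weights = [math.exp(-(d / self.bandwidth) ** 2) for d, _ in near]
        total = sum(weights)
        if total == 0.0:                # no cases, or all too far away
            return default
        return sum(w * v for w, (_, v) in zip(weights, near)) / total

    def novelty(self, state: float, action: float) -> float:
        """Capped distance to the closest stored case; larger means less explored."""
        near = self._nearest(state, action)
        if not near:
            return self.max_novelty
        return min(near[0][0], self.max_novelty)


def select_action(memory: CaseMemory, state: float, candidates, bonus: float = 0.1):
    """Pick the candidate with the best estimated benefit plus a novelty bonus."""
    return max(candidates,
               key=lambda a: memory.estimate(state, a) + bonus * memory.novelty(state, a))
```

With `bonus` set to zero the choice is purely greedy on the stored estimates; a larger `bonus` favors actions far from any stored case, a crude stand-in for valuing actions that produce new learning.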
The approach is based on novel methods specifically designed to address the problems associated with continuous domains, imperfect perception, incomplete knowledge, and limited computational resources. It is implemented using case-based techniques and evaluated extensively in simulated and real systems, including autonomous mobile robots, pendulum swinging and balancing controllers, and other non-linear dynamic system controllers.