Scalable, MDP-based planning for multiple, cooperating agents

Joshua D. Redding*, N. Kemal Ure, Jonathan P. How, Matthew A. Vavrina, John Vian

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

10 Citations (Scopus)

Abstract

This paper introduces an approximation algorithm for stochastic multi-agent planning based on Markov decision processes (MDPs). Specifically, we focus on a decentralized approach for planning the actions of a team of cooperating agents with uncertainties in fuel consumption and health-related models. The core idea behind the algorithm is to allow each agent to approximate the representation of its teammates. Each agent therefore maintains its own planner that fully enumerates its local states and actions while approximating those of its teammates. In prior work, the authors approximated each teammate individually, which greatly reduced the planning space but left the computational cost exponential (in n − 1 rather than in n, where n is the number of agents). This paper extends the approach and presents a new approximation that aggregates all teammates into a single, abstracted entity. Under the persistent search & track mission scenario with 3 agents, we show that while the resulting performance is nearly 20% lower than that of the centralized optimal solution, the problem size becomes linear in n, a very attractive feature when planning online for large multi-agent teams.
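The scaling argument in the abstract can be illustrated with a rough count of planner states. The sketch below is not taken from the paper: the function names, the per-agent state counts, and the assumption that the aggregated teammate model has a fixed size independent of n are all hypothetical, chosen only to show how exponential growth in n, exponential growth in n − 1, and linear growth in n differ.

```python
# Illustrative state counts only; every number and name here is an assumption,
# not a figure or formula reported by the authors.

def centralized_states(local_states: int, n_agents: int) -> int:
    """One planner enumerating every agent fully: exponential in n."""
    return local_states ** n_agents


def per_teammate_states(local_states: int, reduced_states: int, n_agents: int) -> int:
    """One agent's planner with a full local model and a separate reduced
    model for each of its n - 1 teammates: exponential in n - 1."""
    return local_states * reduced_states ** (n_agents - 1)


def aggregated_team_states(local_states: int, aggregate_states: int, n_agents: int) -> int:
    """All n planners together, each pairing its full local model with a
    single aggregated teammate model (assumed fixed size): linear in n."""
    return n_agents * local_states * aggregate_states


if __name__ == "__main__":
    # Hypothetical sizes: 10 local states, 2 states per reduced teammate,
    # 4 states for the aggregated teammate entity.
    for n in (3, 6, 9):
        print(
            n,
            centralized_states(10, n),
            per_teammate_states(10, 2, n),
            aggregated_team_states(10, 4, n),
        )
```

Under these assumed sizes, doubling the team doubles the aggregated count, while the other two counts grow multiplicatively with each added agent.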

Original language: English
Title of host publication: 2012 American Control Conference, ACC 2012
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6011-6016
Number of pages: 6
ISBN (Print): 9781457710957
DOIs
Publication status: Published - 2012
Externally published: Yes
Event: 2012 American Control Conference, ACC 2012 - Montreal, QC, Canada
Duration: 27 Jun 2012 to 29 Jun 2012

Publication series

Name: Proceedings of the American Control Conference
ISSN (Print): 0743-1619

Conference

Conference: 2012 American Control Conference, ACC 2012
Country/Territory: Canada
City: Montreal, QC
Period: 27/06/12 to 29/06/12
