TY - GEN
T1 - Scalable, MDP-based planning for multiple, cooperating agents
AU - Redding, Joshua D.
AU - Ure, N. Kemal
AU - How, Jonathan P.
AU - Vavrina, Matthew A.
AU - Vian, John
PY - 2012
Y1 - 2012
N2 - This paper introduces an approximation algorithm for stochastic multi-agent planning based on Markov decision processes (MDPs). Specifically, we focus on a decentralized approach for planning the actions of a team of cooperating agents with uncertainties in fuel consumption and health-related models. The core idea behind the algorithm presented in this paper is to allow each agent to approximate the representation of its teammates. Each agent therefore maintains its own planner that fully enumerates its local states and actions while approximating those of its teammates. In prior work, the authors approximated each teammate individually, which resulted in a large reduction of the planning space, but remained exponential (in n - 1 rather than in n, where n is the number of agents) in computational scalability. This paper extends the approach and presents a new approximation that aggregates all teammates into a single, abstracted entity. Under the persistent search & track mission scenario with 3 agents, we show that while resulting performance is decreased by nearly 20% compared with the centralized optimal solution, the problem size becomes linear in n, a very attractive feature when planning online for large multi-agent teams.
AB - This paper introduces an approximation algorithm for stochastic multi-agent planning based on Markov decision processes (MDPs). Specifically, we focus on a decentralized approach for planning the actions of a team of cooperating agents with uncertainties in fuel consumption and health-related models. The core idea behind the algorithm presented in this paper is to allow each agent to approximate the representation of its teammates. Each agent therefore maintains its own planner that fully enumerates its local states and actions while approximating those of its teammates. In prior work, the authors approximated each teammate individually, which resulted in a large reduction of the planning space, but remained exponential (in n - 1 rather than in n, where n is the number of agents) in computational scalability. This paper extends the approach and presents a new approximation that aggregates all teammates into a single, abstracted entity. Under the persistent search & track mission scenario with 3 agents, we show that while resulting performance is decreased by nearly 20% compared with the centralized optimal solution, the problem size becomes linear in n, a very attractive feature when planning online for large multi-agent teams.
UR - http://www.scopus.com/inward/record.url?scp=84869389036&partnerID=8YFLogxK
U2 - 10.1109/acc.2012.6315482
DO - 10.1109/acc.2012.6315482
M3 - Conference contribution
AN - SCOPUS:84869389036
SN - 9781457710957
T3 - Proceedings of the American Control Conference
SP - 6011
EP - 6016
BT - 2012 American Control Conference, ACC 2012
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2012 American Control Conference, ACC 2012
Y2 - 27 June 2012 through 29 June 2012
ER -