Swarm Intelligence in Cooperative Environments: n-Step Dynamic Tree Search Algorithm Overview

Marc Espinós Longa, Antonios Tsourdos, Gokhan Inalhan

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Reinforcement learning tree-based planning methods have been gaining popularity in the last few years due to their success in single-agent domains where a perfect simulator model is available, for example the strategic board games Go and chess. This paper aims to extend tree search algorithms to the multiagent setting in a decentralized structure, dealing with scalability issues and the exponential growth of computational resources. The n-step dynamic tree search combines forward planning and direct temporal-difference updates, markedly outperforming conventional tabular algorithms such as Q-learning and state-action-reward-state-action (SARSA). Future state transitions and rewards are predicted with a model built and learned from real interactions between agents and the environment. This paper analyzes the developed algorithm in the hunter–pursuit cooperative game against stochastic and intelligent evaders. The n-step dynamic tree search aims to adapt single-agent tree search learning methods to the multiagent setting and is demonstrated to be a remarkable advance over conventional temporal-difference techniques.
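For readers unfamiliar with the tabular baselines named in the abstract, the following is a minimal sketch of the standard one-step Q-learning and SARSA temporal-difference updates. It is not the paper's n-step dynamic tree search, and the function names, the state and action labels, and the values of `alpha` and `gamma` are assumptions chosen only for illustration.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy TD update: bootstrap from the greedy action in s_next."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: bootstrap from the action actually taken next."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])
    return Q[(s, a)]


# Hypothetical two-action problem; unseen (state, action) pairs default to 0.0.
actions = ["left", "right"]
Q = defaultdict(float)
Q[("s1", "left")] = 1.0
Q[("s1", "right")] = 2.0

# Same reward and next state, but different bootstrap targets.
q_val = q_learning_update(Q, "s0", "left", r=1.0, s_next="s1", actions=actions)
sarsa_val = sarsa_update(Q, "s0", "right", r=1.0, s_next="s1", a_next="left")
```

The two updates differ only in the bootstrap target: Q-learning uses the highest-valued action in the next state, whereas SARSA uses the action the agent actually selects there.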

Original language: English
Pages (from-to): 418-425
Number of pages: 8
Journal: Journal of Aerospace Information Systems
Volume: 20
Issue number: 7
DOIs
Publication status: Published - Jul 2023
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2023 by the American Institute of Aeronautics and Astronautics, Inc. All rights reserved.

Funding

This research is sponsored by the Engineering and Physical Sciences Research Council and BAE Systems under project reference number 2454254.

Funders: Funder number
BAE Systems: 2454254
Engineering and Physical Sciences Research Council
