Reduction of a POMDP to an MDP

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

A partially observable Markov decision process (POMDP) is an appropriate mathematical modeling tool for dynamic stochastic systems where portions or all of the system states are not completely observable to the decision maker. In this respect, POMDPs generalize completely observable Markov decision processes (MDPs) by allowing infinitely many states to address partial observability. However, the resulting models suffer tremendously from computational intractability even for relatively small problems. Therefore, POMDPs are frequently approximated by solving variants of completely observable MDPs defined on a grid of finite states. This article summarizes the relationships between completely and partially observable MDPs and derives inequalities for the POMDP value function using the optimal value function of the grid-based MDPs.
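
As a minimal sketch of the reduction summarized above, the following Python code shows the standard Bayesian belief update that turns a POMDP into a completely observable MDP whose states are probability distributions (beliefs) over the hidden states. This is the textbook construction, not necessarily the notation used in the chapter. The function name `belief_update`, the array layouts `T[a][s][s2]` and `O[a][s2][o]`, and the two-state example model are all assumptions made for illustration.

```python
def belief_update(belief, T, O, action, observation):
    """Return the posterior belief after taking `action` and seeing `observation`.

    Assumed (hypothetical) layouts:
      T[a][s][s2] = P(next state s2 | state s, action a)
      O[a][s2][o] = P(observation o | next state s2, action a)
    """
    n = len(belief)
    # Prediction step: push the current belief through the transition model.
    predicted = [
        sum(belief[s] * T[action][s][s2] for s in range(n))
        for s2 in range(n)
    ]
    # Correction step: weight each state by the likelihood of the observation.
    unnormalized = [predicted[s2] * O[action][s2][observation] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    # Normalize so that the new belief is a probability distribution.
    return [u / total for u in unnormalized]


# Hypothetical model: two hidden states, one action, two observations.
T = [[[0.9, 0.1],
      [0.2, 0.8]]]
O = [[[0.8, 0.2],
      [0.3, 0.7]]]

b0 = [0.5, 0.5]
b1 = belief_update(b0, T, O, action=0, observation=0)
```

Because the belief is a point in a continuous simplex, the resulting MDP has infinitely many states. A grid-based approximation would restrict attention to a finite set of such belief vectors and map each updated belief back onto that set.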

Original language: English
Title of host publication: Wiley Encyclopedia of Operations Research and Management Science
Publisher: Wiley
Pages: 1-9
Number of pages: 9
ISBN (Electronic): 9780470400531
ISBN (Print): 9780470400630
Publication status: Published - 1 Jan 2010
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2010 John Wiley & Sons, Inc. All rights reserved.

Keywords

  • grid-based approximation
  • linear programming
  • MDP
  • POMDP
