Abstract
A partially observable Markov decision process (POMDP) is an appropriate mathematical modeling tool for dynamic stochastic systems where portions or all of the system states are not completely observable to the decision maker. In this respect, POMDPs generalize completely observable Markov decision processes (MDPs) by allowing infinitely many states to address partial observability. However, the resulting models suffer tremendously from computational intractability even for relatively small problems. Therefore, POMDPs are frequently approximated by solving variants of completely observable MDPs defined on a grid of finite states. This article summarizes the relationships between completely and partially observable MDPs and derives inequalities for the POMDP value function using the optimal value function of the grid-based MDPs.
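The reduction mentioned in the abstract is commonly understood through belief states: the decision maker tracks a probability distribution over the hidden states, and that distribution evolves as a completely observable (but continuous-state) Markov process. The sketch below shows the textbook Bayes update for such a belief. It is not taken from the article; the function name `belief_update`, the nested-list layout of `T` and `O`, and the two-state numbers are all illustrative assumptions.

```python
def belief_update(belief, action, observation, T, O):
    """Bayes update of a belief over hidden states (illustrative sketch).

    Assumed layout (not from the article):
      T[a][s][s2] = P(s2 | s, a)   transition probabilities
      O[a][s2][o] = P(o | s2, a)   observation probabilities
    """
    n = len(belief)
    # Unnormalized posterior: predict with T, then weight by the likelihood of o.
    unnormalized = [
        O[action][s2][observation]
        * sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnormalized)  # equals P(observation | belief, action)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]


# Hypothetical two-state example: the hidden state never changes, and the
# observation matches the true state with probability 0.85.
T = [[[1.0, 0.0], [0.0, 1.0]]]
O = [[[0.85, 0.15], [0.15, 0.85]]]
b0 = [0.5, 0.5]
b1 = belief_update(b0, action=0, observation=0, T=T, O=O)
```

Grid-based approximations of the kind the abstract describes restrict attention to a finite set of such belief vectors, which is what makes the resulting MDP finite.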
| Original language | English |
|---|---|
| Title of host publication | Wiley Encyclopedia of Operations Research and Management Science |
| Publisher | Wiley |
| Pages | 1-9 |
| Number of pages | 9 |
| ISBN (Electronic) | 9780470400531 |
| ISBN (Print) | 9780470400630 |
| DOIs | |
| Publication status | Published - 1 Jan 2010 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2010 John Wiley & Sons, Inc. All rights reserved.
Fingerprint
Dive into the research topics of 'Reduction of a POMDP to an MDP'. Together they form a unique fingerprint.