Partially Observed Markov Decision Processes, 2nd Revised edition
Published. Delivery in 3-5 weeks.
Title: Partially Observed Markov Decision Processes, 2nd Revised edition
Subtitle: Filtering, Learning and Controlled Sensing
Author: Krishnamurthy, Vikram (Cornell University, New York)
Publisher: Cambridge University Press
ISBN: 9781009449434
Cover: HARDCOVER
Date: June 2025

DESCRIPTION
This survey of formulation, algorithms and structural results for POMDPs focuses on fundamental concepts and their links to real-world applications in controlled sensing, keeping technical machinery to a minimum. The new edition covers inverse reinforcement learning, nonparametric Bayesian inference, variational Bayes and conformal prediction.

Covering formulation, algorithms and structural results and linking theory to real-world applications in controlled sensing (including social learning, adaptive radars and sequential detection), this book focuses on the conceptual foundations of partially observed Markov decision processes (POMDPs). It emphasizes structural results in stochastic dynamic programming, enabling graduate students and researchers in engineering, operations research, and economics to understand the underlying unifying themes without getting weighed down by mathematical technicalities. In light of major advances in machine learning over the past decade, this edition includes a new Part V on inverse reinforcement learning as well as a new chapter on non-parametric Bayesian inference (for Dirichlet processes and Gaussian processes), variational Bayes and conformal prediction.

* Links theory to real-world applications in controlled sensing
* Consolidates results from across the literature of multiple different disciplines into a centralized resource
* Presents the key ideas underpinning Bayesian filtering, POMDPs, reinforcement learning, and inverse reinforcement learning in an accessible way

TABLE OF CONTENTS
Preface to revised edition
Notation
1. Introduction

I. Stochastic Models and Bayesian Filtering
2. Stochastic state space model
3. Optimal filtering
4. Algorithms for maximum likelihood parameter estimation
5. Multi-agent sensing: social learning and data incest
6. Nonparametric Bayesian inference

II. POMDPs: Models and Applications
7. Fully observed Markov decision processes
8. Partially observed Markov decision processes
9. POMDPs in controlled sensing and sensor scheduling

III. POMDP Structural Results
10. Structural results for Markov decision processes
11. Structural results for optimal filters
12. Monotonicity of value function for POMDPs
13. Structural results for stopping-time POMDPs
14. Stopping-time POMDPs for quickest detection
15. Myopic policy bounds for POMDPs and sensitivity to model parameters

IV. Stochastic Gradient Algorithms and Reinforcement Learning
16. Stochastic optimization and gradient estimation
17. Reinforcement learning
18. Stochastic gradient algorithms: convergence analysis
19. Discrete stochastic optimization

V. Inverse Reinforcement Learning
20. Revealed preferences for inverse reinforcement learning
21. Bayesian inverse reinforcement learning

Appendix A. Short primer on stochastic simulation
Appendix B. Continuous-time HMM filters
Appendix C. Discrete-time martingales
Appendix D. Markov processes
Appendix E. Some limit theorems in statistics
Appendix F. Summary of POMDP algorithms
Bibliography
Index