Belief selection in point-based planning algorithms for POMDPs

Current point-based planning algorithms for solving partially observable Markov decision processes (POMDPs) have demonstrated that a good approximation of the value function can be derived by interpolation from the values of a specially selected set of points. The performance of these algorithms can...

وصف كامل

محفوظ في:
التفاصيل البيبلوغرافية
المؤلف الرئيسي: Azar, Danielle (author)
مؤلفون آخرون: Izadi, Masoumeh T. (author), Precup, Doina (author)
التنسيق: conferenceObject
منشور في: 2017
الوصول للمادة أونلاين:http://hdl.handle.net/10725/5374
http://dx.doi.org/10.1007/11766247_33
http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php
https://link.springer.com/chapter/10.1007/11766247_33
الوسوم: إضافة وسم
لا توجد وسوم, كن أول من يضع وسما على هذه التسجيلة!
الوصف
الملخص:Current point-based planning algorithms for solving partially observable Markov decision processes (POMDPs) have demonstrated that a good approximation of the value function can be derived by interpolation from the values of a specially selected set of points. The performance of these algorithms can be improved by eliminating unnecessary backups or concentrating on more important points in the belief simplex. We study three methods designed to improve point-based value iteration algorithms. The first two methods are based on reachability analysis on the POMDP belief space. This approach relies on prioritizing the beliefs based on how they are reached from the given initial belief state. The third approach is motivated by the observation that beliefs which are the most overestimated or underestimated have greater influence on the precision of value function than other beliefs. We present an empirical evaluation illustrating how the performance of point-based value iteration (Pineau et al., 2003) varies with these approaches.