Memory-bounded techniques have shown great promise in solving complex multi-agent planning problems modeled as DEC-POMDPs. Much of the performance gain can be attributed to pruning techniques that alleviate the complexity of the exhaustive backup step of the original MBDP algorithm. Despite these improvements, state-of-the-art algorithms can still handle only a relatively small pool of candidate policies, which limits the quality of the solution in some benchmark problems. We present a new algorithm, Point-Based Policy Generation, which avoids searching the entire joint policy space altogether. The key observation is that the best joint policy for each reachable belief state can be constructed directly, instead of first producing a large set of candidates. We also provide an efficient approximate implementation of this operation. The experimental results show that our solution technique improves performance significantly in terms of both runtime and solution quality.
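To make the cost of the exhaustive backup concrete, the following minimal sketch counts the candidate policy trees it produces. It relies only on the standard counting argument: each new tree picks one root action and one retained subtree per observation. The function names and the problem sizes in the usage note are illustrative assumptions, not taken from the paper.

```python
from math import prod


def exhaustive_backup_size(num_actions, num_observations, num_subtrees):
    """Number of trees one agent obtains from a single exhaustive backup.

    Each new tree picks one root action and, for every observation,
    one of the subtrees kept from the previous step.
    """
    return num_actions * num_subtrees ** num_observations


def joint_candidates(agents, num_subtrees):
    """Number of joint policies to evaluate after the backup.

    `agents` is a list of (num_actions, num_observations) pairs, one per
    agent (a hypothetical encoding chosen for this sketch).
    """
    return prod(
        exhaustive_backup_size(actions, observations, num_subtrees)
        for actions, observations in agents
    )
```

For two hypothetical agents with 3 actions and 2 observations each, keeping 3 subtrees per agent gives 27 trees per agent and 729 joint candidates; with 5 observations each, the joint count rises to 531441. This growth is what a direct, per-belief construction of the best joint policy is meant to sidestep.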
@inproceedings{WZCaamas10,
address = {Toronto, Canada},
author = {Feng Wu and Shlomo Zilberstein and Xiaoping Chen},
booktitle = {Proceedings of the 9th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS)},
month = {May},
pages = {1307--1314},
title = {Point-Based Policy Generation for Decentralized POMDPs},
year = {2010}
}