Rollout Sampling Policy Iteration for Decentralized POMDPs

Feng Wu; Shlomo Zilberstein; Xiaoping Chen

We present decentralized rollout sampling policy iteration (DecRSPI), a new algorithm for multiagent decision problems formalized as DEC-POMDPs. DecRSPI is designed to improve scalability and to tackle problems that lack an explicit model. The algorithm uses Monte Carlo methods to generate a sample of reachable belief states, then computes a joint policy for each belief state based on rollout estimates. A new policy representation allows solutions to be represented compactly. The key benefits of the algorithm are a time complexity that is linear in the number of agents, bounded memory usage, and good solution quality. It can solve larger problems that are intractable for existing planning algorithms. Experimental results confirm the effectiveness and scalability of the approach.
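
As a rough illustration of the rollout idea mentioned in the abstract, the Python sketch below estimates the value of a fixed joint policy by averaging returns sampled from a black-box simulator, so no explicit transition or observation model is needed. It is not the DecRSPI algorithm itself: the names `rollout_value`, `toy_simulate`, and `follow_observation`, and the two-state toy problem, are hypothetical and introduced here only for illustration.

```python
import random


def rollout_value(simulate, policies, initial_state, horizon, num_rollouts, rng):
    """Monte Carlo estimate of a joint policy's expected return.

    `simulate(state, joint_action, rng)` is an assumed generative model that
    returns (next_state, per-agent observations, reward).
    """
    total = 0.0
    for _ in range(num_rollouts):
        state = initial_state
        observations = [None] * len(policies)  # nothing observed before step one
        for _ in range(horizon):
            # Each agent chooses its action from its own local observation only.
            joint_action = tuple(pi(obs) for pi, obs in zip(policies, observations))
            state, observations, reward = simulate(state, joint_action, rng)
            total += reward
    return total / num_rollouts


def toy_simulate(state, joint_action, rng):
    """Hypothetical two-state problem: reward 1 when every agent matches the state."""
    reward = 1.0 if all(a == state for a in joint_action) else 0.0
    next_state = rng.randrange(2)
    # Each agent independently sees the next state correctly with probability 0.8.
    observations = [next_state if rng.random() < 0.8 else 1 - next_state
                    for _ in joint_action]
    return next_state, observations, reward


def follow_observation(obs):
    """Hypothetical local policy: act on the last observation, default to 0."""
    return 0 if obs is None else obs


if __name__ == "__main__":
    rng = random.Random(0)
    policies = [follow_observation, follow_observation]
    print(rollout_value(toy_simulate, policies, 0, horizon=5, num_rollouts=1000, rng=rng))
```

Because the simulator is passed in as a function, each additional agent adds one more local policy call per step, which loosely mirrors the per-agent scaling the abstract describes; the paper's actual policy representation and belief sampling are not reproduced here.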

Feng Wu, Shlomo Zilberstein, Xiaoping Chen. Rollout Sampling Policy Iteration for Decentralized POMDPs. In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI), pages 666-673, Catalina, United States, July 2010.

@inproceedings{WZCuai10,
 address = {Catalina, United States},
 author = {Feng Wu and Shlomo Zilberstein and Xiaoping Chen},
 booktitle = {Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI)},
 month = {July},
 pages = {666--673},
 title = {Rollout Sampling Policy Iteration for Decentralized POMDPs},
 year = {2010}
}

Google Scholar — Cited by 23
Crossref
Engineering Village — Accession Number: 20113914379576
Web of Science