The LP/POMDP Marriage: Optimization with Imperfect Information
Abstract
A new technique is introduced for solving large-scale allocation problems with partially observable states and constrained action and observation resources. The technique uses a master linear program (LP) to determine allocations among a set of control policies, and uses partially observable Markov decision processes (POMDPs) to find improving policies using dual prices from the master LP. The technique is applied to a military problem in which aircraft attack targets in a sequence of stages, with information acquired in one stage used to plan attacks in the next.
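The interplay between the master LP and the POMDP can be illustrated with a generic column-generation sketch. The notation below is assumed for illustration and is not taken from the paper: P is the set of policies generated so far, x_p is the fraction of the allocation given to policy p, v_p is the expected value of policy p, a_{rp} is its expected consumption of action or observation resource r, and b_r is the amount of resource r available.

```latex
% Restricted master LP over the policies generated so far (assumed form)
\begin{align*}
\max_{x \ge 0} \quad & \sum_{p \in P} v_p \, x_p \\
\text{s.t.} \quad & \sum_{p \in P} a_{rp} \, x_p \le b_r
      \quad (\text{dual price } \pi_r \ge 0), \qquad r \in R, \\
& \sum_{p \in P} x_p = 1 \quad (\text{dual price } \mu).
\end{align*}

% Pricing step: a candidate policy p improves the master LP
% when its reduced value is positive
\[
  \bar{v}_p \;=\; v_p \;-\; \sum_{r \in R} \pi_r \, a_{rp} \;-\; \mu \;>\; 0 .
\]
```

Under these assumptions, the POMDP would play the role of the pricing subproblem: each unit of resource r used by an action or observation is charged at its dual price \pi_r, and the POMDP is solved for a policy that maximizes expected value net of those charges. If the resulting policy has positive reduced value it is added to P and the master LP is re-solved; otherwise the current allocation is optimal over all policies.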
Description
Naval Research Logistics, 47(8), 2000, pp. 607-619.