Efficient algorithms for budget-constrained Markov decision processes
Dimitrov, Nedialko B. Dimitrov
Morton, David P.
MetadataShow full item record
Discounted, discrete-time, discrete state-space, discrete action-space Markov decision rocesses (MDPs) form a classical topic in control, game theory, and learning, and as a result are widely applied, increasingly, in very large-scale applications. Many algorithms have been developed to solve large-scale MDPs. Algorithms based on value iteration are particularly popular, as they are more efficient than the generic linear programming approach, by an order of magnitude in the number of states of the MDP. Yet in the case of budget constrained MDPs, no more efficient algorithm than linear programming is known. The theoretically slower running times of linear programming may limit the scalability of constrained MDPs piratically; while, theoretically, it invites the question of whether the increase is somehow intrinsic. In this paper we show that it is not, and provide two algorithms for budget constrained MDPs that are as efficient as value iteration. Denoting the running time of value iteration by VI, and the magnitude of the input by U, for an MDP with mexpected budget constraints our first algorithm runs in time O(poly(m; log U) VI). Given a pre-specified degree of precision, for satisfying the budget constraints, our second algorithm runs in time O(logm poly(log U) 1 2 VI), but may produce solutions that overutilize each of the m budgets by a multiplicative factor of 1 + . In fact, one can substitute value iteration with any algorithm, possibly specially designed for a specific MDP, that solves the MDP quickly to achieve similar theoretical guarantees. Both algorithms restrict attention to constrained infinite-horizon MDPs under discounted costs.
Showing items related by title, author, creator and subject.
A New Approach to Governments' Vendor Selection Decisions: A Three-Stage, Multiattribute Procurement Auction Simon, Jay; Melese, Francois (Monterey, California. Naval Postgraduate School, 2010-09-30); NPS-AM-10-176The current fiscal crisis has placed unprecedented pressure on public procurements. A major target of future public spending cuts is likely to be defense expenditures. Within the defense budget, the biggest and most immediate ...
Ozkan, Ugur (Monterey, California. Naval Postgraduate School, 1990);Several heuristics and algorithms have been developed to find minimal sum-of-products expressions in binary logic. Most of them use prime implicants during minimization process. An efficient search strategy has been developed ...
Israeli, E.; Wood, R.K. (2002);We study the problem of interdicting the arcs in a net- work in order to maximize the shortest s–t path length. “Interdiction” is an attack on an arc that destroys the arc or increases its effective length; there is a ...