Easy affine Markov decision processes
Operations Research, vol.
This paper characterizes the class of decomposable affine Markov decision processes (MDPs), which have continuous multidimensional endogenous states and actions and Markov-modulated exogenous states. MDPs in this class have affine dynamics and single-period rewards, sets of feasible actions that decompose into bounded polytopes, and endogenous state variables that are non-negative or non-positive. It is shown that decomposable affine MDPs with discounted criteria have an affine value function and an affine optimal policy. The affine coefficients of the value function and optimal policy are determined by the solution of auxiliary equations, which themselves resemble the dynamic program of a finite MDP. This result exorcizes the curse of dimensionality for decomposable affine MDPs, which otherwise could be solved only approximately through discretization. The paper also characterizes partially decomposable affine MDPs, which meet only some of the assumptions for decomposable affine MDPs, and shows that they are composites of two smaller MDPs, one of which is a decomposable affine MDP. The applicability of these classes of MDPs is exemplified with models of fishery management, dynamic capacity portfolio management, and commodity procurement.
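To illustrate how coefficient equations of this kind can resemble the dynamic program of a finite MDP, the following Python sketch solves a toy scalar model. The model, its parameter values, and all names (`q_coeff`, `solve_coefficients`, `policy_fraction`) are assumptions made for this example and are not taken from the paper. The endogenous state x is non-negative, the action is a = b·x with b in a bounded interval, and both the reward and the next state are linear in (x, a). Under these assumptions the value function is V(x, i) = v[i]·x, a special case of an affine function with no constant term, and because the objective is linear in b the maximum is attained at an endpoint of the interval.

```python
# Toy model (assumed for illustration, not one of the paper's models):
#   exogenous state i in {0, 1} follows a Markov chain with matrix P
#   endogenous state x >= 0 (scalar); action a = b * x with b in [lo[i], hi[i]]
#   reward:  e[i] * x + c[i] * a
#   next x:  alpha[i] * x + beta[i] * a  (non-negative for every feasible b)
# Under these assumptions V(x, i) = v[i] * x, where v solves a finite recursion.

GAMMA = 0.9                      # discount factor
P = [[0.7, 0.3], [0.4, 0.6]]     # exogenous transition probabilities
e = [1.0, 0.5]                   # reward per unit of state
c = [-0.2, 0.3]                  # reward per unit of action
alpha = [0.5, 0.6]               # state carry-over
beta = [0.4, -0.3]               # effect of the action on the next state
lo = [0.0, 0.0]                  # lower bound on the action fraction b
hi = [1.0, 1.0]                  # upper bound on the action fraction b


def q_coeff(i, b, v):
    """Slope in x of the return from choosing fraction b in exogenous state i."""
    continuation = sum(P[i][j] * v[j] for j in range(len(v)))
    return e[i] + c[i] * b + GAMMA * (alpha[i] + beta[i] * b) * continuation


def solve_coefficients(tol=1e-12, max_iter=10_000):
    """Iterate v[i] <- max over the two endpoints of b until the change is below tol."""
    v = [0.0] * len(P)
    for _ in range(max_iter):
        new_v = [max(q_coeff(i, lo[i], v), q_coeff(i, hi[i], v))
                 for i in range(len(P))]
        if max(abs(n - o) for n, o in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


def policy_fraction(i, v):
    """Optimal fraction b in state i; the optimal action is then a = b * x."""
    return hi[i] if q_coeff(i, hi[i], v) >= q_coeff(i, lo[i], v) else lo[i]


if __name__ == "__main__":
    v = solve_coefficients()
    for i in range(len(P)):
        print(f"state {i}: value slope {v[i]:.4f}, fraction {policy_fraction(i, v)}")
```

The iteration runs over the two exogenous states only, so the continuous state x never has to be discretized. In this toy model convergence relies on GAMMA times the largest growth factor alpha[i] + beta[i]·b being below one, which holds for the assumed parameters.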