Authors: Kartikeya Puranam; Michael Katehakis
Addresses: Department of Business Systems and Analytics, La Salle University, 1900 W Olney Avenue, Philadelphia, PA 19141, USA; Department of Management Science and Information Systems, Rutgers Business School, Newark and New Brunswick, 1 Washington Park, Newark, NJ 07102, USA
Abstract: The standard approach when using a Markov decision process to find an optimal policy is to assume a fixed profit or cost structure. However, in many applied problems it may not be possible to determine the profits or costs associated with all states or actions. In such cases we propose the use of taboo first passage reward and taboo first passage time as objectives. In this paper, we investigate problems related to optimising these taboo measures and we provide two examples.
Keywords: Markov decision processes; taboo probability; reliability; decision analysis; decision science; dynamic programming; taboo reward; inventory; optimisation.
International Journal of Applied Decision Sciences, 2014 Vol.7 No.1, pp.33 - 43
Available online: 14 Nov 2013