Decision Epoch



References

2010

  • (Alagoz et al., 2010) ⇒ Alagoz, O., Hsu, H., Schaefer, A. J., & Roberts, M. S. (2010). Markov decision processes: a tool for sequential decision making under uncertainty. Medical Decision Making, 30(4), 474-483. doi: 10.1177/0272989X09353194
    • QUOTE: The basic definition of a discrete-time MDP contains 5 components, described using a standard notation.[1] For comparison, Table 1 lists the components of an MDP and provides the corresponding structure in a standard Markov process model. [math]\displaystyle{ T = 1, \cdots, N }[/math] are the decision epochs, the set of points in time at which decisions are made (such as days or hours); [math]\displaystyle{ S }[/math] is the state space, the set of all possible values of dynamic information relevant to the decision process; for any state [math]\displaystyle{ s \in S }[/math], [math]\displaystyle{ A_s }[/math] is the action space, the set of possible actions that the decision maker can take at state [math]\displaystyle{ s }[/math]; [math]\displaystyle{ p_t(\cdot|s,a) }[/math] are the transition probabilities, the probabilities that determine the state of the system in the next decision epoch, which are conditional on the state and action at the current decision epoch; and [math]\displaystyle{ r_t(s,a) }[/math] is the reward function, the immediate result of taking action [math]\displaystyle{ a }[/math] at state [math]\displaystyle{ s }[/math]. [math]\displaystyle{ (T, S, A_s, p_t(\cdot|s,a), r_t(s,a)) }[/math] collectively define an MDP.

  1. Sandikci, B., Maillart, L. M., Schaefer, A. J., Alagoz, O., & Roberts, M. S. (2008). Estimating the patient's price of privacy in liver transplantation. Operations Research, 56(6), 1393–1410.
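
As a concrete illustration of the quoted definition, the sketch below encodes the five components [math]\displaystyle{ (T, S, A_s, p_t(\cdot|s,a), r_t(s,a)) }[/math] as a small Python structure. The two-state treatment example, all names, and all probability and reward values are hypothetical, chosen only to echo the medical setting of Alagoz et al. (2010); this is a minimal sketch of the component structure, not the authors' model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A minimal container for the five MDP components from the quote above.
@dataclass
class MDP:
    epochs: List[int]                 # T = 1, ..., N: the decision epochs
    states: List[str]                 # S: the state space
    actions: Callable[[str], List[str]]                       # A_s: actions available in state s
    transition: Callable[[int, str, str], Dict[str, float]]   # p_t(.|s, a): next-state distribution
    reward: Callable[[int, str, str], float]                  # r_t(s, a): immediate reward

# Hypothetical two-state, two-action example over N = 3 decision epochs.
def actions(s: str) -> List[str]:
    return ["wait", "treat"]

def transition(t: int, s: str, a: str) -> Dict[str, float]:
    # Probabilities for the state at the next decision epoch,
    # conditional on the current state and action (values are made up).
    if a == "treat":
        return {"healthy": 0.8, "sick": 0.2}
    return {"healthy": 0.4, "sick": 0.6} if s == "sick" else {"healthy": 0.9, "sick": 0.1}

def reward(t: int, s: str, a: str) -> float:
    # Immediate result of taking action a in state s at epoch t:
    # a payoff for being healthy, minus a hypothetical treatment cost.
    return (1.0 if s == "healthy" else 0.0) - (0.3 if a == "treat" else 0.0)

mdp = MDP(epochs=[1, 2, 3], states=["healthy", "sick"],
          actions=actions, transition=transition, reward=reward)

# Example: the components evaluated at epoch t = 1, state "sick", action "treat".
print(mdp.transition(1, "sick", "treat"))  # {'healthy': 0.8, 'sick': 0.2}
print(mdp.reward(1, "sick", "treat"))      # -0.3
```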