Sets
• State: the set of states S
• Action: the set of actions A(s) associated with state s ∈ S.
• Reward: the set of rewards R(s, a).
Probability distributions
• State transition probability:
at state s, taking action a, the probability of
transitioning to state s' is p(s' | s, a).
• Reward probability:
at state s, taking action a, the probability of getting reward r is p(r | s, a).
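As a minimal sketch, the sets and the two conditional distributions can be stored as plain dictionaries. The two-state example here (states "s1" and "s2", actions "stay" and "move", and all the numbers) is a hypothetical illustration, not part of the source. The helper names `is_distribution` and `check_model` are likewise assumptions. The check verifies that, for every state s and every action a in A(s), both the transition and reward probabilities are non-negative and sum to 1.

```python
# Hypothetical two-state example; every name and number below is an
# illustrative assumption, not taken from the source material.
states = ["s1", "s2"]                                # S
actions = {"s1": ["stay", "move"], "s2": ["stay"]}   # A(s)

# State transition probability: maps (s, a) to {s': probability}
transition = {
    ("s1", "stay"): {"s1": 1.0},
    ("s1", "move"): {"s1": 0.2, "s2": 0.8},
    ("s2", "stay"): {"s2": 1.0},
}

# Reward probability: maps (s, a) to {r: probability}
reward = {
    ("s1", "stay"): {0: 1.0},
    ("s1", "move"): {1: 0.8, -1: 0.2},
    ("s2", "stay"): {0: 1.0},
}


def is_distribution(probs, tol=1e-9):
    """Return True if all values are non-negative and sum to 1 (within tol)."""
    non_negative = all(p >= 0 for p in probs.values())
    return non_negative and abs(sum(probs.values()) - 1.0) < tol


def check_model(states, actions, transition, reward):
    """Check both conditional distributions for every valid (s, a) pair."""
    return all(
        is_distribution(transition[(s, a)]) and is_distribution(reward[(s, a)])
        for s in states
        for a in actions[s]
    )
```

Keying the dictionaries by the pair (s, a) mirrors the conditioning in the two probabilities: each entry is one distribution over next states or over rewards.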
Policy
at state s, the probability of choosing action a is π(a | s).
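A policy of this kind can be sketched as a table of per-state action probabilities from which actions are sampled. The states, actions, probabilities, and the helper name `sample_action` below are hypothetical choices for illustration only.

```python
import random

# Hypothetical stochastic policy: maps each state s to {a: probability}.
# The numbers are illustrative assumptions, not taken from the source.
policy = {
    "s1": {"stay": 0.3, "move": 0.7},
    "s2": {"stay": 1.0},
}


def sample_action(policy, state, rng=random):
    """Draw one action at `state` according to the policy's probabilities."""
    candidates = list(policy[state].keys())
    weights = list(policy[state].values())
    # random.choices draws with replacement using the given weights
    return rng.choices(candidates, weights=weights, k=1)[0]
```

For example, `sample_action(policy, "s2")` always returns "stay" because that action has probability 1, while repeated calls at "s1" return "move" about 70% of the time. A deterministic policy is the special case in which each state puts probability 1 on a single action.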