The answer is that MDPs are designed to do stochastic control.
POMDPs are designed to deal with partial observability.
Reinforcement learning deals with an unknown environment,
and heuristic functions with A* search, along with Monte Carlo techniques,
are used to deal with computational limitations.
Monte Carlo techniques give us an approximation.
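
To make the approximation point concrete, here is a minimal sketch in Python. The setup is an assumption for illustration and is not from the lecture: a single noisy action that pays +1 with probability 0.8 and -1 otherwise. The function name `monte_carlo_expected_reward` and its parameters are hypothetical. The sampled average lands close to the exact expected value, but only approximately.

```python
import random


def monte_carlo_expected_reward(n_samples, p_success=0.8, rng=None):
    """Estimate the expected reward of one noisy action by sampling.

    Hypothetical setup: reward is +1 with probability p_success, else -1.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    total = 0.0
    for _ in range(n_samples):
        total += 1.0 if rng.random() < p_success else -1.0
    return total / n_samples


# Exact expected reward, computed directly from the probabilities.
exact = 0.8 * 1.0 + 0.2 * -1.0

# Sampled estimate: close to the exact value, but not identical in general.
estimate = monte_carlo_expected_reward(100_000)

print("exact:", exact)
print("estimate:", estimate)
```

Drawing more samples tightens the estimate, which is the trade the transcript describes: less computation in exchange for an answer that is only approximately right.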
The heuristic function, if we use the right one (an admissible one), still gives us the right answer,
while reducing the amount of computation.
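
The claim about the heuristic can be sketched as follows. This is an illustrative Python example under stated assumptions, not code from the lecture: a small open grid with unit step costs, a hypothetical `search` function, and Manhattan distance as the admissible heuristic. Running the same search with a zero heuristic gives plain uniform-cost search for comparison.

```python
import heapq


def search(grid, start, goal, heuristic):
    """Best-first search ordered by f = g + h on a 4-connected grid.

    Returns (path_cost, nodes_expanded). Cells marked '#' are walls.
    """
    rows, cols = len(grid), len(grid[0])
    frontier = [(heuristic(start, goal), 0, start)]
    best_cost = {start: 0}
    expanded = 0
    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if cost > best_cost.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        expanded += 1
        if node == goal:
            return cost, expanded
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if grid[nxt[0]][nxt[1]] == "#":
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                f = new_cost + heuristic(nxt, goal)
                heapq.heappush(frontier, (f, new_cost, nxt))
    return None, expanded


def manhattan(a, b):
    # Admissible on this grid: it never overestimates the remaining steps.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def zero(a, b):
    # No heuristic information: the search reduces to uniform-cost search.
    return 0


GRID = ["......."] * 5  # hypothetical 5x7 open grid
START, GOAL = (2, 3), (2, 6)

a_cost, a_expanded = search(GRID, START, GOAL, manhattan)
u_cost, u_expanded = search(GRID, START, GOAL, zero)

print("A* cost:", a_cost, "expanded:", a_expanded)
print("uniform-cost cost:", u_cost, "expanded:", u_expanded)
```

Both runs return the same path cost, but the run with the Manhattan heuristic expands fewer nodes, which is the sense in which the right heuristic keeps the answer correct while cutting the work.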
We don't as yet have any technology that's specifically designed to deal with adversaries.