Halmstad 2006
Dynamic programing and Markov chains

Jean-Philippe CHANCELIER
Version pdf de ce document
Version sans bandeaux

Table des matières

0.1 Optimal stoping time problem: The house or secretary problem

We want to buy a house deciding which one to buy by visiting a fixed sequence of $ N$ houses. The $ N$ houses have a value $ v_k$ ( $ v_1 < v_2 < \ldots < v_N$ ) which are not know in advance. We obtain the value of a current house at the time we visit it and we are then able to compare its value with the previously visited ones. We assume that the probabilistic model is that when visiting a sequence of house we will visit a random permutation of the $ N$ houses with a uniform law on all the random permutations. When we visit the $ k$-th house we can decide to stop and buy the house or to continue it is not possible to come back on a non-selected house. Let $ S_k$ defined as follows :

$\displaystyle S_k = \left\{ \begin{array}{l} 1 \quad\mbox{si}\quad
W_k =k \qua...
...}_{j\in [1,k]} v_{\sigma(j)} \\
0 \quad\mbox{sinon}\quad
\end{array} \right.
$

$ S_n$ is a Markov chain with $ \{0,1\}$ as state space and transition matrices $ M^{(k)}$ are as follows : $ M^{(k)}_{i,1} = 1/(k+1)$ and $ M^{(k)}_{i,0} = k/(k+1)$ (Note that the $ s_k$ are in fact independent).

Our problem is a stoping time problem. We have to decide at which time to stop in order to maximize the probability that we have chosen the best house. We want to compute $ u_1(1)$ and obtain the associated optimal strategy (note that $ S_1 \equiv 1$), where $ u_n(x)$ is given by :

$\displaystyle u_n(x) \equiv \sup_{\tau \;{\cal F}_n t.a., n \leq \tau \leq N}
{\mathbb{E}} \left[ g_\tau ( S^\tau) \vert S_n =x \right].
$

with $ g_k(1) = k/N$ and $ g_k(0)\equiv 0$.

We recall here that $ u_n(x)$ is solution of the following recursive equation :

$\displaystyle u_n(x) = \max \left( \frac{1}{ n+1}u_{n+1}(1)+ \frac{n }{ n+1} u_{n+1}(0), \frac{n }{ N} {\mathbb{I}}_{ x \ne 0}\right) \quad;  u_N(x)= g_N(x)=x$ (1)

    

Question 1   Write a program which computed $ u_n(x)$ for $ n\in[1,N]$ and draw on a graphics the curves $ u_n(1)$, $ u_n(0)$ and the two functions $ g_n(1)$ and $ g_n(0)$.



    

Question 2   We assume here that the values $ v_i = i$. Use grand to obtain a random permutation of $ (v_i)_{i \in [1,N]}$.



    

Question 3   Compute along the trajectory the value of $ u_n$ and compute the stoping time.



    

Question 4   Use monte Carlo simulation to evaluate the optimal value function.