Halmstad 2006
Dynamic programming and Markov chains

Jean-Philippe CHANCELIER

0.1 Optimal stopping time problem: The house or secretary problem

We want to buy a house, and we decide which one to buy by visiting a fixed sequence of $N$ houses. The houses have values $v_1 < v_2 < \ldots < v_N$ which are not known in advance. We learn the value of a house at the time we visit it, and we can then compare it with the values of the previously visited houses. The probabilistic model is that the houses are visited in the order given by a random permutation $\sigma$ of $[1,N]$, drawn uniformly among all permutations. When we visit the $k$-th house we can either stop and buy it or continue; it is not possible to come back to a house that was not selected. Let $S_k$ be defined as follows:

$\displaystyle S_k = \left\{ \begin{array}{l} 1 \quad\mbox{if}\quad W_k =k \quad\mbox{where}\quad W_k = \mathop{\mathrm{Argmax}}_{j\in [1,k]} v_{\sigma(j)} \\ 0 \quad\mbox{otherwise} \end{array} \right.$

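$(S_k)_{k \in [1,N]}$ is a Markov chain with state space $\{0,1\}$, and its transition matrices $M^{(k)}$ are given by $M^{(k)}_{i,1} = 1/(k+1)$ and $M^{(k)}_{i,0} = k/(k+1)$ for $i \in \{0,1\}$. (Note that the $S_k$ are in fact independent, since these transition probabilities do not depend on the current state $i$.)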
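The transition probabilities above can be checked by brute force for a small number of houses. The following is a minimal Python sketch, assuming $v_i = i$ and enumerating all $N!$ equally likely visiting orders; the function names are hypothetical and chosen for illustration only.

```python
from fractions import Fraction
from itertools import permutations

def record_indicators(values):
    """S_k = 1 if the k-th visited value is the largest among the first k, else 0."""
    best = float("-inf")
    indicators = []
    for v in values:
        indicators.append(1 if v > best else 0)
        best = max(best, v)
    return indicators

def record_probabilities(N):
    """Exact P(S_k = 1) for k = 1..N, by enumerating all N! visiting orders."""
    perms = list(permutations(range(1, N + 1)))  # assumption: v_i = i
    counts = [0] * N
    for p in perms:
        for k, s in enumerate(record_indicators(p)):
            counts[k] += s
    return [Fraction(c, len(perms)) for c in counts]

def joint_record_probability(N, j, k):
    """Exact P(S_j = 1 and S_k = 1), used to illustrate independence."""
    perms = list(permutations(range(1, N + 1)))
    hits = 0
    for p in perms:
        s = record_indicators(p)
        if s[j - 1] == 1 and s[k - 1] == 1:
            hits += 1
    return Fraction(hits, len(perms))
```

With this sketch, `record_probabilities(N)[k]` is the probability that $S_{k+1} = 1$ and can be compared with $M^{(k)}_{i,1} = 1/(k+1)$, while `joint_record_probability(N, j, k)` can be compared with the product of the two marginal probabilities.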

Our problem is an optimal stopping problem: we have to decide at which time to stop in order to maximize the probability of having chosen the best house. We want to compute $u_1(1)$ and the associated optimal strategy (note that $S_1 \equiv 1$), where $u_n$ is given by:

$\displaystyle u_n(x) \equiv \sup_{\tau \;({\cal F}_n)\mbox{-stopping time},\; n \leq \tau \leq N} {\mathbb{E}} \left[ g_\tau ( S_\tau) \vert S_n =x \right],$

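with $g_k(1) \equiv k/N$ (the probability that the best of the first $k$ houses is also the best of all $N$) and $g_k(0)\equiv 0$ .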

We recall here that $u_n$ is the solution of the following recursive equation:

$\displaystyle u_n(x) = \max \left( \frac{1}{ n+1}u_{n+1}(1)+ \frac{n }{ n+1} u_{n+1}(0), \frac{n }{ N} {\mathbb{I}}_{ x \ne 0}\right) \quad; u_N(x)= g_N(x)=x \qquad (1)$
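As a small worked example of the recursion (an illustration added here, for $N = 3$), start from $u_3(x) = x$ and go backward:

$\displaystyle u_2(x) = \max \left( \frac{1}{3}u_{3}(1)+ \frac{2}{3} u_{3}(0), \frac{2}{3} {\mathbb{I}}_{ x \ne 0}\right) = \max \left( \frac{1}{3}, \frac{2}{3} {\mathbb{I}}_{ x \ne 0}\right), \quad\mbox{so}\quad u_2(1) = \frac{2}{3}, \; u_2(0) = \frac{1}{3}$

$\displaystyle u_1(1) = \max \left( \frac{1}{2}u_{2}(1)+ \frac{1}{2} u_{2}(0), \frac{1}{3} \right) = \max \left( \frac{1}{2}, \frac{1}{3} \right) = \frac{1}{2}$

With three houses it is therefore optimal to skip the first house and to stop at the first later house that is better than all the previous ones, which selects the best house with probability $1/2$.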

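Question 1 Write a program which computes $u_n(x)$ for $n\in[1,N]$ and draws on one graph the curves $n \mapsto u_n(0)$ and $n \mapsto u_n(1)$, together with the two functions compared in the recursion: the value of continuing, $n \mapsto \frac{1}{n+1}u_{n+1}(1)+ \frac{n}{n+1} u_{n+1}(0)$, and the value of stopping, $n \mapsto n/N$.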
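One possible way to approach Question 1 is sketched below in Python, using exact fractions from the standard library. This is a minimal sketch rather than a reference solution: the function name `value_functions` and the choice to return dictionaries keyed by $n$ are assumptions made for illustration, and plotting is left to whatever graphics tool is at hand.

```python
from fractions import Fraction

def value_functions(N):
    """Backward recursion (1) for n = N, N-1, ..., 1.

    Returns three dicts keyed by n:
      u0[n]   = u_n(0),
      u1[n]   = u_n(1),
      cont[n] = value of continuing at time n (defined for n < N only).
    """
    u0 = {N: Fraction(0)}  # u_N(0) = g_N(0) = 0
    u1 = {N: Fraction(1)}  # u_N(1) = g_N(1) = 1
    cont = {}
    for n in range(N - 1, 0, -1):
        # Expected value of u_{n+1}(S_{n+1}): S_{n+1} = 1 with probability 1/(n+1).
        cont[n] = (u1[n + 1] + n * u0[n + 1]) / (n + 1)
        u0[n] = cont[n]                        # stopping in state 0 is worth 0
        u1[n] = max(cont[n], Fraction(n, N))   # stopping in state 1 is worth n/N
    return u0, u1, cont

if __name__ == "__main__":
    N = 20
    u0, u1, cont = value_functions(N)
    # Columns: n, u_n(0), u_n(1), continuation value, stopping value n/N.
    for n in range(1, N):
        print(n, float(u0[n]), float(u1[n]), float(cont[n]), n / N)
```

The printed columns are the four curves to draw; the first $n$ at which $n/N$ exceeds the continuation value gives the threshold of the optimal strategy.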

Question 2 We assume here that the values $(v_i)_{i \in [1,N]}$ are given (for instance $v_i = i$). Use grand to obtain a random permutation of $(v_i)_{i \in [1,N]}$ .
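For readers working outside the environment that provides grand, the same step can be sketched in Python. Note that this swaps in the standard library call `random.sample` for grand, and that both $v_i = i$ and the function name `random_visit_order` are assumptions made for illustration.

```python
import random

def random_visit_order(N, rng=random):
    """Return the house values in a uniformly random visiting order.

    Assumption: v_i = i for i = 1..N. Sampling N items without replacement
    from a list of N items yields a uniformly distributed permutation.
    """
    values = list(range(1, N + 1))
    return rng.sample(values, k=N)

if __name__ == "__main__":
    # A seeded generator makes the trajectory reproducible.
    print(random_visit_order(10, random.Random(2006)))
```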

Question 3 Compute along the trajectory the value of $S_k$ and compute the optimal stopping time.
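A possible sketch for Question 3, again in Python and under stated assumptions: the optimal rule is taken to be "stop at the first $k$ with $S_k = 1$ for which the stopping value $k/N$ is at least the continuation value from recursion (1)", with a forced stop at $N$ if this never happens. The function names are hypothetical.

```python
from fractions import Fraction

def continuation_values(N):
    """c[n] = value of continuing at time n, from recursion (1); c[N] = 0."""
    c = {N: Fraction(0)}                 # nothing left to visit after time N
    u0, u1 = Fraction(0), Fraction(1)    # u_N(0), u_N(1)
    for n in range(N - 1, 0, -1):
        c[n] = (u1 + n * u0) / (n + 1)
        u0, u1 = c[n], max(c[n], Fraction(n, N))
    return c

def stopping_time(order):
    """Return (S, tau) for one visiting order.

    S[k-1] is S_k, and tau is the first time k with S_k = 1 and k/N >= c[k],
    or N if no such time exists.
    """
    N = len(order)
    c = continuation_values(N)
    best = float("-inf")
    S = []
    tau = None
    for k, v in enumerate(order, start=1):
        s_k = 1 if v > best else 0       # is house k the best seen so far?
        best = max(best, v)
        S.append(s_k)
        if tau is None and s_k == 1 and Fraction(k, N) >= c[k]:
            tau = k
    return S, (tau if tau is not None else N)
```

For example, `stopping_time([2, 1, 4, 3])` walks through four houses, records $S_k$ at each step, and returns the time at which the rule above stops.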

Question 4 Use Monte Carlo simulation to evaluate the optimal value function.
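Question 4 can be approached by simulating many visiting orders, applying the optimal rule to each, and counting how often the best house is chosen; the resulting frequency estimates $u_1(1)$. The Python sketch below assumes $v_i = i$, uses `random.sample` to draw permutations, and uses hypothetical function names; it is one way to set up the experiment, not the only one.

```python
from fractions import Fraction
import random

def stop_allowed(N):
    """Return (flags, u_1(1)); flags[k] is True when stopping on S_k = 1 is optimal."""
    flags = [False] * (N + 1)            # index 0 is unused
    flags[N] = True                      # at time N there is nothing to wait for
    u0, u1 = Fraction(0), Fraction(1)    # u_N(0), u_N(1)
    for n in range(N - 1, 0, -1):
        cont = (u1 + n * u0) / (n + 1)   # continuation value in recursion (1)
        flags[n] = Fraction(n, N) >= cont
        u0, u1 = cont, max(cont, Fraction(n, N))
    return flags, u1                     # after the loop, u1 holds u_1(1)

def picks_best(order, flags):
    """Apply the rule to one visiting order; True if the best house is bought."""
    best = float("-inf")
    for k, v in enumerate(order, start=1):
        if v > best:                     # S_k = 1
            best = v
            if flags[k]:
                return v == max(order)
    return False                         # forced stop at N on a non-best house

def monte_carlo_value(N, n_samples, seed=0):
    """Return (Monte Carlo estimate, exact u_1(1) as a float)."""
    rng = random.Random(seed)
    flags, exact = stop_allowed(N)
    houses = list(range(1, N + 1))       # assumption: v_i = i
    hits = sum(picks_best(rng.sample(houses, N), flags) for _ in range(n_samples))
    return hits / n_samples, float(exact)

if __name__ == "__main__":
    print(monte_carlo_value(20, 50000))
```

Comparing the two returned numbers for increasing `n_samples` shows the estimate approaching the value given by the backward recursion.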
