Inventory Control

Michel De Lara
(last modification date: October 10, 2017)


Contents

1 Expected costs minimization
2 Stochastic viability

Let time t be measured in discrete units (such as days, weeks or months). Consider the inventory problem

x(t + 1) = x(t) + u(t) − w(t),   u(t) ≥ 0,
(1)

where

- x(t) is the stock available at the beginning of period t,
- u(t) is the stock ordered at the beginning of period t (and delivered immediately),
- w(t) is the demand during period t.

When x(t) < 0, this corresponds to a backlogged demand, supposed to be filled immediately once inventory is again available.

1 Expected costs minimization

(Bertsekas, 2000)

The manager's problem is one of cost minimization, where

- c is the unitary purchasing cost,
- b is the unitary shortage cost,
- h is the unitary holding cost.

The costs incurred in period t are

c u(t) + b max{0, −(x(t) + u(t) − w(t))} + h max{0, x(t) + u(t) − w(t)}.
(2)

The three terms are, respectively, the purchasing, the shortage and the holding costs.

For simplicity, we shall denote

ℛ(x) := b max{0, −x} + h max{0, x}.
(3)

Over the period from t0 to T, the costs sum up to

∑_{t=t0}^{T−1} [c u(t) + ℛ(x(t) + u(t) − w(t))].
(4)

We shall suppose that the unitary costs are ranked as follows:

b > c > 0,

so that one unit of shortage costs more than one unit of purchase.

We suppose that w(t), the uncertain demand, is a random variable with distribution p0, ..., pN on the set {0, ..., N}:

ℙ{w(t) = 0} = p0, ..., ℙ{w(t) = N} = pN.
(5)

We also suppose that the random variables

w(⋅) = (w(t0),...,w(T −  1))
(6)

are independent.

A decision rule 𝔲 : {t0, ..., T − 1} × 𝕏 → 𝕌 assigns a (stock) order u = 𝔲(t, x) to any period t and any state x of inventory stock. Once the rule is given, we obtain random trajectories

x(t0) = x0,
u(t) = 𝔲(t, x(t)),
x(t + 1) = x(t) + u(t) − w(t),

and thus random costs

Crit_𝔲(t0, x0, w(⋅)) := ∑_{t=t0}^{T−1} [c u(t) + ℛ(x(t) + u(t) − w(t))].
(7)

The expected costs are

𝔼[Crit_𝔲(t0, x0, w(⋅))] = 𝔼[∑_{t=t0}^{T−1} c u(t) + ℛ(x(t) + u(t) − w(t))].
(8)

The dynamic programming equation associated with the problem of minimizing the expected costs is

V(T, x) = 0,
V(t, x) = inf_{u≥0} 𝔼[c u + ℛ(x + u − w) + V(t + 1, x + u − w)],
(9)

where w is a random variable with distribution p0, ..., pN on the set {0, ..., N}; V(T, x) = 0 is the final cost, c u + ℛ(x + u − w) the instantaneous cost, and x + u − w the future stock.

We have that

V(T − 1, x) = inf_{u≥0} 𝔼[c u + ℛ(x + u − w)] = inf_{u≥0} [c u + ∑_{i=0}^{N} p_i ℛ(x + u − i)].
(10)

Question 1 Recalling that b > c > 0, show that g(z) = 𝔼[c z + ℛ(z − w)] is a convex function with a minimum achieved at some S_{T−1} ∈ {0, ..., N}. You can draw a graph of this function for N = 1 and p0 = 1/2, as in the sketch below.
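As a check, one can plot g for N = 1 and p0 = 1/2. The sketch below is illustrative and hard-codes the unit costs c = 100, b = 150, h = 20 used in the scripts of the following questions.

  // a minimal sketch: graph of g(z) = E[c z + R(z - w)] for N=1, p0=1/2,
  // with the illustrative unit costs c=100, b=150, h=20 used below
  c=100; b=150; h=20;
  zz=-3:0.1:3;
  // R(z) = b*max(0,-z) + h*max(0,z), and w equals 0 or 1 with probability 1/2
  gg=c*zz+0.5*(b*maxi(-zz,0)+h*maxi(zz,0))+0.5*(b*maxi(1-zz,0)+h*maxi(zz-1,0));
  plot2d(zz,gg);
  xtitle("g(z) for N=1, p0=1/2");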

Observe that 𝔼[cu + ℛ (x + u − w)] = − cx + g(x + u)  , and deduce that the optimal rule is

𝔲(T − 1, x) = S_{T−1} − x   if x < S_{T−1},
𝔲(T − 1, x) = 0             if x ≥ S_{T−1}.
(11)

Interpret this rule, sometimes called a base-stock policy.

Question 2 Show by induction that there exist thresholds S_{t0}, ..., S_{T−1} in {0, ..., N} such that the optimal rule at period t is

𝔲(t, x) = S_t − x   if x < S_t,
𝔲(t, x) = 0         if x ≥ S_t.
(12)

We will now make numerical simulations and try different rules.

Question 3 Sample one random sequence of T − t0 demands with the macro grand. Picture the trajectories of the stocks corresponding to the constant decision rule 𝔲(t, x) = 2. Evaluate the costs as in (4).

  // exec inventory_control.sce
  
  // Costs functions parameters
  purchasing=100;
  shortage=150;
  holding=20;
  
  // Shortage/Holding costs
  function c=SHcosts(zz)
    c=shortage*maxi(-zz,0)+holding*maxi(zz,0);
  endfunction
  
  // Instantaneous costs function
  function c=instant_costs(xx,uu,ww)
    c=purchasing*uu+SHcosts(xx+uu-ww);
  endfunction
  
  // Decision rule
  function u=constant_rule(t,x)
    u=2;
  endfunction
  
  // Trajectories simulations 
  
  function [XX,UU,CC]=trajectories(simulations,rule)
    XX=[];
    UU=[];
    CC=[];
    WW=grand(horizon,'markov',transition,ones(1,simulations));
  
    for s=1:simulations do
      xx=0;
      uu=[];
      cc=0;
      ww=WW(s,:)-1;   // grand returns states in {1,...,N+1}; shift to demands in {0,...,N}
      for t=0:(horizon-1) do
        uu=[uu,rule(t,xx($))];
        xx=[xx,xx($)+uu($)-ww(t+1)];
        cc=cc+instant_costs(xx($),uu($),ww(t+1));
      end
      //  plot2d([0:horizon],xx)
      XX=[XX;xx];
      UU=[UU;uu];
      CC=[CC;cc];
    end
    //
    disp('expected costs are '+string(mean(CC)));
  endfunction
  
  horizon=12;
  proba=0.01*[40,20,30,10];
  
  // simulation of iid sequences of length horizon
  transition=ones(proba')*proba;
  
  // Number of Monte Carlo simulations (try also 1 or 100)
  simulations=10;
  
  // Trajectories simulations and visualization
  xset("window",21);xbasc();
  [XX,UU,CC]=trajectories(simulations,constant_rule);
  plot2d(0:horizon,XX')
  
  constant_rule_costs=mean(CC)

Question 4 Draw 100 random sequences of T − t0 demands. Picture the trajectories of the stocks corresponding to the constant decision rules 𝔲_k(t, x) = k, for k = 0, 1, 2, 3. What do you observe? Evaluate the expected costs by Monte Carlo. What is the optimal rule among the constant decision rules?

  // number of Monte Carlo simulations
  simulations=100;
  
  [XX,UU,CC]=trajectories(simulations,constant_rule);
  plot2d(0:horizon,XX')
  
  constant_rule_costs=mean(CC)
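One may also try a base-stock rule of the form (12), as suggested by Questions 1 and 2, before computing the exact dynamic programming solution. This is a sketch: the function name base_stock_rule and the threshold S = 2 are illustrative choices, not part of the original script.

  // a sketch: base-stock decision rule of the form (12), illustrative threshold S=2
  function u=base_stock_rule(t,x)
    S=2;
    u=maxi(S-x,0);   // order up to the level S when x < S, otherwise do not order
  endfunction

  [XX,UU,CC]=trajectories(simulations,base_stock_rule);
  plot2d(0:horizon,XX')
  base_stock_costs=mean(CC)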

Question 5 Picture the trajectories of the stocks corresponding to the following optimal decision rule. Compare these trajectories with those obtained with the constant decision rules above. What do you observe? Evaluate the expected costs by Monte Carlo. Compare with the expected costs given by the best constant decision rule.

  ////////////////////////////////////////////////////////////////////////
  //    STOCHASTIC DYNAMIC PROGRAMMING EQUATION 
  ////////////////////////////////////////////////////////////////////////
  
  function [FEEDBACK,state_min]=SDP()
    states=[(mini(controls)-maxi(demands))*(horizon): ...
            (maxi(controls)-mini(demands))*(horizon)];
    // Due to bounds on controls and demands, the state is bounded a priori.
  
    cardinal_states=prod(size(states));
    cardinal_controls=prod(size(controls));
    cardinal_demands=prod(size(demands));
  
    state_min=mini(states);
    state_max=maxi(states);
  
    function i=index(x)
      i=x-state_min+1;
    endfunction
  
    function xdot=dynamics(x,u,w)
      xdot=maxi(state_min,mini(state_max,x+u-w));
    endfunction
  
    // (horizon+1) x cardinal_states matrices, rows indexed by tt=t+1
    VALUE=zeros([0:horizon]'*states);
    FEEDBACK=zeros([0:horizon]'*states);
  
    VALUE(horizon+1,:)=zeros(states);
  
    shift=1+[(horizon-1):(-1):0];
    // backward induction over tt=t+1 down to tt=1 (that is, t=0),
    // so that FEEDBACK(1,:) is filled for the first period
    for tt=shift do
      loc=zeros(cardinal_controls,cardinal_states);
      // local variable containing the values of the function to be minimized
      for jj=1:cardinal_controls do
        uu=controls(jj);
        loc(jj,:)=0;
        // the following loop computes an expectation
        for dd=1:cardinal_demands do
          ww=demands(dd);
          loc(jj,:)=loc(jj,:)+ ...
                    proba(dd)* ...
                    (instant_costs(states,uu,ww)+ ...
                     VALUE(tt+1,index(dynamics(states,uu,ww))));
        end;
      end
      //  
      [mmn,jjn]=mini(loc,'r');
      // mmn is the minimum over controls, jjn the index of a minimizing control
      //
      VALUE(tt,:)=mmn;
      // minimal cost 
      FEEDBACK(tt,:)=controls(jjn);
      // optimal feedback 
    end
  endfunction
  
  
  horizon=12;
  proba=0.01*[40,20,30,10];
  transition=ones(proba')*proba;
  demands=[0:(prod(size(proba))-1)];
  controls=[0:2*(prod(size(proba)))];
  
  [FEEDBACK,state_min]=SDP();
  
  // Decision rule
  function u=optimal_rule(t,x)
    tt=t+1;
    u=FEEDBACK(tt,x-state_min+1);
  endfunction
  //
  
  
  
  // Trajectories simulations and visualization
  xset("window",22);xbasc();
  [XX,UU,CC]=trajectories(simulations,optimal_rule);
  plot2d([0:horizon],XX')
  xtitle("Stock trajectories (minimal cost)");
  
  optimal_costs=mean(CC)
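To answer the comparison asked in Question 5, one can simply display the two Monte Carlo estimates computed above:

  // compare the Monte Carlo cost estimates of the two rules
  disp('constant rule costs: '+string(constant_rule_costs));
  disp('optimal rule costs: '+string(optimal_costs));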

Question 6 Increase the unitary holding cost h and decrease the unitary shortage cost b. What do you observe?
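A possible experiment for Question 6; the new values below are arbitrary illustrations, chosen so that holding becomes more expensive than shortage.

  // illustrative parameter change: expensive holding, cheap shortage
  holding=150;
  shortage=20;
  [FEEDBACK,state_min]=SDP();
  [XX,UU,CC]=trajectories(simulations,optimal_rule);
  plot2d(0:horizon,XX')
  // (restore holding=20; shortage=150; before the next questions)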

Question 7 Decrease the horizon T to T = 4 and the maximal demand to 2. Examine the matrix FEEDBACK. Show that the optimal rule is of the form (12). What are the values of the thresholds S_{t0}, ..., S_{T−1}?

  horizon=4;
  proba=0.01*[40,20,40];
  demands=[0:(prod(size(proba))-1)];
  controls=[0:10];
  
  [FEEDBACK,state_min]=SDP();
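One way to read the thresholds off the FEEDBACK matrix: under (12), x + 𝔲(t, x) = max(x, S_t), whose minimum over the states x is S_t. The sketch below rebuilds the state grid from state_min, assuming the discretization used inside SDP.

  // a sketch: recover the thresholds S_t, assuming the rule has the form (12)
  states=state_min+(0:(size(FEEDBACK,'c')-1));
  // under (12), x+u(t,x)=max(x,S_t), so its minimum over x equals S_t
  S=mini(ones(horizon,1)*states+FEEDBACK(1:horizon,:),'c');
  disp(S');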


Figure 1: Optimal stock trajectories

Question 8 Suppose now that there is a fixed cost K > 0 for ordering stock. Simulate optimal trajectories of the stocks and of the controls. What do you observe for the controls trajectories?

  K=30;
  
  // Instantaneous costs function
  function c=instant_costs(xx,uu,ww)
    // sign(uu)=1 when uu>0, so K is charged only when an order is placed
    c=K*sign(uu)+purchasing*uu+SHcosts(xx+uu-ww);
  endfunction
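A sketch of the corresponding runs, reusing the functions above; with a fixed ordering cost, one may expect the controls to be of (s, S) type, with orders placed only from time to time rather than at every period.

  // re-run dynamic programming and simulations with the fixed ordering cost K
  [FEEDBACK,state_min]=SDP();
  xset("window",24);xbasc();
  [XX,UU,CC]=trajectories(simulations,optimal_rule);
  plot2d(0:(horizon-1),UU')
  xtitle("Control trajectories (fixed ordering cost)");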

2 Stochastic viability

(De Lara and Doyen, 2008)

Now, the manager aims at maximizing the probability

ℙ{x(t) ≥ x♭  and  c u(t) + h max{0, x(t) + u(t) − w(t)} ≤ C♯,  ∀t = t0, ..., T − 1}
(13)

that stocks are above a critical level x♭ (to limit unsatisfied clients) and that purchasing and holding costs are bounded above by C♯.

The associated dynamic programming equation is

V(T, x) = 1,
V(t, x) = sup_{u≥0} 𝔼[1{x ≥ x♭} 1{c u + h max{0, x + u − w} ≤ C♯} × V(t + 1, x + u − w)],
(14)

where w is a random variable with distribution p0, ..., pN on the set {0, ..., N}.

  ////////////////////////////////////////////////////////////////////////
  //    STOCHASTIC VIABILITY DYNAMIC PROGRAMMING EQUATION 
  ////////////////////////////////////////////////////////////////////////
  
  // The following code contains a bug: 
  // it produces probability value functions larger than 1.
  // 
  
  function [FEEDBACK,state_min]=SVSDP()
    states=[(mini(controls)-maxi(demands))*(horizon): ...
            (maxi(controls)-mini(demands))*(horizon)];
  
    cardinal_states=prod(size(states));
    cardinal_controls=prod(size(controls));
    cardinal_demands=prod(size(demands));
  
    state_min=mini(states);
    state_max=maxi(states);
  
    function i=index(x)
      i=x-state_min+1;
    endfunction
  
    function xdot=dynamics(x,u,w)
      xdot=maxi(state_min,mini(state_max,x+u-w));
    endfunction
  
    VALUE=zeros([0:horizon]'*states);
    FEEDBACK=zeros([0:horizon]'*states);
  
    VALUE(horizon+1,:)=ones(states);
  
    shift=1+[(horizon-1):(-1):0];
    // backward induction down to tt=1, as in SDP above
    for tt=shift do
      loc=zeros(cardinal_controls,cardinal_states);
      // local variable containing the values of the function to be maximized
      for jj=1:cardinal_controls do
        uu=controls(jj);
        loc(jj,:)=0;
        // the following loop computes an expectation
        for dd=1:cardinal_demands do
          ww=demands(dd);
          loc(jj,:)=loc(jj,:)+ ...
                    proba(dd)* ...
                    (bool2s(states >= stockflat) .* ...
                     bool2s((purchasing*uu+holding*maxi(states+uu-ww,0)) <= costsharp) .* ...
                     VALUE(tt+1,index(dynamics(states,uu,ww))));
        end;
      end
      //  
      [mmx,jjx]=maxi(loc,'r');
      // mmx is the maximum over controls, jjx the index of a maximizing control
      //
      VALUE(tt,:)=mmx;
      // maximal probability
      FEEDBACK(tt,:)=controls(jjx);
      // optimal feedback 
    end
    //
  endfunction
  
  
  horizon=12;
  proba=0.01*[40,20,30,10];
  transition=ones(proba')*proba;
  demands=[0:(prod(size(proba))-1)];
  controls=[0:2*(prod(size(proba)))];
  
  stockflat=-2;
  costsharp=1000;
  
  [FEEDBACK,state_min]=SVSDP();
  
  // Decision rule
  function u=optimal_viable_rule(t,x)
    tt=t+1;
    u=FEEDBACK(tt,x-state_min+1);
  endfunction
  //
  
  
  // Trajectories simulations and visualization
  xset("window",23);xbasc();
  [XX,UU,CC]=trajectories(simulations,optimal_viable_rule);
  plot2d([0:horizon],XX')
  xtitle("Stock trajectories (maximal viability probability)");
  
  // Take care: CC is not the right criterion
  // The additive criterion should be replaced by a multiplicative one
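As the comment above notes, CC is the additive cost, not the criterion (13). Here is a sketch of a direct Monte Carlo estimate of the viability probability from the simulated trajectories; it assumes, as in trajectories, that XX stores x(t0), ..., x(T) column-wise and UU stores u(t0), ..., u(T − 1).

  // a sketch: Monte Carlo estimate of the viability probability (13)
  ok_stock=XX(:,1:horizon)>=stockflat;   // x(t) >= x-flat for t=t0,...,T-1
  ok_costs=purchasing*UU+holding*maxi(XX(:,2:horizon+1),0)<=costsharp;
  viable=and(ok_stock&ok_costs,'c');     // one boolean per simulated trajectory
  viability_probability=mean(bool2s(viable))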

References

   D. P. Bertsekas. Dynamic Programming and Optimal Control. Athena Scientific, Belmont, Massachusetts, second edition, 2000. Volumes 1 and 2.

   M. De Lara and L. Doyen. Sustainable Management of Natural Resources. Mathematical Models and Methods. Springer-Verlag, Berlin, 2008.