Optimal Dam Management (Deterministic and Under Uncertainty)

Michel De Lara and Vincent Leclère
(last modification date: October 10, 2017)


1 Prerequisite Scilab code
2 Problem statement
2.1 Dam dynamics
2.2 Intertemporal payoff criterion
2.3 Uncertainties and scenarios
2.4 Generation of trajectories and payoff by a policy
2.5 Numerical data
3 Evaluation of a strategy
4 Optimization in a deterministic setting
4.1 Zero final value of water
4.2 Determining the final value of water
4.3 Introducing a constraint on the water level during summer months
4.4 Closed-loop vs open-loop control
5 Optimization in a stochastic setting
5.1 Probabilistic model on water inputs and expected criterion
5.2 Simulation of strategies in a stochastic setting
5.3 Open-loop control of the dam in a probabilistic setting
5.4 Stochastic Dynamic Programming Equation
5.4.1 Decision-Hazard framework
5.4.2 Hazard-Decision framework
5.5 (Anticipative) upper bound for the payoff
6 Correction

1 Prerequisite Scilab code

You will have to fill in the following skeleton of Scilab code during the practical course.

dam_management2.sce

2 Problem statement

We consider a dam manager intending to maximize the intertemporal payoff obtained by selling the power produced by water releases.

2.1 Dam dynamics

Consider the dam dynamics $s_{t+1} = \mathrm{dyn}(s_t, u_t, a_t)$, with

$$\mathrm{dyn}(s, u, a) = \max\{\underline{s}, \min\{\bar{s}, s - u + a\}\}, \tag{1}$$

where

time t ∈ [[t₀,T]] is discrete (such as days, weeks or months),
s_t is the stock level (in water volume) at the beginning of period [t,t + 1[, belonging to $\mathbb{S} = [[\underline{s}, \bar{s}]]$, where $\underline{s}$ and $\bar{s}$ are the minimum and maximum volumes of water in the dam,
the variable u_t is the control decided at the beginning of period [t,t + 1[, belonging to $\mathbb{U} = [[0, \bar{u}]]$ (it can be seen as the duration during which the turbine is open; the effective water release is $\min\{s_t + a_t, u_t\}$, necessarily no more than the water s_t + a_t available in the dam at the moment of turbining),
a_t is the water inflow (rain, hydrology, etc.) during the period [t,t + 1[.
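The dynamics (1) and the effective release can be sketched in a few lines. This is a minimal Python illustration, separate from the Scilab skeleton dam_management2.sce; the function names and the default bounds (taken from the numerical data of Section 2.5) are assumptions made for the example.

```python
def dyn(s, u, a, s_min=1, s_max=60):
    """Stock at t+1 from equation (1): s - u + a clipped to [s_min, s_max].

    The default bounds are the numerical values of Section 2.5 (in hm3).
    """
    return max(s_min, min(s_max, s - u + a))


def effective_release(s, u, a):
    """Water actually turbined: it cannot exceed the available volume s + a."""
    return min(s + a, u)
```

For instance, `dyn(58, 0, 10)` saturates at the maximum volume, and `effective_release(3, 10, 2)` is limited by the 5 units available.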

2.2 Intertemporal payoff criterion

We consider a problem of payoff maximization where turbining one unit of water has unitary price p_t. Over the period from t₀ to T, the payoffs sum up to

$$\sum_{t=t_0}^{T-1} p_t \min\{s_t + a_t, u_t\} + K(s_T), \tag{2}$$

where K is the final value of the water remaining in the dam.

2.3 Uncertainties and scenarios

Both the inflows a_t and the prices p_t are uncertain variables. We denote by $w_t := (a_t, p_t)$ the pair of uncertainties at time t. A scenario

$$w(\cdot) := (w_{t_0}, \dots, w_T) \tag{3}$$

is a sequence of uncertainties, inflows and prices, from initial time t₀ up to the horizon T.

2.4 Generation of trajectories and payoff by a policy

A policy ϕ : [[t₀,T − 1]] × 𝕊 → 𝕌 assigns a control $u = ϕ(t,s)$ to any state s of dam stock volume and to any time t ∈ [[t₀,T − 1]].

Given a policy ϕ and a scenario w(⋅) as in (3), we obtain a volume trajectory $s(\cdot) := (s_{t_0}, \dots, s_{T-1}, s_T)$ and a control trajectory $u(\cdot) := (u_{t_0}, \dots, u_{T-1})$ produced by the “closed-loop” dynamics

$$s_{t_0} = s_0, \qquad u_t = \phi(t, s_t), \qquad s_{t+1} = \mathrm{dyn}(s_t, u_t, a_t). \tag{4}$$

Plugging the trajectories $s(\cdot)$ and $u(\cdot)$ given by (4) into the criterion (2), we obtain the evaluation

$$\mathrm{Crit}^{\phi}(t_0, s_0) := \sum_{t=t_0}^{T-1} p_t \min\{s_t + a_t, u_t\} + K(s_T). \tag{5}$$
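The closed-loop recursion (4) and the evaluation (5) can be illustrated as follows. This is a Python sketch under stated assumptions, not the Scilab macro simulation_det: the name `simulate`, the list-based scenario format, the convention t₀ = 0 and the default zero final value are choices made for the example.

```python
def dyn(s, u, a, s_min=1, s_max=60):
    """Dam dynamics (1), with the bounds of Section 2.5 as defaults."""
    return max(s_min, min(s_max, s - u + a))


def simulate(policy, inflows, prices, s0, final_value=lambda s: 0.0):
    """Run the closed-loop dynamics (4) and accumulate the criterion (5).

    policy(t, s) returns the control; inflows and prices are lists indexed
    by t = 0, ..., T-1 (t0 = 0 is assumed here).
    """
    stocks, controls, payoff = [s0], [], 0.0
    for t, (a, p) in enumerate(zip(inflows, prices)):
        s = stocks[-1]
        u = policy(t, s)
        payoff += p * min(s + a, u)      # instantaneous payoff
        controls.append(u)
        stocks.append(dyn(s, u, a))      # next stock level
    payoff += final_value(stocks[-1])    # final value K(s_T)
    return payoff, stocks, controls
```

The stock trajectory has one more entry than the control trajectory, in line with the sizes T − t₀ + 1 and T − t₀.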

2.5 Numerical data

We consider a weekly management over a year, that is t₀ = 0 and T = 52, with

$$s_0 = 30\ \mathrm{hm}^3, \qquad \underline{s} = 1\ \mathrm{hm}^3, \qquad \bar{s} = 60\ \mathrm{hm}^3, \qquad \bar{u} = 10\ \mathrm{hm}^3. \tag{6}$$

Prices and inﬂows data can be downloaded using the following html links:

3 Evaluation of a strategy

We begin by generating a scenario of inflows and prices as follows.

Question 1 Download and execute in Scilab the file dam_management2.sce, so that a certain number of macros are now available. Generate one scenario by calling the macro PricesInflows with the argument set to 1 (corresponding to a single scenario generation).

Then, we are going to implement the code corresponding to §2.4 under the form of a macro simulation_det whose input arguments are a scenario and a policy (itself given under the form of a macro with two inputs and one output).

Question 2 Complete the Scilab macro simulation_det by implementing a time loop from initial time t₀ up to the horizon T. Within this loop, follow the dynamics (4) and use formula (5) to compute the payoff. The outputs of this macro will be the gain (5) (scalar), and the state and control trajectories (vectors of sizes T − t₀ + 1 and T − t₀) given by (4).

This done, we are going to test the above macro simulation_det with simple strategies.

Question 3 Using the macro strat_constant, compute the payoff and the state and control trajectories attached to the scenario generated in Question 1. Plot the evolution of the stock level as a function of time. Design other policies, for instance the constant strategies (u_t = k for k ∈ [[Dmin,Dmax]]) and the myopic strategy, which maximizes the instantaneous profit $p_t \min\{s_t + a_t, u_t\}$. Compare the payoffs given by these different strategies.
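As an illustration of what such policies look like as functions of (t, s), here is a small Python sketch (the names are hypothetical and independent of the Scilab macro strat_constant). It assumes nonnegative prices, in which case the instantaneous profit $p_t \min\{s_t + a_t, u_t\}$ is nondecreasing in u_t and the myopic strategy simply releases as much as allowed.

```python
U_MAX = 10  # maximal release (u bar) from the numerical data (6), in hm3


def make_constant_policy(k):
    """Return the policy phi(t, s) = k, whatever the time and the stock."""
    def policy(t, s):
        return k
    return policy


def myopic_policy(t, s):
    """Myopic choice, assuming p_t >= 0: the largest admissible release."""
    return U_MAX
```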

4 Optimization in a deterministic setting

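In a deterministic setting, we consider that the sequences of prices p(⋅) and inflows a(⋅) are known, and we optimize accordingly. The optimization problem we consider is

$$\max_{u(\cdot)} \; \sum_{t=t_0}^{T-1} p_t \min\{s_t + a_t, u_t\} + K(s_T) \tag{7}$$

subject to $s_{t_0} = s_0$, $s_{t+1} = \mathrm{dyn}(s_t, u_t, a_t)$ and $u_t \in \mathbb{U}$ for all t ∈ [[t₀,T − 1]].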

The theoretical Dynamic Programming equation is

$$
\begin{cases}
V(T, s) = \underbrace{K(s)}_{\text{final profit}}, \\[4pt]
V(t, s) = \max\limits_{0 \le u \le \bar{u}} \Big[ \underbrace{p_t \min\{s + a_t, u\}}_{\text{instantaneous profit}} + V\big(t + 1, \underbrace{\mathrm{dyn}(s, u, a_t)}_{\text{future stock level}}\big) \Big].
\end{cases}
\tag{10}
$$
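The backward recursion (10) can be sketched as follows on an integer grid of stocks and controls. This is a Python illustration under stated assumptions (integer volumes, bounds from (6), t₀ = 0), not the Scilab macro optim_det; the name `bellman_det` and the data layout are hypothetical.

```python
def dyn(s, u, a, s_min=1, s_max=60):
    """Dam dynamics (1)."""
    return max(s_min, min(s_max, s - u + a))


def bellman_det(inflows, prices, final_value, s_min=1, s_max=60, u_max=10):
    """Solve (10) backward in time for a known scenario.

    Returns V (list of dicts, V[t][s]) and policy (policy[t][s] is an
    optimal integer control; ties are broken toward the smallest u).
    """
    T = len(inflows)
    states = range(s_min, s_max + 1)
    V = [dict() for _ in range(T + 1)]
    policy = [dict() for _ in range(T)]
    for s in states:
        V[T][s] = final_value(s)                  # V(T, s) = K(s)
    for t in reversed(range(T)):
        a, p = inflows[t], prices[t]
        for s in states:
            best_u, best_val = 0, float("-inf")
            for u in range(u_max + 1):
                val = p * min(s + a, u) + V[t + 1][dyn(s, u, a, s_min, s_max)]
                if val > best_val:
                    best_u, best_val = u, val
            V[t][s] = best_val
            policy[t][s] = best_u
    return V, policy
```

On a two-period example with a low price followed by a high price, the recursion keeps water for the second period whenever the turbine capacity allows it.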

4.1 Zero final value of water

Here, we fix the final value K of water in problem (7) to 0. This means that the water remaining in the dam at time T is of no economic interest to the manager.

Question 4 Write the theoretical Dynamic Programming equation attached to Problem (7).

Then

complete the Scilab macro optim_det that computes the Bellman value of this problem,
complete the Scilab macro simulation_det_Bellman, which constructs the optimal strategy given a Bellman value function,
simulate the stock trajectory using the macro simulation_det,
plot the evolution of the water levels, of the prices and of the controls.

What can you say about the level of water at the end of the period? Can you explain it?

Question 5 Theoretically, what other mathematical methods could have been used to solve the dynamic optimization problem (7)?

4.2 Determining the final value of water

We are optimizing the dam over one year. However, at the end of this year the dam manager will still have to manage the dam; thus, the more water in the dam at the end, the better for the next year. The question is how to determine this value.

The main idea is the following: we want to optimize the management of our dam over a very long time, but we would like to actually solve the problem only over the first year, representing the remaining years by the final value function K in (7). Thus K(s) should represent how much we are going to earn during the remaining time, starting at state s.

Question 6 Consider the optimal strategy s_N^∗ obtained when we solve problem (7) over N years, with zero final value (K = 0). Using the Dynamic Programming Principle, find the theoretical function K_N such that the restriction of the strategy s_N^∗ to the first year is optimal for the one-year problem (7) with final value K = K_N.

Thus, choosing the final value K = K_N means that we take into consideration the gains over N years. We would like to let N go to infinity; however, $K_{N+1} − K_N$ is roughly the gain during one year, so K_N will not converge. In the following question we will find a way of determining a final value that converges with N and represents the problem over a long time.

Question 7 Consider the optimal control problem (7) with final value K, and the same problem with final value K + c, where c is a constant. What can you say about their optimal strategies? About their optimal values?

If K is the value of remaining water, what should be the value of $K(\underline{s})$ (in the sense of how much the future manager of the dam is ready to pay for you to keep the minimum water level in the dam)?

How do you understand the macro final_value_det? Test it and comment on it. Plot the final value obtained as a function of the stock level.
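The effect of shifting the final value by a constant can be read directly from the criterion (5). The short derivation below is a sketch; the notation $\mathrm{Crit}^{\phi}_{K}$ and $V_K$ (criterion and optimal value with final value K) and the normalized function $\tilde{K}_N$ are introduced here for illustration only. For every policy ϕ,

$$\mathrm{Crit}^{\phi}_{K+c}(t_0, s_0) = \sum_{t=t_0}^{T-1} p_t \min\{s_t + a_t, u_t\} + K(s_T) + c = \mathrm{Crit}^{\phi}_{K}(t_0, s_0) + c,$$

so the two problems have the same maximizers and $V_{K+c} = V_K + c$. One natural normalization, consistent with a zero value for the minimum stock, is therefore

$$\tilde{K}_N(s) := K_N(s) - K_N(\underline{s}), \qquad \tilde{K}_N(\underline{s}) = 0,$$

which removes the roughly constant yearly gain that prevents K_N from converging.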

4.3 Introducing a constraint on the water level during summer months

For environmental and touristic reasons, the level of water in the dam is constrained. We require that, during the summer months (weeks 25 to 40), the level of water in the dam be above a minimal level s′.

Question 8 Recall that a constraint can be integrated into the criterion: whenever the constraint is violated, the cost should be +∞ (equivalently, since we maximize here, the payoff should be −∞).

Create a Scilab macro optim_det_constrained to integrate this constraint.

Compare the evolution (for different minimal levels s′) of stock trajectories and optimal values. What can you say about it? What should you do in order to compute a final value of water adapted to the problem with constraints?
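The penalization idea can be sketched as follows: in the backward recursion, any state that violates the level constraint at a constrained date receives the value −∞, so no optimal trajectory passes through it. This Python sketch is not the Scilab macro optim_det_constrained; the function name, the convention that the constraint applies to the stock s_t at the beginning of each constrained week, and the zero final value are assumptions.

```python
NEG_INF = float("-inf")


def dyn(s, u, a, s_min=1, s_max=60):
    """Dam dynamics (1)."""
    return max(s_min, min(s_max, s - u + a))


def bellman_constrained(inflows, prices, level_min, constrained_weeks,
                        s_min=1, s_max=60, u_max=10):
    """Deterministic DP with K = 0 and the constraint s_t >= level_min
    for t in constrained_weeks, enforced by a -infinity payoff."""
    T = len(inflows)
    states = range(s_min, s_max + 1)

    def allowed(t, s):
        return t not in constrained_weeks or s >= level_min

    V_next = {s: (0.0 if allowed(T, s) else NEG_INF) for s in states}
    policy = []
    for t in reversed(range(T)):
        a, p = inflows[t], prices[t]
        V_now, pol = {}, {}
        for s in states:
            if not allowed(t, s):
                V_now[s], pol[s] = NEG_INF, 0    # infeasible state
                continue
            vals = [p * min(s + a, u) + V_next[dyn(s, u, a, s_min, s_max)]
                    for u in range(u_max + 1)]
            best_u = max(range(u_max + 1), key=lambda u: vals[u])
            V_now[s], pol[s] = vals[best_u], best_u
        V_next = V_now
        policy.insert(0, pol)
    return V_next, policy    # V_next is now the value function at t = 0
```

For the summer constraint of this section, one would call it with `constrained_weeks=set(range(25, 41))`; on a toy two-week instance the constraint visibly lowers the optimal value.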

4.4 Closed-loop vs open-loop control

A closed-loop strategy is a policy given by ϕ : [[t₀,T − 1]] × 𝕊 → 𝕌, which assigns a water turbined $u = ϕ(t,s)$ to any state s of dam stock volume and to any decision period t ∈ [[t₀,T − 1]], whereas an open-loop strategy is a predetermined planning of control, that is a function ϕ : [[t₀,T − 1]] → 𝕌.

Let us note that, formally, an open-loop strategy is a closed-loop strategy.

Question 9 In a deterministic setting show that a closed-loop strategy is equivalent to an open-loop strategy in the sense that, for a given initial stock s₀, the stock and control trajectories of (4) will be the same.

Write a Scilab macro that constructs an optimal open-loop strategy from the optimal closed-loop solution.
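The construction amounts to running the closed-loop policy once along the known inflows and recording the controls. The following Python sketch illustrates it (hypothetical names, t₀ = 0 assumed); it is not the Scilab macro requested above.

```python
def dyn(s, u, a, s_min=1, s_max=60):
    """Dam dynamics (1)."""
    return max(s_min, min(s_max, s - u + a))


def open_loop_from_policy(policy, inflows, s0):
    """Record the controls produced by policy(t, s) along the scenario."""
    s, plan = s0, []
    for t, a in enumerate(inflows):
        u = policy(t, s)
        plan.append(u)
        s = dyn(s, u, a)
    return plan


def as_policy(plan):
    """View an open-loop plan as a (state-independent) closed-loop policy."""
    return lambda t, s: plan[t]
```

Replaying `as_policy(plan)` on the same scenario and initial stock reproduces the same controls, which is the equivalence stated in Question 9.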

However, one can make an error in one's prediction of inflows or prices, and open-loop control may suffer from this. To represent this, we proceed as follows.

We simulate a scenario of prices and inflows.
We determine the optimal closed-loop strategy via Dynamic Programming.
We determine the associated optimal open-loop strategy.
We test both strategies on the original scenario.
We slightly modify the original scenario (keep in mind that all inflows must be integers).
We test both strategies on the modified scenario.

The “slight” modification of the original scenario must be simple and well understood. Thus we should change either the price or the inflow, at a few times only. However, the size of the modification can be substantial.

Question 10 Write a Scilab macro comparison_openVSclosed_loop that implements this procedure, and test it. Are there any differences in value and stock trajectories for the original scenario? For the modified scenario? Why?

In the same macro, compute the optimal strategy for the modified scenario and compare the results of the open-loop and closed-loop strategies derived from the original scenario to the optimal result of the modified scenario.

Comment on the pros and cons of closed-loop strategies versus open-loop strategies (in a deterministic setting).

5 Optimization in a stochastic setting

In Sections 3 and 4, we performed simulation and optimization on a single scenario. However, water inflows and prices are uncertain, and we will now take that into account.

5.1 Probabilistic model on water inputs and expected criterion

We suppose that the sequences of uncertainties $(a_{t_0}, \dots, a_{T-1})$ and $(p_{t_0}, \dots, p_{T-1})$ are discrete random variables with known probability distributions. Moreover, we assume that a(⋅) and p(⋅) are independent, and that each of them is a sequence of independent random variables.

Notice that the random variables $(a_{t_0}, \dots, a_{T-1})$ are independent, but not necessarily identically distributed. This allows us to account for seasonal effects (more rain in autumn and winter).

To each strategy ϕ, we associate the expected payoff

$$\mathbb{E}\big[\mathrm{Crit}^{\phi}(t_0, s_0)\big] = \mathbb{E}\Big[\sum_{t=t_0}^{T-1} p_t \min\{s_t + a_t, u_t\} + K(s_T)\Big]. \tag{11}$$

This expected payoff will be estimated by a Monte-Carlo approach. To do so, we will use the macros Price and Inflows, which generate a table of random trajectories of the noise, each row being one scenario. The expected payoff of a strategy is estimated as the empirical mean of the payoff over these scenarios. To compare two strategies, we must use the same scenarios for the Monte-Carlo estimation. Thus, we fix a set of simulation scenarios $(\omega_i)_{i \in [[1,N]]}$, where $\omega_i = \{p_1^i, a_1^i, \dots, p_T^i, a_T^i\}$, and we will always evaluate the criterion $\mathbb{E}[\mathrm{Crit}^{\phi}]$ as $\frac{1}{N} \sum_{i=1}^{N} \mathrm{Crit}^{\phi}(\omega_i)$.
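The Monte-Carlo estimate is just the empirical mean of the criterion (5) over a fixed list of scenarios. A minimal Python sketch, with hypothetical names and a scenario format (a pair of lists, inflows then prices) chosen for the example, could read:

```python
def dyn(s, u, a, s_min=1, s_max=60):
    """Dam dynamics (1)."""
    return max(s_min, min(s_max, s - u + a))


def payoff(policy, inflows, prices, s0, final_value=lambda s: 0.0):
    """Criterion (5) of a closed-loop policy along one scenario."""
    s, total = s0, 0.0
    for t, (a, p) in enumerate(zip(inflows, prices)):
        u = policy(t, s)
        total += p * min(s + a, u)
        s = dyn(s, u, a)
    return total + final_value(s)


def monte_carlo_payoff(policy, scenarios, s0):
    """Empirical mean of the payoff over a fixed list of scenarios.

    Each scenario is a pair (inflows, prices); reusing the same list for
    every policy makes the comparison between policies fair.
    """
    gains = [payoff(policy, a, p, s0) for a, p in scenarios]
    return sum(gains) / len(gains), gains
```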

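Consequently, the problem is now written as

$$\max_{\phi} \; \mathbb{E}\Big[\sum_{t=t_0}^{T-1} p_t \min\{s_t + a_t, u_t\} + K(s_T)\Big] \tag{12}$$

subject to $s_{t_0} = s_0$, $s_{t+1} = \mathrm{dyn}(s_t, u_t, a_t)$ and $u_t = \phi(t, s_t) \in \mathbb{U}$ for all t ∈ [[t₀,T − 1]].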

5.2 Simulation of strategies in a stochastic setting

Here, we will use the macros simulation and simulation_Bellman that simulate a strategy on each scenario giving a vector of gains, as well as a matrix of stock and control trajectories.

Question 11 As in Question 3, test the constant strategies and compare the results.

5.3 Open-loop control of the dam in a probabilistic setting

We have seen that, in the deterministic case (without any forecast error), an open-loop strategy is equivalent to a closed-loop strategy. Thus, in a probabilistic setting, one may be tempted to determine an optimal open-loop strategy.

In a first part, we will work on a mean scenario to derive an open-loop strategy.

Question 12 Complete the macro simu_mean_scenario, using the macros from the deterministic study, to compute the optimal strategy for the mean scenario.

In a second part, we compute the best open-loop strategy using Scilab's built-in function optim. We choose a set of optimization scenarios $(\omega'_i)_{i \in [[1,N_{mc}]]}$, where $\omega'_i = \{p'^i_1, a'^i_1, \dots, p'^i_T, a'^i_T\}$ (note that this set of scenarios is fixed and that it is different from the set of simulation scenarios). Then we construct a cost function J(u) as

$$J(u) := \frac{1}{N_{mc}} \sum_{i=1}^{N_{mc}} \mathrm{Crit}(u)(\omega'_i),$$

where u is a vector of T − t₀ variables representing the control schedule. Thus we have

$$\mathrm{Crit}(u)(\omega'_i) = \sum_{t=t_0}^{T-1} p'^i_t \min\{s^i_t + a'^i_t, u_t\} + K(s^i_T),$$

with

$$s^i_{t_0} = s_0, \qquad s^i_{t+1} = \mathrm{dyn}(s^i_t, u_t, a'^i_t).$$
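The function J only requires simulating the same plan u on each optimization scenario and averaging. The Python sketch below evaluates J(u); it does not perform the optimization itself (done in the course with Scilab's optim through the macro best_open_loop), and its names and scenario format are assumptions.

```python
def dyn(s, u, a, s_min=1, s_max=60):
    """Dam dynamics (1)."""
    return max(s_min, min(s_max, s - u + a))


def crit_open_loop(plan, inflows, prices, s0, final_value=lambda s: 0.0):
    """Payoff of the open-loop plan u on one scenario (inflows, prices)."""
    s, total = s0, 0.0
    for u, a, p in zip(plan, inflows, prices):
        total += p * min(s + a, u)
        s = dyn(s, u, a)
    return total + final_value(s)


def J(plan, scenarios, s0):
    """Average payoff of the plan over the optimization scenarios."""
    return sum(crit_open_loop(plan, a, p, s0) for a, p in scenarios) / len(scenarios)
```

Two candidate plans can then be ranked by comparing their values of J on the same scenario set.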

Question 13 Use the macro best_open_loop to obtain the best possible open-loop strategy. Test it and compare it to the strategy obtained for the mean scenario. You will consider the simulation of both strategies on the optimization scenarios and on the simulation scenarios.

5.4 Stochastic Dynamic Programming Equation

5.4.1 Decision-Hazard framework

We now focus on finding an optimal closed-loop solution for problem (12). The dynamic programming equation associated with the problem of maximizing the expected profit is

$$
\begin{cases}
V(T, s) = \underbrace{K(s)}_{\text{final profit}}, \\[4pt]
V(t, s) = \max\limits_{0 \le u \le \bar{u}} \mathbb{E}\Big[ \underbrace{p_t \min\{s + a_t, u\}}_{\text{instantaneous profit}} + V\big(t + 1, \underbrace{\mathrm{dyn}(s, u, a_t)}_{\text{future stock level}}\big) \Big].
\end{cases}
\tag{16}
$$
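Equation (16) can be sketched as a backward loop in which the expectation is a finite sum over the discrete laws of a_t and p_t. The Python code below is an illustration with K = 0, not the Scilab function DP; the representation of each law as a list of (value, probability) pairs and the function name are assumptions.

```python
def dyn(s, u, a, s_min=1, s_max=60):
    """Dam dynamics (1)."""
    return max(s_min, min(s_max, s - u + a))


def dp_decision_hazard(inflow_laws, price_laws, s_min=1, s_max=60, u_max=10):
    """Solve (16) with K = 0: the control is chosen BEFORE a_t and p_t are seen.

    inflow_laws[t] and price_laws[t] are lists of (value, probability)
    pairs; inflows and prices are independent.
    """
    T = len(inflow_laws)
    states = range(s_min, s_max + 1)
    V = {s: 0.0 for s in states}
    policy = []
    for t in reversed(range(T)):
        V_now, pol = {}, {}
        for s in states:
            best_u, best_val = 0, float("-inf")
            for u in range(u_max + 1):
                val = 0.0                      # expectation for this control
                for a, pa in inflow_laws[t]:
                    for p, pp in price_laws[t]:
                        val += pa * pp * (p * min(s + a, u)
                                          + V[dyn(s, u, a, s_min, s_max)])
                if val > best_val:
                    best_u, best_val = u, val
            V_now[s], pol[s] = best_val, best_u
        V = V_now
        policy.insert(0, pol)
    return V, policy    # V is the value function at the initial time
```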

Question 14 Complete the function DP, which solves the dynamic programming equation (we consider that K = 0).

Then write a macro simulation_Bellman_DH that simulates the optimal strategy on a set of simulation scenarios.

Plot a histogram of the payoffs and plot the evolution of the stock levels. Compare the gains obtained with this strategy to those of the open-loop strategy derived from the mean scenario. You can also compare this strategy to the optimal open-loop strategy.

5.4.2 Hazard-Decision framework

One may note that, in practice, the dam manager often assumes that the weekly inflows and prices are perfectly known. Indeed, at the beginning of the week, meteorologists and economists can give some predictions. Moreover, this problem is only an approximation of the real one, as a dam is managed hourly rather than weekly; thus the manager has more information than what we assume in a Decision-Hazard setting. Consequently, we now slightly change problem (12) by assuming that, at each time step t, we know the price p_t and the inflow a_t.

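Problem (12) is turned into

$$\max_{\phi} \; \mathbb{E}\Big[\sum_{t=t_0}^{T-1} p_t \min\{s_t + a_t, u_t\} + K(s_T)\Big] \tag{17}$$

subject to $s_{t_0} = s_0$, $s_{t+1} = \mathrm{dyn}(s_t, u_t, a_t)$ and $u_t = \phi(t, s_t, p_t, a_t) \in \mathbb{U}$, so that the control may now depend on the current price and inflow.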

Question 15 Write a macro DP_HD that solves problem (17) in a hazard-decision setting. Test it and compare it to the solution from the decision-hazard setting (Question 14).
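In the hazard-decision setting, the maximization over u moves inside the expectation, since the control is chosen after p_t and a_t are observed. The Python sketch below illustrates this exchange with K = 0; it is not the Scilab macro DP_HD, and the (value, probability) representation of the laws and the function name are assumptions.

```python
def dyn(s, u, a, s_min=1, s_max=60):
    """Dam dynamics (1)."""
    return max(s_min, min(s_max, s - u + a))


def dp_hazard_decision(inflow_laws, price_laws, s_min=1, s_max=60, u_max=10):
    """Value function at the initial time when u_t is chosen AFTER
    observing (a_t, p_t): V(t, s) = E[ max_u ( ... ) ], with K = 0."""
    T = len(inflow_laws)
    states = range(s_min, s_max + 1)
    V = {s: 0.0 for s in states}
    for t in reversed(range(T)):
        V_now = {}
        for s in states:
            total = 0.0
            for a, pa in inflow_laws[t]:
                for p, pp in price_laws[t]:
                    # best control for this realization of (a, p)
                    best = max(p * min(s + a, u) + V[dyn(s, u, a, s_min, s_max)]
                               for u in range(u_max + 1))
                    total += pa * pp * best
            V_now[s] = total
        V = V_now
    return V
```

On a two-period toy instance with an uncertain first price (1 or 3 with equal probability) and a second price of 2, this value is 13.5 from a stock of 5, against 12 when the control must be chosen before the price is known: the extra information can only help.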

5.5 (Anticipative) upper bound for the payoff

The choice of the probabilistic model of noises (prices and inflows) is quite important. Until now, we have represented the noises as independent variables, and this is not the most precise probabilistic model we could have used. Consequently, we might want to estimate the potential gain from using a more precise (but numerically less tractable) probabilistic model. Thus we would like to have an upper bound for our problem. Such an upper bound can be found by an anticipative study: for each scenario, we compute the best possible gain on this scenario.

Let us stress that this does not give a strategy that can be used. It only gives, a posteriori, an upper bound on the possible gain for a set of simulation scenarios.

Question 16 Write a macro Simu_anticipative that computes, for each scenario, the upper bound given by the deterministic optimization. Compare the results obtained by the different strategies with this upper bound.
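The anticipative bound solves one deterministic problem per scenario and averages the optimal values. The Python sketch below illustrates this with K = 0 and integer volumes; it is not the Scilab macro Simu_anticipative, and its names and scenario format (a pair of lists, inflows then prices) are assumptions.

```python
def dyn(s, u, a, s_min=1, s_max=60):
    """Dam dynamics (1)."""
    return max(s_min, min(s_max, s - u + a))


def best_payoff(inflows, prices, s0, s_min=1, s_max=60, u_max=10):
    """Optimal payoff on ONE fully known scenario (K = 0), by backward DP."""
    states = range(s_min, s_max + 1)
    V = {s: 0.0 for s in states}
    for a, p in zip(reversed(inflows), reversed(prices)):
        # the comprehension reads the old V, then V is rebound to the new one
        V = {s: max(p * min(s + a, u) + V[dyn(s, u, a, s_min, s_max)]
                    for u in range(u_max + 1))
             for s in states}
    return V[s0]


def anticipative_bound(scenarios, s0):
    """Mean over scenarios of the best payoff achievable with hindsight."""
    gains = [best_payoff(a, p, s0) for a, p in scenarios]
    return sum(gains) / len(gains), gains
```

Because each scenario is optimized with full knowledge of its future, no implementable strategy evaluated on the same scenarios can exceed this mean.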

6 Correction

dam_management2_correction.sce
