
Optimal Dam Management (Deterministic and Under Uncertainty)

Michel De Lara and Vincent Leclère
(last modiﬁcation date: October 10, 2017)

1 Prerequisite Scilab code

You will have to fill in the following skeleton of Scilab code during the practical course.

2 Problem statement

We consider a dam manager intending to maximize the intertemporal payoff obtained by selling power produced by water releases.

2.1 Dam dynamics

Consider the dam dynamics st+1 = dyn(t, st, ut, at), where

st+1 = min{ s̄, st + at − min{st + at, ut} },  t ∈ [[t0, T − 1]],  (1)

where

• time t ∈ [[t0,T]] is discrete (such as days, weeks or months),
• st is the stock level (in water volume) at the beginning of period [t,t + 1[, belonging to 𝕊 = [[s̲,s̄]], where s̲ and s̄ are the minimum and maximum volumes of water in the dam,
• ut is the control decided at the beginning of period [t,t + 1[, belonging to the control set 𝕌 (it can be seen as the duration during which the turbine is open, and the effective water release is min{st + at,ut}, necessarily less than the water st + at in the dam at the moment of turbinating),
• at is the water inflow (rain, hydrology, etc.) during the period [t,t + 1[.
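The dynamics (1) can be sketched as follows (a Python stand-in for the Scilab macro dyn used during the course; the parameter s_max plays the role of the maximum volume s̄, and the function name is illustrative):

```python
def dam_dynamics(s, u, a, s_max):
    """One step of the dam dynamics (1).

    s: stock at the beginning of period [t, t+1[
    u: turbinating decision; the effective release is min(s + a, u)
    a: water inflow during the period
    s_max: maximum volume of the dam (any excess is spilled)
    """
    release = min(s + a, u)             # cannot release more water than is available
    return min(s_max, s + a - release)  # spill anything above the maximum volume
```

For instance, with stock 5, decision 3, inflow 2 and maximum volume 10, the effective release is min(7, 3) = 3 and the next stock is 4.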

2.2 Intertemporal payoﬀ criterion

We consider a problem of payoﬀ maximization where turbining one unit of water has unitary price pt. On the period from t0 to T, the payoﬀs sum up to

∑_{t=t0}^{T−1} pt min{st + at, ut} + K(sT),  (2)

where K is the ﬁnal valorization of the water in the dam.

2.3 Uncertainties and scenarios

Both the inﬂows at and the prices pt are uncertain variables. We denote by wt := (at,pt) the couple of uncertainties at time t. A scenario

w(⋅) := (w_{t0}, w_{t0+1}, …, w_{T−1})  (3)

is a sequence of uncertainties, inﬂows and prices, from initial time t0 up to the horizon T.

2.4 Generation of trajectories and payoﬀ by a policy

A policy ϕ : [[t0,T − 1]] × 𝕊 → 𝕌 assigns a control to any state s of dam stock volume and to any time t ∈ [[t0,T − 1]].

Given a policy ϕ and a scenario w() as in (3), we obtain a volume trajectory and a control trajectory produced by the “closed-loop” dynamics

ut = ϕ(t, st),  st+1 = min{ s̄, st + at − min{st + at, ut} },  t ∈ [[t0, T − 1]]  (4)

Plugging the trajectories s(⋅) and u(⋅) given by (4) into the criterion (2), we obtain the evaluation

Critϕ(w(⋅)) := ∑_{t=t0}^{T−1} pt min{st + at, ut} + K(sT)  (5)
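The closed-loop generation (4) and the evaluation (5) fit in a single time loop. Here is an illustrative Python sketch (the course itself uses the Scilab macro simulation_det; the function below, with K defaulting to zero, is only an assumed stand-in):

```python
def simulate(policy, scenario, s0, s_max, K=lambda s: 0.0):
    """Run the closed-loop dynamics (4) and accumulate the payoff (5).

    policy:   function (t, s) -> u
    scenario: list of (a_t, p_t) pairs for t = t0, ..., T-1
    Returns (payoff, stock trajectory, control trajectory).
    """
    s = s0
    stocks, controls, payoff = [s0], [], 0.0
    for t, (a, p) in enumerate(scenario):
        u = policy(t, s)                # closed-loop decision u_t = phi(t, s_t)
        release = min(s + a, u)         # effective water release
        payoff += p * release           # instantaneous gain p_t * min(s_t + a_t, u_t)
        s = min(s_max, s + a - release)
        stocks.append(s)
        controls.append(u)
    return payoff + K(s), stocks, controls
```

Note that the stock trajectory has one more entry than the control trajectory, as in Question 2 below.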

2.5 Numerical data

We consider a weekly management over a year, that is t0 = 0 and T = 52, with

 (6)

3 Evaluation of a strategy

We begin by generating a scenario of inﬂows and prices as follows.

Question 1 Download and execute in Scilab the ﬁle dam_management2.sce, so that a certain number of macros are now available. Generate one scenario by calling the macro PricesInflows with the argument set to 1 (corresponding to a single scenario generation).

Then, we are going to implement the code corresponding to §2.4 under the form of a macro simulation_det whose input arguments are a scenario and a policy (itself given under the form of a macro with two inputs and one output).

Question 2 Complete the Scilab macro simulation_det by implementing a time loop from initial time t0 up to the horizon T. Within this loop, follow the dynamics (4) and use formula (5) to compute the payoff. The outputs of this macro are the gain (5) (a scalar), and the state and control trajectories (vectors of sizes T − t0 + 1 and T − t0) given by (4).

This done, we are going to test the above macro simulation_det with simple strategies.

Question 3 Using the macro strat_constant, compute the payoff and the state and control trajectories attached to the scenario generated in Question 1. Plot the evolution of the stock levels as a function of time. Design other policies, like, for instance, the constant strategies (ut = k for k ∈ [[Dmin,Dmax]]) and the myopic strategy consisting in maximizing the instantaneous profit pt min{st + at, ut}. Compare the payoffs given by these different strategies.
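As an illustration of such a comparison, constant strategies can be ranked on a toy scenario (a self-contained Python sketch with made-up data, not the Scilab macro strat_constant):

```python
def run(policy, scenario, s0, s_max):
    """Closed-loop simulation of (4), returning the total payoff (K = 0)."""
    s, payoff = s0, 0.0
    for t, (a, p) in enumerate(scenario):
        u = policy(t, s)
        release = min(s + a, u)
        payoff += p * release
        s = min(s_max, s + a - release)
    return payoff

# Toy scenario of (inflow, price) pairs; compare the constant strategies u_t = k.
scenario = [(2, 1), (1, 3), (0, 2)]
payoffs = {k: run(lambda t, s, k=k: k, scenario, s0=2, s_max=6) for k in range(4)}
```

On this toy data, turbining nothing earns nothing, while turbining too much early empties the dam before the high-price period; an intermediate constant strategy does best.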

4 Optimization in a deterministic setting

In a deterministic setting, we consider that the sequences of prices p(⋅) and inflows a(⋅) are known, and we optimize accordingly. The optimization problem we consider is

max_ϕ Critϕ(w(⋅)) = max_ϕ [ ∑_{t=t0}^{T−1} pt min{st + at, ϕ(t, st)} + K(sT) ],  (7)

where the trajectories are generated by the closed-loop dynamics (4). The theoretical Dynamic Programming equation is

V_T(s) = K(s),
V_t(s) = max_{u ∈ 𝕌} [ pt min{s + at, u} + V_{t+1}( min{s̄, s + at − min{s + at, u}} ) ],  t ∈ [[t0, T − 1]].  (10)
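The backward recursion (10) can be sketched in Python on an integer stock grid (an illustrative stand-in for the Scilab macro optim_det; the grid bounds and names are assumptions):

```python
def dp_det(scenario, s_max, u_max, K):
    """Backward Dynamic Programming for the deterministic problem (7).

    scenario: list of (a_t, p_t) for t = t0, ..., T-1, known in advance
    States are integer stock levels 0..s_max, controls 0..u_max.
    Returns the Bellman values V[t][s] and an optimal policy phi[t][s].
    """
    T = len(scenario)
    V = [[0.0] * (s_max + 1) for _ in range(T + 1)]
    phi = [[0] * (s_max + 1) for _ in range(T)]
    V[T] = [K(s) for s in range(s_max + 1)]            # final valorization of water
    for t in reversed(range(T)):
        a, p = scenario[t]
        for s in range(s_max + 1):
            best_val, best_u = float("-inf"), 0
            for u in range(u_max + 1):
                release = min(s + a, u)
                s_next = min(s_max, s + a - release)
                val = p * release + V[t + 1][s_next]   # Bellman equation (10)
                if val > best_val:
                    best_val, best_u = val, u
            V[t][s], phi[t][s] = best_val, best_u
    return V, phi
```

The optimal payoff from initial stock s0 is then V[0][s0], and phi can be fed to a closed-loop simulator.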

4.1 Zero ﬁnal value of water

Here, we ﬁx the ﬁnal value K of water in problem (7) to 0. This means that the water remaining in the dam at time T presents no economic interest for the manager.

Question 4 Write the theoretical Dynamic Programming equation attached to Problem (7).

Then

• complete the Scilab macro optim_det that computes the Bellman value of this problem,
• complete the Scilab macro simulation_det_Bellman which constructs the optimal strategy given a Bellman’s Value function,
• simulate the stock trajectory using the macro simulation_det,
• plot the evolution of the water levels, of the prices and of the controls.

What can you say about the level of water at the end of the period? Can you explain why?

Question 5 Theoretically, what other mathematical methods could have been used to solve the dynamic optimization problem (7)?

4.2 Determining the ﬁnal value of water

We are optimizing the dam management over one year. However, at the end of this year, the dam manager will still have to manage the dam; thus, the more water remains in the dam at the end, the better for the next year. The question is how to determine the value of this remaining water.

The main idea is the following: we want to optimize the management of our dam over a very long time span, but we would like to actually solve the problem only on the first year, representing the remaining years by the final value function K in (7). Thus K(s) should represent how much we are going to earn during the remaining time, starting at state s.

Question 6 Consider the optimal strategy ϕN obtained when we solve problem (7) on N years, with zero final value (K = 0). Using the Dynamic Programming Principle, find the theoretical function KN such that the restriction of the strategy ϕN to the first year is optimal for the one-year problem (Problem (7)) with final value K = KN.

Thus, choosing the final value K = KN means that we take into consideration the gains over N years. We would like to let N go to infinity; however, KN+1 − KN is more or less the gain during one year, so KN will not converge. In the following question we will find a way of determining a final value, converging with N, that represents the problem over a long time.

Question 7 Consider the optimal control problem (7) with final value K, and the same problem with final value K + c, where c is a constant. What can you say about their optimal strategies? About their optimal values?

If K is the value of the remaining water, what should be the value of K(s) (in the sense: how much is the future manager of the dam ready to pay for you to keep the water above the minimum level in the dam)?

How do you understand the macro final_value_det? Test it and comment on it. Plot the final value obtained as a function of the stock level.

4.3 Introducing a constraint on the water level during summer months

For environmental and touristic reasons, the level of water in the dam is constrained. We require that, during the summer months (weeks 25 to 40), the level of water in the dam stays above a minimal level s.

Question 8 Recall that a constraint can be integrated into the cost function: whenever the constraint is violated, the payoff is set to −∞ (equivalently, the cost is infinite).

Create a Scilab macro optim_det_constrained to integrate this constraint.

Compare the evolution (for different minimal levels s) of the stock trajectories and optimal values. What can you say about it? What should you do in order to compute a final value of water adapted to the problem with constraints?

4.4 Closed-loop vs open-loop control

A closed-loop strategy is a policy ϕ : [[t0,T − 1]] × 𝕊 → 𝕌, which assigns a volume of turbined water to any state s of dam stock volume and to any decision period t ∈ [[t0,T − 1]], whereas an open-loop strategy is a predetermined planning of controls, that is, a function ϕ : [[t0,T − 1]] → 𝕌.

Let us note that, formally, an open-loop strategy is a particular case of a closed-loop strategy (one that does not depend on the state).

Question 9 In a deterministic setting, show that a closed-loop strategy is equivalent to an open-loop strategy in the sense that, for a given initial stock s0, the stock and control trajectories of (4) will be the same.

Write a Scilab macro that constructs an optimal open-loop strategy from the optimal closed-loop solution.
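A sketch of such a construction (in Python rather than Scilab, with an assumed representation of the closed-loop policy as a function (t, s) ↦ u): rolling the closed-loop policy forward on the known scenario and recording the controls yields the open-loop planning.

```python
def open_loop_from_closed_loop(phi, scenario, s0, s_max):
    """Roll the closed-loop policy phi forward on a known scenario and
    record the controls. On this scenario, the resulting open-loop
    planning reproduces exactly the trajectories of phi."""
    s, plan = s0, []
    for t, (a, p) in enumerate(scenario):
        u = phi(t, s)                   # closed-loop decision at the visited state
        plan.append(u)                  # freeze it into the open-loop planning
        release = min(s + a, u)
        s = min(s_max, s + a - release)
    return plan
```

On a different scenario, however, the frozen planning no longer reacts to the actual stock, which is the point of the comparison below.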

However, one can make errors when predicting inflows or prices, and open-loop control may suffer from this. In order to represent this, we proceed in the following way.

1. We simulate a scenario of prices and inﬂows.
2. We determine the optimal closed-loop strategy via Dynamic Programming.
3. We determine the associated optimal open-loop strategy.
4. We test both strategies on the original scenario.
5. We modify slightly the original scenario (keep in mind that all inflows must be integers).
6. We test both strategies on the modiﬁed scenario.

The “slight” modiﬁcation of the original scenario must be simple and well understood. Thus we should change either the price or the inﬂow, at a few times only. However the size of the modiﬁcation can be substantial.

Question 10 Write a Scilab macro comparison_openVSclosed_loop that implements this procedure, and test it. Are there any differences in value and stock trajectories for the original scenario? Are there any differences in value and stock trajectories for the modified scenario? Why?

In the same macro, compute the optimal strategy for the modiﬁed scenario and compare the results of the open-loop and closed-loop strategies derived from the original scenario to the optimal result of the modiﬁed scenario.

Comment on the pros and cons of closed-loop strategies against open-loop strategies (in a deterministic setting).

5 Optimization in a stochastic setting

In Section 3, we carried out optimization and simulation on a single scenario. However, water inflows and prices are uncertain, and we will now take that into account.

5.1 Probabilistic model on water inputs and expected criterion

We suppose that the sequences of uncertainties a(⋅) and p(⋅) are discrete random variables with known probability distributions. Moreover, we assume that a(⋅) and p(⋅) are independent, and that each of them is a sequence of independent random variables.

Notice that the random variables are independent, but that they are not necessarily identically distributed. This allows us to account for seasonal eﬀects (more rain in autumn and winter).

To each strategy ϕ, we associate the expected payoﬀ

E[Critϕ(w(⋅))] = E[ ∑_{t=t0}^{T−1} pt min{st + at, ϕ(t, st)} + K(sT) ]  (11)

This expected payoff will be estimated by a Monte Carlo approach. To do so, we will use the macros Price and Inflows, which generate a table of random trajectories of the noise, each line being one scenario. The expected payoff of a strategy will be estimated as the empirical mean of the payoff over these scenarios. In order to compare two strategies, we have to use the same scenarios for the Monte Carlo estimation. Thus, we fix a set of simulation scenarios (ωi)_{i ∈ [[1,N]]}, where ωi = (p_1^i, a_1^i, …, p_T^i, a_T^i), and we will always evaluate the criterion E[Critϕ] as (1/N) ∑_{i=1}^{N} Critϕ(ωi).
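The empirical-mean evaluation just described can be sketched as follows (an illustrative Python stand-in for the Scilab simulation macros; scenarios are lists of (a_t, p_t) pairs and K = 0):

```python
def expected_payoff(policy, scenarios, s0, s_max):
    """Estimate E[Crit_phi] as the empirical mean of the payoff over a
    fixed set of simulation scenarios. Using the same scenarios for
    every policy makes the comparison of strategies fair."""
    total = 0.0
    for scenario in scenarios:
        s, payoff = s0, 0.0
        for t, (a, p) in enumerate(scenario):
            u = policy(t, s)
            release = min(s + a, u)
            payoff += p * release
            s = min(s_max, s + a - release)
        total += payoff
    return total / len(scenarios)
```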

Consequently, the problem is now written as

max_ϕ E[ ∑_{t=t0}^{T−1} pt min{st + at, ϕ(t, st)} + K(sT) ].  (12)

5.2 Simulation of strategies in a stochastic setting

Here, we will use the macros simulation and simulation_Bellman, which simulate a strategy on each scenario, giving a vector of gains as well as matrices of stock and control trajectories.

Question 11 As in Question 3, test the constant strategies and compare the results.

5.3 Open-loop control of the dam in a probabilistic setting

We have seen that, in the deterministic case (without any prediction errors), an open-loop strategy is equivalent to a closed-loop strategy. Thus, in a probabilistic setting, one can be tempted to determine an optimal open-loop strategy.

In a ﬁrst part, we will work on a mean scenario to derive an open-loop strategy.

Question 12 Complete the macro simu_mean_scenario, using the macros from the deterministic study, to compute the optimal strategy for the mean scenario.

In a second part, we compute the best open-loop strategy using the built-in Scilab function optim. We choose a set of optimization scenarios (ωi)_{i ∈ [[1,Nmc]]}, where ωi = (p_1^i, a_1^i, …, p_T^i, a_T^i) (let us note that this set of scenarios is fixed, and that it is different from the set of simulation scenarios). Then we construct a cost function J(u) as

J(u) = (1/Nmc) ∑_{i=1}^{Nmc} Crit_u(ωi),

where u is a vector of T − 1 variables representing the planning of controls, and Crit_u(ωi) is the payoff (5) obtained by applying the planning u on scenario ωi. The best open-loop strategy is then obtained by maximizing J over u.

Question 13 Use the macro best_open_loop to obtain the best possible open-loop strategy. Test it and compare it to the strategy obtained for the mean scenario. You will consider the simulation of both strategies on the optimization scenarios and on the simulation scenarios.

5.4 Stochastic Dynamic Programming Equation

5.4.1 Decision-Hazard framework

We now focus on finding an optimal closed-loop solution to problem (12). The Dynamic Programming equation associated to the problem of maximizing the expected profits is

V_T(s) = K(s),
V_t(s) = max_{u ∈ 𝕌} E[ pt min{s + at, u} + V_{t+1}( min{s̄, s + at − min{s + at, u}} ) ],  t ∈ [[t0, T − 1]],  (16)

where the expectation is taken with respect to wt = (at, pt).
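Here is a Python sketch of the backward recursion (16) over a discrete noise distribution (an illustrative stand-in for the function DP; the representation of the noise as outcome/probability pairs is an assumption). The max over u sits outside the expectation, since the decision is taken before observing w_t:

```python
def sdp_decision_hazard(noise, s_max, u_max, K):
    """Stochastic DP in the decision-hazard setting (16).

    noise[t]: list of ((a, p), probability) pairs for t = t0, ..., T-1,
    the discrete distribution of the independent noise w_t = (a_t, p_t).
    States are integer stock levels 0..s_max, controls 0..u_max.
    """
    T = len(noise)
    V = [[0.0] * (s_max + 1) for _ in range(T + 1)]
    V[T] = [K(s) for s in range(s_max + 1)]
    phi = [[0] * (s_max + 1) for _ in range(T)]
    for t in reversed(range(T)):
        for s in range(s_max + 1):
            best_val, best_u = float("-inf"), 0
            for u in range(u_max + 1):
                ev = 0.0
                for (a, p), prob in noise[t]:      # expectation over w_t
                    release = min(s + a, u)
                    s_next = min(s_max, s + a - release)
                    ev += prob * (p * release + V[t + 1][s_next])
                if ev > best_val:
                    best_val, best_u = ev, u
            V[t][s], phi[t][s] = best_val, best_u
    return V, phi
```

With a degenerate (single-outcome) noise, this reduces to the deterministic recursion (10).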

Question 14 Complete the function DP that solves the Dynamic Programming equation (16) (we consider that K = 0).

Then write a macro simulation_Bellman_DH that simulates the optimal strategy on a set of simulation scenarios.

Plot a histogram of the payoffs and plot the evolution of the stock levels. Compare the gains obtained with this strategy to the open-loop strategy derived from the mean scenario. You can also compare this strategy to the optimal open-loop strategy.

5.4.2 Hazard-Decision framework

One may note that, in practice, the dam manager often assumes that the weekly inflows and prices are perfectly known. Indeed, at the beginning of the week, meteorologists and economists can give predictions. Moreover, this problem is only an approximation of the real one, as a dam is managed per hour and not per week; thus the manager has more information than what we assume in a decision-hazard setting. Consequently, we now change slightly problem (12) by assuming that, at each time step t, we know the price pt and the inflow at before choosing ut.

Problem (12) is turned into

max_ϕ E[ ∑_{t=t0}^{T−1} pt min{st + at, ϕ(t, st, wt)} + K(sT) ],  (17)

where the policy ϕ may now depend on the current noise wt = (at, pt).

Question 15 Write a macro DP_HD that solves problem (17) in a hazard-decision setting. Test it and compare with the solution from the decision-hazard setting (Question 14).
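For comparison, here is a Python sketch of the hazard-decision recursion (an illustrative stand-in for DP_HD, under the same assumed noise representation as before): the only change is that the max over u moves inside the expectation, since (a_t, p_t) is observed before deciding, so the value can only be larger than in (16).

```python
def sdp_hazard_decision(noise, s_max, u_max, K):
    """Stochastic DP in the hazard-decision setting: at time t the noise
    w_t = (a_t, p_t) is observed before choosing u, so the max over u
    sits inside the expectation.

    noise[t]: list of ((a, p), probability) pairs for t = t0, ..., T-1.
    """
    T = len(noise)
    V = [[0.0] * (s_max + 1) for _ in range(T + 1)]
    V[T] = [K(s) for s in range(s_max + 1)]
    for t in reversed(range(T)):
        for s in range(s_max + 1):
            ev = 0.0
            for (a, p), prob in noise[t]:
                best = float("-inf")
                for u in range(u_max + 1):   # decision taken knowing (a, p)
                    release = min(s + a, u)
                    s_next = min(s_max, s + a - release)
                    best = max(best, p * release + V[t + 1][s_next])
                ev += prob * best            # expectation of the optimized value
            V[t][s] = ev
    return V
```

For instance, with two equally likely prices and a final value K(s) = 2s, observing the price before turbinating strictly improves on the decision-hazard value.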

5.5 (Anticipative) upper bound for the payoﬀ

The choice of the probabilistic model of the noises (prices and inflows) is quite important. Until now, we have represented the noises as independent random variables, and this is not the most precise probabilistic model we could have used. Consequently, we might want to estimate the potential gain from using a more precise (but numerically less tractable) probabilistic model. Thus, we would like to have an upper bound on our problem. Such an upper bound can be found by an anticipative study: for each scenario, we compute the best possible gain on this scenario.

Let us stress that this does not give a strategy that can be used. It only gives an a posteriori upper bound on the possible gain over a set of simulation scenarios.

Question 16 Write a macro Simu_anticipative that computes, for each scenario, the upper bound given by the deterministic optimization. Compare the results obtained by the different strategies with this upper bound.
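The anticipative computation can be sketched as follows (a Python illustration with K = 0, not the Scilab macro itself): for each scenario, a deterministic DP run with full knowledge of the future gives the best possible payoff, and the empirical mean of these values upper-bounds the expected payoff of any non-anticipative strategy on the same scenarios.

```python
def anticipative_bound(scenarios, s0, s_max, u_max):
    """For each scenario, compute the best possible payoff with full
    knowledge of the future (one deterministic DP per scenario), then
    average. This a posteriori mean upper-bounds any implementable
    (non-anticipative) strategy evaluated on the same scenarios."""
    bounds = []
    for scenario in scenarios:
        T = len(scenario)
        V = [0.0] * (s_max + 1)     # values V_{t+1}(.), initialized at V_T = 0
        for t in reversed(range(T)):
            a, p = scenario[t]
            # one backward Bellman step of (10) on the known scenario
            V = [max(p * min(s + a, u) + V[min(s_max, s + a - min(s + a, u))]
                     for u in range(u_max + 1))
                 for s in range(s_max + 1)]
        bounds.append(V[s0])
    return sum(bounds) / len(bounds)
```

Comparing this bound with the decision-hazard and hazard-decision values quantifies how much could, at most, be gained from a finer noise model.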

6 Correction
