EnKF.jl : Tools for data assimilation with Ensemble Kalman filter
1. Introduction
EnKF.jl
is a framework for data assimilation using ensemble Kalman filter in Julia
The objective of this package is to allow a easy and fast setup of data assimilation problems using the ensemble Kalman filter. This package provides tools for:
- constructing data structure for an ensemble of members
- applying covariance inflation (additive, multiplicative, multiplico-additive...) on these ensemble
- setting the data assimilation problem for linear/nonlinear system with linear/nonlinear measurements
1.1. References
[1]: Evensen, Geir. "The ensemble Kalman filter: Theoretical formulation and practical implementation." Ocean dynamics 53.4 (2003): 343-367.
[2]: Asch, Mark, Marc Bocquet, and Maëlle Nodet. Data assimilation: methods, algorithms, and applications. Vol. 11. SIAM, 2016.
1.2. Installation
This package works on Julia 1.0
and above and is registered in the general Julia registry. In Julia 1.0
REPL, enter the package manager by typing ]
and then type add EnKF
to install the package.
] add EnKF
EnKF.jl
is now installed
Then, in any version, type
using EnKF
The plots in this documentation are generated using Plots.jl
(http://docs.juliaplots.org/latest/). You might want to install that, too, to follow the examples.
2. Review of the Ensemble Kalman Filter
2.1. Algorithm
The ensemble Kalman filter is a Monte Carlo view of the Kalman filter applicable to nonlinear systems. Instead of propagating the exact mean and covariance of the Kalman filter. These latter are approximated using an ensemble of size
where
Using the different estimate of the ensemble member, we can approximate the a priori mean a
From the small ensemble theory, one can show that the \textit{a priori} covariance is better approximated by averaging by
However, Burgers \cite{burgers1998analysis} show we can not use the same innovation term to correct each ensemble member. Otherwise, the random character of
where for each member
The Kalman gain
2.2. Need for covariance inflation
As shown by Van Leeuwen
Fortunately, we can prove the boundedness of the EnKF even for large inflation, Kelly, et al.\cite{kelly2014well}. Whitaker and Hamill\cite{whitaker2012evaluating} have propose to combine a multiplicative and an additive inflation. In practice, we compute the sample mean , apply the covariance inflation procedure to the state (\ref{covariance_inflation}) and then compute the sample covariance. For each ensemble member, the state is transformed through:
where
3. Handling states in EnKF.jl
In EnKF
, ensemble of members are stored as a mutable structure EnsembleState{N, TS}
where N
is the number of members and TS
an arbitrary type for each member.
3.1. Setting your ensemble state
Let's see an example of creating a blank ensemble of 5 members of vector of length 10 and fill each member with a vector full of ones:
ens = EnsembleState(5, zeros(10)) fill!(ens, ones(10)) ens
EnsembleState
supports three different constructors:
- The canonical one induced by the definition of EnsembleState{N, TS}
by providing an Array{TS,1}
:
ens = EnsembleState{5,Array{Float64,1}}([rand(10) for i=1:5])
- Using EnsembleState(N::Int, u)
, this one create an ensemble of N
blankmembers where each member is set to zero(u)
:
ens = EnsembleState(5, ones(10))
- Using EnsembleState(States::Array{TS,1})
, this one create an ensemble of length length(States)`
and fill the i
-th member with States[i]
: any type of state is supported by EnKF
even your own data structures!
struct MyArray B::Array{ComplexF64,1} end A = MyArray(ones(10)+im*ones(10)) ens = EnsembleState(A.B)
3.2. Operations on EnsembleState
EnsembleState
The goal of this subsection is to present all the actions defined for EnsembleState
variables. All of them can be found in /src/state.jl
size
:
returns the number of ensemble members and the size of a member
size(ENS::EnsembleState{N, TS}) where {N, TS}
length
: returns the number of ensemble members N
of an EnsembleState
variable
length(ENS::EnsembleState{N, TS}) where {N, TS}
mean
: compute the mean of an EnsembleState
variable
""" Return the mean of all the ensemble members """ mean(ENS::EnsembleState{N, TS}) where {N, TS}
deviation
: compute the deviation from its own mean of an EnsembleState
variable
Different methods are available for this function
deviation(ENSfluc::EnsembleState{N, TS}, ENS::EnsembleState{N, TS}) where {N, TS} deviation(tabfluc::Array{TS,2}, ENS::EnsembleState{N, TS}) where {N, TS}
and the in-place version deviation!
deviation!(ENS::EnsembleState{N, TS}) where {N, TS}
hcat
: performs an horizontal concatenation of an EnsembleState
variable (extended version for EnsembleState
)
hcat(ENS::EnsembleState{N, TS}) where {N, TS}
cut
: cut an array along the different columns and create an EnsembleState
variable with these columns
cut(A::AbstractMatrix{TR}) where {TR}
fill!
: create an EnsembleState
variable filled with a vector
fill!(ENS::EnsembleState{N, TS}, A::TS) where {N, TS}
We can also extend the addition +
and subtraction -
from the module Base
to EnsembleState
elements
function (+)(A::EnsembleState{N, TS}, B::EnsembleState{N, TS}) where {N, TS}
(-)(A::EnsembleState{N, TS}, B::EnsembleState{N, TS}) where {N, TS}
For example,
a = EnsembleState(5, zeros(10)) b = EnsembleState(5, zeros(10)) fill!(a, rand(10)) a fill!(b, ones(10)) a + b a - b;
As well as adding or subtracting a vector to each member of an EnsembleState
variable
a - 2*ones(10)
4. Inflation in EnKF.jl
Due to the covariance collapse (see section 2.2 for a reminder), the state vector need to be inflated.
EnKF.jl
is equipped with several classes of inflation: identity inflation, additive inflation, multiplicative inflation, multiplico-additive inflation. Others inflations such as RPRS will be added soon.
Each type of inflation is defined as a subtype of the abstract type InflationType
. For each type of inflation, we define an associated structure which contains its parameters and its action on an EnsembleState
object.
For a subtype MyInflation
which does
""" MyInflation An type to store MyInflation inflation : Define MyInflation inflation: x <- β*x + β^3 with β a scalar # Fields: - 'β' : MyInflation inflation factor """ mutable struct MyInflation <: InflationType β :: Real end """ Define action of MyInflation : x <- β*x + β^3 """ function (A::MyInflation)(ENS::EnsembleState{N, TS}) where {N, TS} for s in ENS.S s .*= β s .+= β^3 end return ENS end
4.1. Identity inflation
An identity inflation is of type IdentityInflation
returns the inputted EnsembleState
variable.
4.2. Additive inflation
An additive inflation is stored as a AdditiveInflation
variable and adds a vector drawn from a given multivariate distribution MyDist
to each member.
In EnKF.jl
, distributions are handled with the package Distributions.jl
(see its documentation for more details)
By default, the additive inflation is a multivariate distribution with zero mean and unitary covariance matrix (i.e.
""" AdditiveInflation An type to store additive inflation : Define additive inflation: x <- x + α with α a N-dimensional vector drawn from a random distribution # Fields: - 'α' : Distribution of the additive inflation """
Let construct an additive inflation with Gaussian distribution with mean
# Add package Distributions.jl using Pkg Pkg.add.(["Distributions", "LinearAlgebra"]) using Distributions using LinearAlgebra
A = AdditiveInflation(MvNormal([1.0; 2.0; 3.0; 4.0], Diagonal([0.1,0.2,0.3,0.4])))
mean and covariance of A can be obtained by:
mean(A) var(A);
4.3. Multiplicative inflation
A multiplicative inflation is stored as a MultiplicativeInflation
variable and multiply the deviation from the mean by the multiplicative inflation factor .
""" MultiplicativeInflation An type to store multiplicative inflation : Define multiplicative inflation: x <- x + β*(x - x̂) with β a scalar # Fields: - 'β' : multiplicative inflation factor """ mutable struct MultiplicativeInflation{NS} <: InflationType "Multiplicative inflation factor β" β::Real end
e.g.
M = MultiplicativeInflation(10, 1.02)
4.4. Multiplico-additive inflation
A multiplico-additive inflation is stored as a MultiAdditiveInflation{NS}
variable and multiply the deviation from the mean by the multiplicative inflation factor
""" MultiAdditiveInflation An type to store multiplico-additive inflation : Define multiplico-additive inflation: x̃⁻ <- x̂⁻ + β*(x̃⁻ - x̂⁻) + α with β a scalar # Fields: - 'β' : Multiplicative inflation factor - 'α' : Distribution of the additive inflation """ mutable struct MultiAdditiveInflation{NS} <: InflationType "Multiplicative inflation factor β" β::Real "Distribution of the additive inflation α" α::MultivariateDistribution end
e.g.
MA = MultiAdditiveInflation(10, 1.0, MvNormal(zeros(10), I))
4.5. Exotic inflation
To handle exotic inflation, there is a type RecipeInflation
which takes as input a vector of real parameters to define your specific inflation. You only need to provide its action on an EnsembleState
variable to make things work !
5. Define an ensemble Kalman filter in EnKF.jl
We want users to be able to use EnKF.jl
directly into their existing codes, without any modification. To do so, we define the following structures :
5.1. Propagation function
PropagationFunction
is a type to handle the state advancement of each ensemble member.
To do so, a PropagationFunction
function takes as arguments time t
and ens
the EnsembleState
variable and advance the ensemble in time.
For example, this code will multiply each ensemble member by 2.
function (::PropagationFunction)(t::Float64, ens::EnsembleState{N, TS}) where {N, TS} for (i, state) in enumerate(ens.S) state .*= 2.0 end return ens end # Define one PropagationFunction instance fprop = PropagationFunction()
5.2. Measurement function
MeasurementFunction
is a type to define the state measurement function.
A MeasurementFunction
function takes as arguments time t
and state
a variable of arbitrary type TS
and returns the measured quantity
Note: MeasurementFunction
acts on one member of the ensemble and not the entire ensemble. This choice was motivated by situations where communication is required between different members to propagate the ensemble.
Note: If you have your measurement is linearly dependent on your state, you also have to provide the associated measurement matrix
To do so, we also define the action of MeasurementFunction
on time t
to provide
For nonlinear systems, we use the technique of state augmentation described by van Leuween, et al. to avoid to provide the measurement matrix.
For example, this code will return the sum of the three first components of the state vector as the measurement.
# Define the measurement vector function (::MeasurementFunction)(t::Float64, s::TS) where TS return [s[1]+s[2]+s[3]] end # Define the measurement matrix H function (::MeasurementFunction)(t::Float64) return reshape([1.0, 1.0 , 1.0],(1,3)) end # Define one MeasurementFunction instance m = MeasurementFunction()
5.3. Real measurement function
"""" Define system ENKF which performs the Fields: - 'f' : propagation function - 'A' : inflation - 'G' : filtering function acting on the state - 'm' : measurement function based on state - 'z' : real measurement function - 'ϵ' : measurement noise distribution - 'bounds' : bounds on certain states - 'isinflated' : Bool = true if state is inflated, = false otherwise - 'isfiltered' : Bool = true if state has to be filtered, = false otherwise - 'isaugmented' : Bool = true if measurement function is nonlinear, = false otherwise """ mutable struct ENKF{N, NZ} # "Ensemble of states" # ENS::EnsembleState{N, NS, TS} "Propagation function" f::PropagationFunction "Covariance Inflation" A::Union{InflationType, RecipeInflation} "Filter function" G::FilteringFunction "Measurement function based on state" m::MeasurementFunction "Real measurement function" z::RealMeasurementFunction "Measurement noise distribution" ϵ::AdditiveInflation{NZ} "Boolean: is state vector inflated" isinflated::Bool "Boolean: is state vector filtered" isfiltered::Bool "Boolean: is state vector augmented" isaugmented::Bool # "Bounds on certain state" # bounds end