# EnKF.jl : Tools for data assimilation with Ensemble Kalman filter

## 1. Introduction

`EnKF.jl`

is a framework for data assimilation using ensemble Kalman filter in Julia

The objective of this package is to allow a easy and fast setup of data assimilation problems using the ensemble Kalman filter. This package provides tools for:

- constructing data structure for an ensemble of members

- applying covariance inflation (additive, multiplicative, multiplico-additive...) on these ensemble

- setting the data assimilation problem for linear/nonlinear system with linear/nonlinear measurements

### 1.1. References

[1]: Evensen, Geir. "The ensemble Kalman filter: Theoretical formulation and practical implementation." Ocean dynamics 53.4 (2003): 343-367.

[2]: Asch, Mark, Marc Bocquet, and Maëlle Nodet. Data assimilation: methods, algorithms, and applications. Vol. 11. SIAM, 2016.

### 1.2. Installation

This package works on `Julia 1.0 `

and above and is registered in the general Julia registry. In `Julia 1.0 `

REPL, enter the package manager by typing `]`

and then type `add EnKF`

to install the package.

] add EnKF

`EnKF.jl`

is now installed

Then, in any version, type

using EnKF

The plots in this documentation are generated using `Plots.jl`

(http://docs.juliaplots.org/latest/). You might want to install that, too, to follow the examples.

## 2. Review of the Ensemble Kalman Filter

### 2.1. Algorithm

The ensemble Kalman filter is a Monte Carlo view of the Kalman filter applicable to nonlinear systems. Instead of propagating the exact mean and covariance of the Kalman filter. These latter are approximated using an ensemble of size

where

Using the different estimate of the ensemble member, we can approximate the *a priori* mean a

From the small ensemble theory, one can show that the \textit{a priori} covariance is better approximated by averaging by

However, Burgers \cite{burgers1998analysis} show we can not use the same innovation term to correct each ensemble member. Otherwise, the random character of

where for each member

The Kalman gain

### 2.2. Need for covariance inflation

As shown by Van Leeuwen

Fortunately, we can prove the boundedness of the EnKF even for large inflation, Kelly, et al.\cite{kelly2014well}. Whitaker and Hamill\cite{whitaker2012evaluating} have propose to combine a multiplicative and an additive inflation. In practice, we compute the sample mean , apply the covariance inflation procedure to the state (\ref{covariance_inflation}) and then compute the sample covariance. For each ensemble member, the state is transformed through:

where

## 3. Handling states in EnKF.jl

In `EnKF`

, ensemble of members are stored as a mutable structure `EnsembleState{N, TS}`

where `N`

is the number of members and `TS`

an arbitrary type for each member.

### 3.1. Setting your ensemble state

Let's see an example of creating a blank ensemble of 5 members of vector of length 10 and fill each member with a vector full of ones:

ens = EnsembleState(5, zeros(10)) fill!(ens, ones(10)) ens

`EnsembleState`

supports three different constructors:

- The canonical one induced by the definition of `EnsembleState{N, TS} `

by providing an `Array{TS,1}`

:

ens = EnsembleState{5,Array{Float64,1}}([rand(10) for i=1:5])

- Using `EnsembleState(N::Int, u)`

, this one create an ensemble of `N`

blankmembers where each member is set to `zero(u)`

:

ens = EnsembleState(5, ones(10))

- Using `EnsembleState(States::Array{TS,1})`

, this one create an ensemble of length `length(States)``

and fill the` i`

-th member with `States[i]`

: any type of state is supported by `EnKF `

even your own data structures!

struct MyArray B::Array{ComplexF64,1} end A = MyArray(ones(10)+im*ones(10)) ens = EnsembleState(A.B)

### 3.2. Operations on `EnsembleState`

`EnsembleState`

The goal of this subsection is to present all the actions defined for `EnsembleState `

variables. All of them can be found in `/src/state.jl`

`size`

:` `

returns the number of ensemble members and the size of a member

size(ENS::EnsembleState{N, TS}) where {N, TS}

`length`

: returns the number of ensemble members `N`

of an `EnsembleState`

variable

length(ENS::EnsembleState{N, TS}) where {N, TS}

`mean `

: compute the mean of an` EnsembleState`

variable

""" Return the mean of all the ensemble members """ mean(ENS::EnsembleState{N, TS}) where {N, TS}

`deviation`

: compute the deviation from its own mean of an `EnsembleState`

variable

Different methods are available for this function

deviation(ENSfluc::EnsembleState{N, TS}, ENS::EnsembleState{N, TS}) where {N, TS} deviation(tabfluc::Array{TS,2}, ENS::EnsembleState{N, TS}) where {N, TS}

and the in-place version `deviation!`

deviation!(ENS::EnsembleState{N, TS}) where {N, TS}

`hcat`

: performs an horizontal concatenation of an `EnsembleState`

variable (extended version for `EnsembleState`

)

hcat(ENS::EnsembleState{N, TS}) where {N, TS}

`cut`

: cut an array along the different columns and create an `EnsembleState`

variable with these columns

cut(A::AbstractMatrix{TR}) where {TR}

`fill!`

: create an `EnsembleState`

variable filled with a vector

fill!(ENS::EnsembleState{N, TS}, A::TS) where {N, TS}

We can also extend the addition `+`

and subtraction `-`

from the module `Base`

to `EnsembleState `

elements

function (+)(A::EnsembleState{N, TS}, B::EnsembleState{N, TS}) where {N, TS}

(-)(A::EnsembleState{N, TS}, B::EnsembleState{N, TS}) where {N, TS}

For example,

a = EnsembleState(5, zeros(10)) b = EnsembleState(5, zeros(10)) fill!(a, rand(10)) a fill!(b, ones(10)) a + b a - b;

As well as adding or subtracting a vector to each member of an `EnsembleState`

variable

a - 2*ones(10)

## 4. Inflation in EnKF.jl

Due to the covariance collapse (see section 2.2 for a reminder), the state vector need to be inflated.

`EnKF.jl`

is equipped with several classes of inflation: identity inflation, additive inflation, multiplicative inflation, multiplico-additive inflation. Others inflations such as RPRS will be added soon.

Each type of inflation is defined as a subtype of the abstract type `InflationType`

. For each type of inflation, we define an associated structure which contains its parameters and its action on an `EnsembleState`

object.

For a subtype `MyInflation `

which does

""" MyInflation An type to store MyInflation inflation : Define MyInflation inflation: x <- β*x + β^3 with β a scalar # Fields: - 'β' : MyInflation inflation factor """ mutable struct MyInflation <: InflationType β :: Real end """ Define action of MyInflation : x <- β*x + β^3 """ function (A::MyInflation)(ENS::EnsembleState{N, TS}) where {N, TS} for s in ENS.S s .*= β s .+= β^3 end return ENS end

### 4.1. Identity inflation

An identity inflation is of type `IdentityInflation`

returns the inputted `EnsembleState`

variable.

### 4.2. Additive inflation

An additive inflation is stored as a `AdditiveInflation `

variable and adds a vector drawn from a given multivariate distribution `MyDist`

to each member.

In `EnKF.jl`

, distributions are handled with the package `Distributions.jl`

(see its documentation for more details)

By default, the additive inflation is a multivariate distribution with zero mean and unitary covariance matrix (i.e.

""" AdditiveInflation An type to store additive inflation : Define additive inflation: x <- x + α with α a N-dimensional vector drawn from a random distribution # Fields: - 'α' : Distribution of the additive inflation """

Let construct an additive inflation with Gaussian distribution with mean

# Add package Distributions.jl using Pkg Pkg.add.(["Distributions", "LinearAlgebra"]) using Distributions using LinearAlgebra

A = AdditiveInflation(MvNormal([1.0; 2.0; 3.0; 4.0], Diagonal([0.1,0.2,0.3,0.4])))

mean and covariance of A can be obtained by:

mean(A) var(A);

### 4.3. Multiplicative inflation

A multiplicative inflation is stored as a `MultiplicativeInflation `

variable and multiply the deviation from the mean by the multiplicative inflation factor .

""" MultiplicativeInflation An type to store multiplicative inflation : Define multiplicative inflation: x <- x + β*(x - x̂) with β a scalar # Fields: - 'β' : multiplicative inflation factor """ mutable struct MultiplicativeInflation{NS} <: InflationType "Multiplicative inflation factor β" β::Real end

e.g.

M = MultiplicativeInflation(10, 1.02)

### 4.4. Multiplico-additive inflation

A multiplico-additive inflation is stored as a `MultiAdditiveInflation{NS} `

variable and multiply the deviation from the mean by the multiplicative inflation factor

""" MultiAdditiveInflation An type to store multiplico-additive inflation : Define multiplico-additive inflation: x̃⁻ <- x̂⁻ + β*(x̃⁻ - x̂⁻) + α with β a scalar # Fields: - 'β' : Multiplicative inflation factor - 'α' : Distribution of the additive inflation """ mutable struct MultiAdditiveInflation{NS} <: InflationType "Multiplicative inflation factor β" β::Real "Distribution of the additive inflation α" α::MultivariateDistribution end

e.g.

MA = MultiAdditiveInflation(10, 1.0, MvNormal(zeros(10), I))

### 4.5. Exotic inflation

To handle exotic inflation, there is a type `RecipeInflation `

which takes as input a vector of real parameters to define your specific inflation. You only need to provide its action on an `EnsembleState`

variable to make things work !

## 5. Define an ensemble Kalman filter in EnKF.jl

We want users to be able to use `EnKF.jl`

directly into their existing codes, without any modification. To do so, we define the following structures :

### 5.1. Propagation function

`PropagationFunction`

is a type to handle the state advancement of each ensemble member.

To do so, a `PropagationFunction`

function takes as arguments time `t`

and `ens `

the `EnsembleState `

variable and advance the ensemble in time.

For example, this code will multiply each ensemble member by 2.

function (::PropagationFunction)(t::Float64, ens::EnsembleState{N, TS}) where {N, TS} for (i, state) in enumerate(ens.S) state .*= 2.0 end return ens end # Define one PropagationFunction instance fprop = PropagationFunction()

### 5.2. Measurement function

`MeasurementFunction`

is a type to define the state measurement function.

A `MeasurementFunction `

function takes as arguments time `t`

and `state `

a variable of arbitrary type `TS`

and returns the measured quantity

**Note:** `MeasurementFunction `

acts on one member of the ensemble and not the entire ensemble. This choice was motivated by situations where communication is required between different members to propagate the ensemble.

**Note: If you have your measurement is linearly dependent on your state, you also have to provide the associated measurement matrix .**

To do so, we also define the action of `MeasurementFunction`

on time `t`

to provide

For nonlinear systems, we use the technique of state augmentation described by van Leuween, et al. to avoid to provide the measurement matrix.

For example, this code will return the sum of the three first components of the state vector as the measurement.

# Define the measurement vector function (::MeasurementFunction)(t::Float64, s::TS) where TS return [s[1]+s[2]+s[3]] end # Define the measurement matrix H function (::MeasurementFunction)(t::Float64) return reshape([1.0, 1.0 , 1.0],(1,3)) end # Define one MeasurementFunction instance m = MeasurementFunction()

### 5.3. Real measurement function

"""" Define system ENKF which performs the Fields: - 'f' : propagation function - 'A' : inflation - 'G' : filtering function acting on the state - 'm' : measurement function based on state - 'z' : real measurement function - 'ϵ' : measurement noise distribution - 'bounds' : bounds on certain states - 'isinflated' : Bool = true if state is inflated, = false otherwise - 'isfiltered' : Bool = true if state has to be filtered, = false otherwise - 'isaugmented' : Bool = true if measurement function is nonlinear, = false otherwise """ mutable struct ENKF{N, NZ} # "Ensemble of states" # ENS::EnsembleState{N, NS, TS} "Propagation function" f::PropagationFunction "Covariance Inflation" A::Union{InflationType, RecipeInflation} "Filter function" G::FilteringFunction "Measurement function based on state" m::MeasurementFunction "Real measurement function" z::RealMeasurementFunction "Measurement noise distribution" ϵ::AdditiveInflation{NZ} "Boolean: is state vector inflated" isinflated::Bool "Boolean: is state vector filtered" isfiltered::Bool "Boolean: is state vector augmented" isaugmented::Bool # "Bounds on certain state" # bounds end