MicrostructureNoise

Shota Gugushvili, Moritz Schauer

1.
Introduction

(Available as Jupyter notebook at https://github.com/mschauer/MicrostructureNoise.jl/blob/master/example/MicrostructureNoise.ipynb)
MicrostructureNoise is a Julia package for Bayesian volatility estimation in presence of market microstructure noise implementing the methodology described in our new preprint:
Shota Gugushvili, Frank van der Meulen, Moritz Schauer, and Peter Spreij: Nonparametric Bayesian volatility learning under microstructure noise. arxiv:1805.05606, 2018.
This blogpost gives a short introduction.

1.1.
Description

MicrostructureNoise estimates the volatility function  sss of the stochastic differential equation

dX_t = b(t,X_t) dt + s(t) dW_t, \quad X_0 = x_0, \quad t \in [0,T]

from noisy observations of its solution

Y_i = X(t_i) + V_i, \quad 0 < t_1 < \ldots < t_n = T

where  Vi{V_i}V​i​​ denote unobservable stochastic disturbances. The method is minimalistic in its assumptions on the volatility function, which in particular can itself be a stochastic process.
The estimation methodology is intuitive to understand, given that its ingredients are well-known statistical techniques. The posterior inference is performed via the Gibbs sampler, with the Forward Filtering Backward Simulation algorithm used to reconstruct unobservable states  X(ti)X(t_i)X(t​i​​). This relies on the Kalman filter. The unknown squared volatility function is a priori modelled as piecewise constant and is assigned the inverse Gamma Markov chain prior, which induces smoothing among adjacent pieces of the function. Next to yielding point estimates (e.g. the posterior mean), the method conducts automatic uncertainty quantification via marginal credible bands.

1.2.
Setup

Install MicrostructureNoise via the package manager.

using Pkg
Pkg.add.(["MicrostructureNoise", "Distributions", ]);

ENV["PYTHON"] = ""
Pkg.pkg"add Plots#master PyPlot#master PyCall#master LaTeXStrings#master Conda#master"
Pkg.pkg"precompile"

Then load the packages below.

using MicrostructureNoise, Distributions, Plots
using Random, Dates, LinearAlgebra, DelimitedFiles
pyplot()
Random.seed!(12);

2.
Real data example

As a first example, we apply our methodology to infer volatility of the high frequency foreign exchange rate data made available by Pepperstone Limited, the London based forex broker (https://pepperstone.com/uk/client-resources/historical-tick-data). Specifically, we use the EUR/USD tick data (bid prices) for 2 March 2015. We retrieve, log-transform and subsample the data and express time in milliseconds.

# uncomment if you do not mind to create this large file 
# Base.download("https://www.truefx.com/dev/data//2015/MARCH-2015/EURUSD-2015-03.zip","./data/EURUSD-2015-03.zip")
# run(`unzip ./data/EURUSD-2015-03.zip -d ./data`)
dat = readdlm("../data/EURUSD-2015-03.csv", ',')
times = map(a -> DateTime(a, "yyyymmdd H:M:S.s"), dat[1:10:130260,2])
t = Float64[1/1000*(times[i].instant.periods.value-times[1].instant.periods.value) for i in 1:length(times)]
n = length(t)-1
T = t[end]
y = log.(Float64.(dat[1:10:130260, 3]));

Shift+Enter to run

plot(t[1:10:end], y[1:10:end], label="EUR/USD")

Shift+Enter to run

The prior specification is done via the Prior struct.

prior = MicrostructureNoise.Prior(
N = 40,

α1 = 0.0,
β1 = 0.0,

αη = 0.01, 
βη = 0.01,

Πα = LogNormal(1., 0.5),
μ0 = 0.0,
C0 = 5.0
);

Shift+Enter to run

Most importantly, we set the number of bins N. A second import choice is the hyper-prior of the smoothing parameter Πα. The technical meaning of the other parameter is explained in our paper.
We now sample the posterior using MCMC.

α = 0.3 # Initial smoothness hyperparameter guess
σα = 0.1 # Random walk step size for smoothness hyperparameter

td, θs, ηs, αs, p = MicrostructureNoise.MCMC(prior, t, y, α, σα, 1500, printiter=500);

Shift+Enter to run

The inline help gives a synopsis of the function MCMC (and other functions).

?MicrostructureNoise.MCMC

Shift+Enter to run

text/markdown

posterior = MicrostructureNoise.posterior_volatility(td, θs)

tt, mm = MicrostructureNoise.piecewise(posterior.post_t, posterior.post_median[:])

plot(tt, mm, label="posterior median", color="dark blue")
plot!(MicrostructureNoise.piecewise(posterior.post_t, posterior.post_qlow[:])...,
    fillrange = MicrostructureNoise.piecewise(posterior.post_t, posterior.post_qup[:])[2], 
    fillalpha = 0.2,
    alpha = 0.0,
    fillcolor="blue",
    title="Volatility estimate", label="marginal $(round(Int,posterior.qu*100))% credibility band")

Shift+Enter to run

A histogram of sqrt.(ηs) visualises the posterior of the observation error η, indicating that there is indeed a non-negligible microstructure noise.

histogram(sqrt.(ηs[end÷2:end]), nbins=20, label="eta")

Shift+Enter to run

3.
How good does it work: Test with the Heston model

In order to test our model, we simulate a trajectory of an asset following the Heston model (https://en.wikipedia.org/wiki/Heston_model).
The Heston model is  a widely used stochastic volatility model where it is assumed that the price process of a certain asset, denoted by  SSS, evolves over time according to the SDE

d S_t = \mu S_t d t + \sqrt{Z_t } S_t d W_t,

where the process  ZZZ follows the CIR or square root process,

d Z_t = \kappa ( \theta - Z_t ) d t + \sigma \sqrt{Z_t} d B_t.

Here  WWW and  BBB are correlated Wiener processes with correlation  ρ\rhoρ.
For this generated asset price trajectory we can compare our volatility estimate from noisy data with the true volatility known from the simulation.

3.1.
Generate data from the Heston model

To define the model in Julia and to simulate the trajectory use the package Bridge. Install it via the package manager.

Pkg.add("Bridge")

Shift+Enter to run

Usually we would not perform inference with  SSS, but with its logarithm  Xt=logStX_t=\log S_tX​t​​=logS​t​​. However, we simulate  SSS directly.

using Bridge
using StaticArrays
const R2 = SVector{2,Float64};

Shift+Enter to run

struct Heston <: ContinuousTimeProcess{R2}
    mu::Float64
    kappa::Float64
    theta::Float64
    sigma::Float64
end
Heston(;mu = NaN, kappa = NaN, theta = NaN, sigma = NaN) = Heston(mu, kappa, theta, sigma)

Shift+Enter to run

Define the drift  bbb and the dispersion coefficient  σ\sigmaσ of the process.

Bridge.b(s, x, P::Heston) = R2(P.mu*x[1], P.kappa*(P.theta - x[2]))
Bridge.σ(s, x, P::Heston) = Diagonal(R2(sqrt(x[2])*x[1], P.sigma*sqrt(x[2])))

Shift+Enter to run

We choose the following setting, where the parameter values match what one may expect with real data.

T = 1.0
n = 4000
P = Heston(mu = 0.02, kappa = 7.0, theta = 0.04, sigma = 0.6)

Shift+Enter to run

The process is first simulated on a fine grid and next subsampled at  n=4000n=4000n=4000 random time points. Here we define these grids. Random sampling creates a non-uniform grid, which is what one typically sees with real financial data, where observations are not equispaced in time.

nf = 10 # generate the process on a finer grid than it is observed

tfine0 = T .* sort(rand(n*nf-1))
tfine = [0.0; tfine0; T]

is = [i <= n-1 for i in 1:length(tfine)-2]
is = [true; is[randperm(length(is))]; true]
t = tfine[is];

Shift+Enter to run

Now we generate a sample path with the Euler scheme from correlated Brownian motions, and  log\loglog-transform and subsample the data. By Itô's formula we know the true volatility of the log-transformed process  SSS is in fact the process  ZZZ.

u = R2(1., P.theta)
W = sample(tfine, Wiener{R2}())
ρ = -0.6
map!(v->R2(v[1], ρ*v[1] + sqrt(1-ρ^2)v[2]), W.yy, W.yy) # correlate Brownian motions

Xfine = solve(EulerMaruyama(), u, W, P)

xtrue = log.(first.(Xfine.yy[is]))
s0(t) = sqrt(Xfine.yy[searchsortedfirst(Xfine.tt, t)][2])
y = copy(xtrue) 

η0 = 0.005^2
y .+= randn(n + 1) * sqrt(η0);

Shift+Enter to run

3.2.
Perform inference using `MicrostructureNoise`.

This proceeds along the same lines as before.

α = 5.0 # Initial smoothness hyperparameter guess
σα = 3.0 # Random walk step size for smoothness hyperparameter

td, θs, ηs, αs, p = MicrostructureNoise.MCMC(prior, t, y, α, σα, 10000, printiter=500);

println("acceptance probability $p")

Shift+Enter to run

posterior = MicrostructureNoise.posterior_volatility(td, θs)

tt, mm = MicrostructureNoise.piecewise(posterior.post_t, posterior.post_median[:])

p = plot(tt, mm, label="posterior median", linewidth=0.5, color="dark blue")
plot!(t[1:10:end], (s0.(t[1:10:end])).^2, label="true volatility", color="red")


plot!(MicrostructureNoise.piecewise(posterior.post_t, posterior.post_qlow[:])...,
    fillrange = MicrostructureNoise.piecewise(posterior.post_t, posterior.post_qup[:])[2], 
    fillalpha = 0.2,
    alpha = 0.0,
    fillcolor="darkblue",
    title="Volatility estimate", label="marginal $(round(Int,posterior.qu*100))% credibility band")

display(p)

Shift+Enter to run

The greyish band mostly covers the red curve, the true volatility. Also the posterior median (blue line) globally follows the movement of the true volatility.
This is nice: Without any knowledge of the model, completely ignoring the drift, we still get a nice estimate and reasonable uncertainty quantification of the underlying volatility, and that having only indirect and noisy observations at our hand.
A histogram confirms that also the estimate of the observational noise is close to its true value.

histogram(sqrt.(ηs[end÷2:end]), nbins=20, label="eta")

Shift+Enter to run

4.
References

Shota Gugushvili, Frank van der Meulen, Moritz Schauer, and Peter Spreij: Nonparametric Bayesian volatility estimation. arxiv:1801.09956, 2018.
Shota Gugushvili, Frank van der Meulen, Moritz Schauer, and Peter Spreij: Nonparametric Bayesian volatility learning under microstructure noise. arxiv:1805.05606, 2018.

Shift+Enter to run

MicrostructureNoise

1. Introduction

1.1. Description

1.2. Setup

2. Real data example

3. How good does it work: Test with the Heston model

3.1. Generate data from the Heston model

3.2. Perform inference using MicrostructureNoise.

4. References

1.
Introduction

1.1.
Description

1.2.
Setup

2.
Real data example

3.
How good does it work: Test with the Heston model

3.1.
Generate data from the Heston model

3.2.
Perform inference using `MicrostructureNoise`.

4.
References