JSoC 2019: Using deep backward stochastic differential equations for solving high-dimensional PDEs
Introduction
I have finished the implementation of an algorithm for solving PDEs using neural networks and backward stochastic differential equations (BSDEs) [1]. In this blog post I would like to describe the main ideas of the algorithm and show some results.
Brief description of the idea
The method I implemented is based on the BSDE approach, in which the curse of dimensionality is partially avoided by using machine learning techniques. It is a deep-learning-based technique called Deep BSDE [1]. Based on an Euler discretization of the forward underlying SDE, the idea is to view the BSDE as a forward SDE, and the algorithm tries to learn the values u (the variable computed iteratively by the network) and z = σᵀ(t, x)∇u(t, x) at each time step of the Euler scheme by minimizing a global loss function between the forward simulation of u until maturity T and the target g(X_T) (the terminal condition) [1].
We consider the class of semilinear parabolic PDEs of the form

$$\frac{\partial u}{\partial t}(t,x) + \frac{1}{2}\operatorname{Tr}\!\left(\sigma\sigma^{T}(t,x)\,(\operatorname{Hess}_x u)(t,x)\right) + \nabla u(t,x)\cdot\mu(t,x) + f\!\left(t,x,u(t,x),\sigma^{T}(t,x)\nabla u(t,x)\right) = 0 \qquad (1)$$

with terminal condition u(T, x) = g(x).
Let Wₜ with t ∈ [0,T] be a d-dimensional Brownian motion and take Xₜ to be the solution to the stochastic differential equation

$$dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dW_t$$
with initial condition X₀ = ζ. Then the solution of (1) satisfies the following BSDE:

$$u(t, X_t) - u(0, X_0) = -\int_0^t f\!\left(s, X_s, u(s, X_s), \sigma^{T}(s, X_s)\nabla u(s, X_s)\right) ds + \int_0^t \left[\nabla u(s, X_s)\right]^{T} \sigma(s, X_s)\,dW_s$$
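To turn this into an algorithm, both processes are discretized in time and simulated forward jointly. A sketch of the Euler updates, following [1], with step size Δt = tₙ₊₁ − tₙ and Brownian increments ΔWₙ = W(tₙ₊₁) − W(tₙ):

$$X_{t_{n+1}} \approx X_{t_n} + \mu(t_n, X_{t_n})\,\Delta t + \sigma(t_n, X_{t_n})\,\Delta W_n$$

$$u(t_{n+1}, X_{t_{n+1}}) \approx u(t_n, X_{t_n}) - f\!\left(t_n, X_{t_n}, u(t_n, X_{t_n}), \sigma^{T}\nabla u(t_n, X_{t_n})\right)\Delta t + \left[\sigma^{T}\nabla u(t_n, X_{t_n})\right]^{T}\Delta W_n$$

Here u(0, ·) and σᵀ∇u(tₙ, ·) are approximated by neural networks with parameters θ, and the global loss function of [1] is the expected squared distance between the simulated terminal value and the terminal condition:

$$\ell(\theta) = \mathbb{E}\left[\,\left|g(X_{t_N}) - \hat{u}_\theta(t_N, X_{t_N})\right|^2\,\right]$$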
Finding the parameters θ which minimize this loss function thus gives rise to a BSDE which solves PDE (1), and u(t, x) is then an approximation of the solution.
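To make the objective concrete, here is a minimal, self-contained Julia sketch of one evaluation of this loss. It is an illustration of the scheme above under simplifying assumptions, not the actual NeuralNetDiffEq.jl internals; the names deep_bsde_loss, u0_net and grad_nets are hypothetical.

using Flux
using Statistics: mean

# Hypothetical illustration of the Deep BSDE loss (not the NeuralNetDiffEq.jl
# internals). Assumed shapes: u0_net maps a d × m batch to 1 × m, each network
# in grad_nets maps d × m to d × m, and g maps d × m to 1 × m, e.g. for
# Allen-Cahn: g(X) = 1.0 ./ (2.0 .+ 0.4 .* sum(X.^2, dims = 1)).
function deep_bsde_loss(u0_net, grad_nets, g, f, μ, σ, x0, dt, time_steps, m)
    d = length(x0)
    X = repeat(x0, 1, m)             # d × m: m copies of the starting point
    u = u0_net(X)                    # 1 × m: learned approximation of u(0, X₀)
    t = 0.0
    for n in 1:time_steps
        z  = grad_nets[n](X)         # d × m: learned σᵀ∇u at time tₙ
        ΔW = sqrt(dt) .* randn(d, m) # Brownian increments
        # Euler step of the BSDE for u, then of the forward SDE for X
        u = u .- f(X, u, z, nothing, t) .* dt .+ sum(z .* ΔW, dims = 1)
        X = X .+ μ(X, nothing, t) .* dt .+ σ(X, nothing, t) .* ΔW
        t += dt
    end
    return mean(abs2, g(X) .- u)     # E|g(X_T) − û(T, X_T)|²
end

Training then amounts to repeatedly evaluating this loss on fresh trajectories and updating the parameters of u0_net and grad_nets with an optimizer such as ADAM, which is what the solver used below does internally.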
Results
First, we need to install and load all the packages that are used.
using Pkg
Pkg.add("Flux")
Pkg.add("Test")
Pkg.add("Statistics")
Pkg.add("LinearAlgebra")
Pkg.add("NeuralNetDiffEq")
using Flux
using Test
using Statistics
using LinearAlgebra
using NeuralNetDiffEq
Below is the test code for our implementation of the algorithm, which solves the 20-dimensional Allen-Cahn equation.
# Allen-Cahn Equation
d = 20 # number of dimensions
x0 = fill(0.0, d)
tspan = (0.0, 0.3)
dt = 0.015 # time step
time_steps = round(Int, (tspan[2] - tspan[1]) / dt)
m = 100 # number of trajectories (batch size)

g(X) = 1.0 / (2.0 + 0.4 * sum(X.^2)) # terminal condition
f(X, Y, Z, p, t) = Y .- Y.^3 # M x 1
μ(X, p, t) = 0.0
σ(X, p, t) = 1.0
prob = TerminalPDEProblem(g, f, μ, σ, x0, tspan)

hls = 10 + d # hidden layer size
opt = Flux.ADAM(5e-4) # optimizer

# sub-neural network approximating the solution at the desired point
u0 = Flux.Chain(Dense(d, hls, relu),
                Dense(hls, hls, relu),
                Dense(hls, 1))
# sub-neural networks approximating the spatial gradients at each time point
σᵀ∇u = [Flux.Chain(Dense(d, hls, relu),
                   Dense(hls, hls, relu),
                   Dense(hls, d)) for i in 1:time_steps]

alg = NNPDEHan(u0, σᵀ∇u, opt = opt)
ans = solve(prob, alg, verbose = true, abstol = 1e-8, maxiters = 400, dt = dt, trajectories = m)

prob_ans = 0.30879
error_l2 = sqrt((ans - prob_ans)^2 / ans^2)

println("Allen-Cahn equation")
# println("numerical = ", ans)
# println("prob_ans = ", prob_ans)
println("error_l2 = ", error_l2, "\n")

@test error_l2 < 0.1
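For reference, with μ = 0, σ = 1 and f(u) = u − u³ as above, the generic PDE (1) specializes to the Allen-Cahn equation

$$\frac{\partial u}{\partial t}(t,x) + \frac{1}{2}\Delta u(t,x) + u(t,x) - u(t,x)^3 = 0, \qquad u(T,x) = \frac{1}{2 + 0.4\,\|x\|^2},$$

and the test compares the network's approximation of the solution at the starting point x₀ = (0, …, 0) against the reference value prob_ans, using a relative L² error.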
The rest of the tests can be found in the repository of the NeuralNetDiffEq.jl project.
Conclusion
I am now working on improving the algorithm that we previously implemented. We want to use neural SDEs, which should improve the convergence of the algorithm.