JSoC 2019: Using deep backward stochastic differential equations for solving high-dimensional PDEs
Introduction
I have finished the implementation of an algorithm for solving PDEs using neural networks and backward stochastic differential equations (BSDEs) [1]. In this blog post I would like to describe the main ideas of the algorithm and show some results.
Brief description of the idea
The method I implemented is based on the BSDE approach, in which the curse of dimensionality is partially avoided by using machine learning techniques. It is a deep-learning-based technique called Deep BSDE [1]. Based on an Euler discretization of the forward underlying SDE, the idea is to view the BSDE as a forward SDE, and the algorithm tries to learn the values u (the variable computed iteratively by the network) and z = σᵀ(t, x)∇u(t, x) at each time step of the Euler scheme by minimizing a global loss function between the forward simulation of u until maturity T and the target g(X_T) (the terminal condition) [1].
We consider the class of semilinear parabolic PDEs of the form

$$\frac{\partial u}{\partial t}(t,x) + \frac{1}{2}\operatorname{Tr}\!\left(\sigma\sigma^{T}(t,x)\,(\operatorname{Hess}_x u)(t,x)\right) + \nabla u(t,x)\cdot\mu(t,x) + f\!\left(t,x,u(t,x),\sigma^{T}(t,x)\nabla u(t,x)\right) = 0 \qquad (1)$$

with terminal condition u(T, x) = g(x).
Let Wₜ with t ∈ [0,T] be a d-dimensional Brownian motion and take Xₜ to be the solution to the stochastic differential equation

$$dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dW_t$$
with initial condition X₀ = ζ. Then the solution of (1) satisfies the following BSDE:

$$u(t, X_t) - u(0, X_0) = -\int_0^t f\!\left(s, X_s, u(s, X_s), \sigma^{T}(s, X_s)\nabla u(s, X_s)\right) ds + \int_0^t \left[\nabla u(s, X_s)\right]^{T} \sigma(s, X_s)\,dW_s$$
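To turn this into an algorithm, both processes are discretized in time and simulated forward jointly. A sketch of the Euler updates, following [1], with step size Δt = tₙ₊₁ − tₙ and Brownian increments ΔWₙ = W(tₙ₊₁) − W(tₙ):

$$X_{t_{n+1}} \approx X_{t_n} + \mu(t_n, X_{t_n})\,\Delta t + \sigma(t_n, X_{t_n})\,\Delta W_n$$

$$u(t_{n+1}, X_{t_{n+1}}) \approx u(t_n, X_{t_n}) - f\!\left(t_n, X_{t_n}, u(t_n, X_{t_n}), \sigma^{T}\nabla u(t_n, X_{t_n})\right)\Delta t + \left[\sigma^{T}\nabla u(t_n, X_{t_n})\right]^{T}\Delta W_n$$

Here u(0, ·) and σᵀ∇u(tₙ, ·) are approximated by neural networks with parameters θ, and the global loss function of [1] is the expected squared distance between the simulated terminal value and the terminal condition:

$$\ell(\theta) = \mathbb{E}\left[\,\left|g(X_{t_N}) - \hat{u}_\theta(t_N, X_{t_N})\right|^2\,\right]$$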
Finding the parameters θ which minimize this loss function thus gives rise to a BSDE which solves PDE (1), and u(t, x) is then an approximation of the solution.
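To make the objective concrete, here is a minimal, self-contained Julia sketch of one evaluation of this loss. It is an illustration of the scheme above under simplifying assumptions, not the actual NeuralNetDiffEq.jl internals; the names deep_bsde_loss, u0_net and grad_nets are hypothetical.

using Flux
using Statistics: mean

# Hypothetical illustration of the Deep BSDE loss (not the NeuralNetDiffEq.jl
# internals). Assumed shapes: u0_net maps a d × m batch to 1 × m, each network
# in grad_nets maps d × m to d × m, and g maps d × m to 1 × m, e.g. for
# Allen-Cahn: g(X) = 1.0 ./ (2.0 .+ 0.4 .* sum(X.^2, dims = 1)).
function deep_bsde_loss(u0_net, grad_nets, g, f, μ, σ, x0, dt, time_steps, m)
    d = length(x0)
    X = repeat(x0, 1, m)             # d × m: m copies of the starting point
    u = u0_net(X)                    # 1 × m: learned approximation of u(0, X₀)
    t = 0.0
    for n in 1:time_steps
        z  = grad_nets[n](X)         # d × m: learned σᵀ∇u at time tₙ
        ΔW = sqrt(dt) .* randn(d, m) # Brownian increments
        # Euler step of the BSDE for u, then of the forward SDE for X
        u = u .- f(X, u, z, nothing, t) .* dt .+ sum(z .* ΔW, dims = 1)
        X = X .+ μ(X, nothing, t) .* dt .+ σ(X, nothing, t) .* ΔW
        t += dt
    end
    return mean(abs2, g(X) .- u)     # E|g(X_T) − û(T, X_T)|²
end

Training then amounts to repeatedly evaluating this loss on fresh trajectories and updating the parameters of u0_net and grad_nets with an optimizer such as ADAM, which is what the solver used below does internally.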
Results
First, we need to install and load all the packages that are used.
using Pkg
Pkg.add("Flux")
Pkg.add("Test")
Pkg.add("Statistics")
Pkg.add("LinearAlgebra")
Pkg.add("NeuralNetDiffEq")
using Flux
using Test
using Statistics
using LinearAlgebra
using NeuralNetDiffEq
Below is the test code for our implementation of the algorithm, which solves the 20-dimensional Allen-Cahn equation.
# Allen-Cahn Equation
d = 20 # number of dimensions
x0 = fill(0.0, d)
tspan = (0.0, 0.3)
dt = 0.015 # time step
time_steps = round(Int, (tspan[2] - tspan[1]) / dt)
m = 100 # number of trajectories (batch size)

g(X) = 1.0 / (2.0 + 0.4 * sum(X.^2)) # terminal condition
f(X, Y, Z, p, t) = Y .- Y.^3 # M x 1
μ(X, p, t) = 0.0
σ(X, p, t) = 1.0
prob = TerminalPDEProblem(g, f, μ, σ, x0, tspan)

hls = 10 + d # hidden layer size
opt = Flux.ADAM(5e-4) # optimizer

# sub-neural network approximating the solution at the desired point
u0 = Flux.Chain(Dense(d, hls, relu),
                Dense(hls, hls, relu),
                Dense(hls, 1))
# sub-neural networks approximating the spatial gradients at each time point
σᵀ∇u = [Flux.Chain(Dense(d, hls, relu),
                   Dense(hls, hls, relu),
                   Dense(hls, d)) for i in 1:time_steps]

alg = NNPDEHan(u0, σᵀ∇u, opt = opt)
ans = solve(prob, alg, verbose = true, abstol = 1e-8, maxiters = 400, dt = dt, trajectories = m)

prob_ans = 0.30879
error_l2 = sqrt((ans - prob_ans)^2 / ans^2)

println("Allen-Cahn equation")
# println("numerical = ", ans)
# println("prob_ans = ", prob_ans)
println("error_l2 = ", error_l2, "\n")

@test error_l2 < 0.1
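For reference, with μ = 0, σ = 1 and f(u) = u − u³ as above, the generic PDE (1) specializes to the Allen-Cahn equation

$$\frac{\partial u}{\partial t}(t,x) + \frac{1}{2}\Delta u(t,x) + u(t,x) - u(t,x)^3 = 0, \qquad u(T,x) = \frac{1}{2 + 0.4\,\|x\|^2},$$

and the test compares the network's approximation of the solution at the starting point x₀ = (0, …, 0) against the reference value prob_ans, using a relative L² error.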
The rest of the tests can be found in the repository of the NeuralNetDiffEq.jl project.
Conclusion
I am now working on improving the algorithm that we previously implemented. We want to use neural SDEs, which should improve the convergence of the algorithm.