Kirill Zubov / Aug 20 2019
Remix of Julia by Nextjournal

JSoC 2019: Migrate from Knet to Flux library.


Hi, Every one,

I continue to develop library for solving differential equations using neural networks - NeuralNetDiffEq.jl.

One of the first items , It was to support the work of the library with the latest version of Julia (at the time of writing this article v "1.1.0") and perform migrate from Knet and ForwardDiff libraries to the Flux library.


For this, it was necessary to rewrite outdated constructions that no longer support language ​​and replace them with new ones that are supporting on the current version of Julia. For the migration, in particular, I rewrited the function of inicialization a neural network, the function of the calculate derivitives of neural network and the algorithm of back propagation, using the elemets of the Flux library instead of ForwardDiff and Knet.

The following are parts of the code before and after migration.

Function of inicialization a neural network.

Flux Library allows you to initialize the neural network in one-two lines using Chain (more here). Also the initialization of the neural network was done by the user. The neuron network architecture can be specified in the task solver (sol=solve(prob, nnode(chain, opt),...), where chain = Flux.chain(hide_layer... hide_layer)).

function predict(P, x)
    w, b, v = P
    h = sigm(w * x .+ b)
    return v * h
function init_params(ftype,hl_width;atype=KnetArray{Float32})
    P = Array{Any}(3)
    P[1] = randn(ftype,hl_width,1)
    P[2] = zeros(ftype,hl_width,1)
    P[3] = randn(ftype,1,hl_width)
chain = Flux.Chain(Dense(1,5,σ),Dense(5,1))

The function of the calculate derivitives of neural network and loss function.

function loss_trial(w, b, v, timepoints, f, p, phi)
    P = [w, b, v]
    dydx(P,x) = ForwardDiff.derivative(x -> phi(P, x), x)
    loss = sum(abs2, [dydx(P, x) - f(phi(P, x), p, x) for x in timepoints])
    return loss

dfdx = t -> Tracker.gradient(t -> sum(phi(t)), t; nest = true)[1]
loss = () -> sum(abs2,sum(abs2,dfdx(t) .- f(phi(t)[1],p,t)[1]) for t in ts)

Train process the algorithm of back propagation.

Function Flux.train takes only one line instead of dozens of lines as it was done earlier. More info about Flux.train here.

function train(P, prms, timepoints, f,p,phi)
    g = lossgradient(P,timepoints,f,p,phi)
    update!(P, g, prms)
    return P
function lossgradient(P, timepoints, f, p, phi)
    w, b, v = P
    loss∇w = ForwardDiff.gradient(w -> loss_trial(w, b, v, timepoints, f,p,phi), w)
    loss∇b = ForwardDiff.gradient(b -> loss_trial(w, b, v, timepoints, f,p,phi), b)
    loss∇v = ForwardDiff.gradient(v -> loss_trial(w, b, v, timepoints, f,p,phi), v)
    return (loss∇w, loss∇b, loss∇v)

Flux.train!(loss, ps, data, opt; cb = cb)

The full code of the project NeuralNetDiffEq.jl can be found here.


Further results and visualization of the updated code.

Firstly, to need to upload all the packages that are used.

using Pkg
using DiffEqBase
using Flux
using Plots
using DifferentialEquations
using NeuralNetDiffEq

Solve the ODE: by the boundary condition on the interval, using NeuralNetDiffEq.jl. Uses a neural network with one input and output neuron, one hidden layer with five neuron, sigmoid activation function and ADAM optimizer.

linear = (u,p,t) -> cos(2pi*t)
tspan = (0.0f0, 2.0f0)
u0 = 0.0f0
prob = ODEProblem(linear, u0 ,tspan)
chain = Flux.Chain(Dense(1,5,σ),Dense(5,1))
opt = Flux.ADAM(0.1, (0.9, 0.95))
sol = solve(prob, nnode(chain,opt), dt=1/20f0, verbose = true,
            abstol=1e-10, maxiters = 500)
retcode: Default Interpolation: 1st order linear t: 0.0f0:0.05f0:2.0f0 u: 41-element Array{Float32,1}: 0.0 0.051854 0.0923319 0.120906 0.137221 0.141158 0.132918 0.113121 0.0829268 0.044148 ⋮ -0.087464 -0.116428 -0.136476 -0.145993 -0.143779 -0.129005 -0.101169 -0.0600551 -0.00569224

In the figure below we can observe how the neural network is trained on each epoch observing to decrease the current loss function and comparing the analytical (y2) and intermediate numerical solution (y1).


Now I am working on the implementation of an algorithm for solve PDE using neural network[1] and backward stochastic differential equations (BSDEs)[2][3].

In the next blog, I will try to explain how Bsde is used to solve high dimensional Pde.