JSoC 2019: Migrate from Knet to Flux library.

Migration

Hi, Every one,

I continue to develop library for solving differential equations using neural networks - NeuralNetDiffEq.jl.

One of the first items , It was to support the work of the library with the latest version of Julia (at the time of writing this article v "1.1.0") and perform migrate from Knet and ForwardDiff libraries to the Flux library.

"$VERSION"

0.9s

Julia

"1.1.0"

For this, it was necessary to rewrite outdated constructions that no longer support language and replace them with new ones that are supporting on the current version of Julia. For the migration, in particular, I rewrited the function of inicialization a neural network, the function of the calculate derivitives of neural network and the algorithm of back propagation, using the elemets of the Flux library instead of ForwardDiff and Knet.

The following are parts of the code before and after migration.

Function of inicialization a neural network.

Flux Library allows you to initialize the neural network in one-two lines using Chain (more here). Also the initialization of the neural network was done by the user. The neuron network architecture can be specified in the task solver (sol=solve(prob, nnode(chain, opt),...), where chain = Flux.chain(hide_layer... hide_layer)).

#before
function predict(P, x)
    w, b, v = P
    h = sigm(w * x .+ b)
    return v * h
end
function init_params(ftype,hl_width;atype=KnetArray{Float32})
    P = Array{Any}(3)
    P[1] = randn(ftype,hl_width,1)
    P[2] = zeros(ftype,hl_width,1)
    P[3] = randn(ftype,1,hl_width)
end
#after
chain = Flux.Chain(Dense(1,5,σ),Dense(5,1))

The function of the calculate derivitives of neural network and loss function.

#before
function loss_trial(w, b, v, timepoints, f, p, phi)
    P = [w, b, v]
    dydx(P,x) = ForwardDiff.derivative(x -> phi(P, x), x)
    loss = sum(abs2, [dydx(P, x) - f(phi(P, x), p, x) for x in timepoints])
    return loss
end

#after
dfdx = t -> Tracker.gradient(t -> sum(phi(t)), t; nest = true)[1]
loss = () -> sum(abs2,sum(abs2,dfdx(t) .- f(phi(t)[1],p,t)[1]) for t in ts)

Train process the algorithm of back propagation.

Function Flux.train takes only one line instead of dozens of lines as it was done earlier. More info about Flux.train here.

#before
function train(P, prms, timepoints, f,p,phi)
    g = lossgradient(P,timepoints,f,p,phi)
    update!(P, g, prms)
    return P
end
function lossgradient(P, timepoints, f, p, phi)
    w, b, v = P
    loss∇w = ForwardDiff.gradient(w -> loss_trial(w, b, v, timepoints, f,p,phi), w)
    loss∇b = ForwardDiff.gradient(b -> loss_trial(w, b, v, timepoints, f,p,phi), b)
    loss∇v = ForwardDiff.gradient(v -> loss_trial(w, b, v, timepoints, f,p,phi), v)
    return (loss∇w, loss∇b, loss∇v)
end

#after
Flux.train!(loss, ps, data, opt; cb = cb)

The full code of the project NeuralNetDiffEq.jl can be found here.

Result

Further results and visualization of the updated code.

Firstly, to need to upload all the packages that are used.

using Pkg
Pkg.add("DiffEqBase")
Pkg.add("Flux")
Pkg.add("Plots")
Pkg.add("DifferentialEquations")
Pkg.add("NeuralNetDiffEq")

15.6s

Julia

using DiffEqBase
using Flux
using Plots
using DifferentialEquations
using NeuralNetDiffEq

377.6s

Julia

Solve the ODE: by the boundary condition on the interval, using NeuralNetDiffEq.jl. Uses a neural network with one input and output neuron, one hidden layer with five neuron, sigmoid activation function and ADAM optimizer.

linear = (u,p,t) -> cos(2pi*t)
tspan = (0.0f0, 2.0f0)
u0 = 0.0f0
prob = ODEProblem(linear, u0 ,tspan)
chain = Flux.Chain(Dense(1,5,σ),Dense(5,1))
opt = Flux.ADAM(0.1, (0.9, 0.95))
sol = solve(prob, nnode(chain,opt), dt=1/20f0, verbose = true,
            abstol=1e-10, maxiters = 500)

36.4s

Julia

retcode: Default Interpolation: 1st order linear t: 0.0f0:0.05f0:2.0f0 u: 41-element Array{Float32,1}: 0.0 0.051854 0.0923319 0.120906 0.137221 0.141158 0.132918 0.113121 0.0829268 0.044148 ⋮ -0.087464 -0.116428 -0.136476 -0.145993 -0.143779 -0.129005 -0.101169 -0.0600551 -0.00569224

In the figure below we can observe how the neural network is trained on each epoch observing to decrease the current loss function and comparing the analytical (y2) and intermediate numerical solution (y1).

Conclusion

Now I am working on the implementation of an algorithm for solve PDE using neural network[1] and backward stochastic differential equations (BSDEs)[2][3].

In the next blog, I will try to explain how Bsde is used to solve high dimensional Pde.

JSoC 2019: Migrate from Knet to Flux library.

Migration

Result

Conclusion

Reference