Why you shouldn't use Euler's method to solve ODEs

Hey, I'm Chris Rackauckas, the author of StochasticLifestyle.com and DifferentialEquations.jl, and I wanted to try out this new cloud code blogging tool offered by Next Journal. This article is run using Julia on cloud servers through this free service, an idea which I think is pretty neat!

In this quick article I'll use the tools in DifferentialEquations.jl to show why you shouldn't use Euler's method (or RK4) to solve ODEs. In general it's unreliable and usually slower than alternatives. We will investigate a chaotic model of random neural networks and show both of these properties.

pkg"update"
pkg"add Plots DifferentialEquations DiffEqDevTools"
pkg"precompile"
pkg"status"

For our setup we will write down the ODE function and calculate a reference solution using a 9th order integrator at very low error tolerances.

using Plots, DifferentialEquations, Random, LinearAlgebra

Random.seed!(1)

const g = 2.0
const N = 100
const T = 200.0
const J = g*randn(N,N)/sqrt(N)
h0 = randn(N)
tspan = (0.0,T)
r = Array{Float64}(undef,N)
function f(dh,h,p,t)
    dh .= h
    map!(tanh,r,h)
    BLAS.gemv!('N',1.,J,r,-1.0,dh)
end
prob = ODEProblem(f,h0,tspan)
sol = solve(prob,Vern8(),abstol=1/10^14,reltol=1/10^14,maxiters=10000000)
plot(sol,vars=(0,1))

Ballpark Demonstration

Let's do some ballpark benchmarking to see what the timing vs error looks like.

# Precompile
sol1 = solve(prob,Euler(),dt=0.1)
sol2 = solve(prob,Euler(),dt=0.01)
sol3 = solve(prob,Euler(),dt=0.001)
sol4 = solve(prob,Tsit5())
sol5 = solve(prob,Vern9(),abstol=1e-13,reltol=1e-13)

# Times
t1 = @elapsed sol1 = solve(prob,Euler(),dt=0.1)
t2 = @elapsed sol2 = solve(prob,Euler(),dt=0.01)
t3 = @elapsed sol3 = solve(prob,Euler(),dt=0.001)
t4 = @elapsed sol4 = solve(prob,Tsit5())
t5 = @elapsed sol5 = solve(prob,Vern9(),abstol=1e-13,reltol=1e-13)

times = """
    Times:
    Euler dt=0.1    $t1
    Euler dt=0.01   $t2
    Euler dt=0.001  $t3
    Tsit5           $t4
    Vern9           $t5
"""
println(times)

Here we see that the 5th order adaptive timestepping method Tsit5 is faster than the dt=0.1 Euler method. But how does it compare in terms of accuracy? Let's make an interactive plot to check against the reference solution.

plt = plot(sol1, vars=(0,1), label="Euler dt=0.1")

plot!(plt, sol2, vars=(0,1), label="Euler dt=0.01")

plot!(plt, sol3, vars=(0,1), label="Euler dt=0.001",
								 plotdensity = 10000)

plot!(plt, sol4, vars=(0,1), label="Tsit5 Default")

plot!(plt, sol5, vars=(0,1), label="Reference")

plt

You can click on the lines to turn them on and off. Tsit5 is hard to see since it's almost exactly on the reference

Work-Precision Diagrams

Work-precision diagrams are the central tool used for conveying the benchmark results of ODE solvers. They are the fundamental plot of many results in DiffEqBenchmarks.jl, the benchmarks of the DifferentialEquations.jl ecosystem. So let's explore the efficiency of Euler's method using work-precision diagrams.

Work-precision diagrams use the reference equation to calculate the error at the final time point and plots it against the amount of time it took to compute the solution. Let's make our ultra low tolerance Vern9 solution be our reference:

using DiffEqDevTools
test_sol = TestSolution(sol)
retcode: Success Interpolation: specialized 8th order lazy interpolation t: nothing u: nothing

By using a reference solution we can estimate the error or each solve and plot time against error. This fixes the issue that different methods take different amounts of steps and achieve different accuracy: by looking at a vertical in the plot you can simultaneously compare the timing of methods at the same accuracy. A line that is higher simply means that it takes longer to compute the solution for that amount of accuracy, while the slope of the line shows its time scaling behavior as you ask for more accuracy.

Let's build work-precision diagrams with the standard Runge-Kutta methods, and compare them to the optimized library codes. Our error estimator to start will be the error at the final time point. The following runs the solvers with various adaptive tolerances and dts to generate the diagram.

abstols = 1.0 ./ 10.0 .^(2:9); reltols = 1.0 ./ 10.0 .^(1.:8)

setups = [Dict(:alg=>Tsit5());Dict(:alg=>Vern9());Dict(:alg=>BS3());
          Dict(:alg=>Euler(),:dts=>1.0./2.0.^(1:length(reltols)));
          Dict(:alg=>RK4(),:adaptive=>false,:dts=>1.0./2.0.^(1:length(reltols)))]

wp = WorkPrecisionSet(prob,abstols,reltols,setups;
		 appxsol = test_sol,numruns=10,maxiters=100000)
plot(wp)

We get similar results if we also test the L2 error along the timeseries (the mean-squared error at 1000 evenly spaced points along the solution).

wp = WorkPrecisionSet(prob,abstols,reltols,setups;
appxsol = test_sol,numruns=10,maxiters=1000000,error_estimate=:L2,dense_errors=true)
plot(wp)

There's a general phonomena that is shown here. Higher order methods tend to be more efficient when you need less error. However, anything below 3rd order tends to generally be not useful, and for most problems it has been found that 5th order or above is the most efficient. The Euler method, being a first order discretization, converges far too slow and simply takes too many timesteps in order to bring the error down to satisfying levels. Even the "standard" RK4 method which is taught in a lot of classes is shown here to not be very efficient, at least in comparison to the methods commonly used in production-quality ODE solvers. Notice that due to the log scaling, Vern9 and Tsit5 are almost 10x faster that RK4 for get about 6 digits of accuracy! There's a lot of reasons for that, but you can browse Hairer's Solving Ordinary Differential Equations I for why different Runge-Kutta methods can be vastly different in efficiency (it has to do with choosing coefficients s.t. the principle truncation error term is minimized).

Conclusion

Whenever someone in 2019 uses more code to internally implement a quick and dirty solver instead of running standard codes, a baby sheds a tear. It's not just about error or speed (or even accuracy: libraries undergo lots of tests!), it's about how much error you get with a certain speed. Euler's method (and even RK4) just don't scale well if you need non-trivial error on most equations. Sure, it'll do well in some, but usually it's not that great. If you need your simulation to go faster, an easy fix is to just use a better method.

Instead write your code to work with a library, and play around with it to see what works, We can see that on a simple equation you can already get some pretty good speed gains, and when you go to harder equations like stiff ODEs the timing difference goes to >1000x due to stepsize restrictions for explicit solvers. And the extra features like adaptive timestepping, interpolations, event handling, etc. help out a lot too. Using these tools requires less code anyways, so it's generally a good idea. So yes, you should keep in mind the methods you learned in a numerical analysis class as the basis for what can be done, but recognize that those methods are generally not used because production ODE solvers do something quite a bit more complicated in order to efficiently be accurate.