How are finite elements, multigrid methods, ODE solvers, etc. all the same topic?
Teach a man to fish: we won't be going over pre-built domain specific PDE solvers, instead we will be going over the tools which are used to build PDE solvers.
While the basics of numerically solving PDEs is usually taught in mathematics courses, the way it is described is not suitable for high performance scientific computing. Instead, we will describe the field in a very practical "I want to compute things fast and accurately" style.
What is a PDE?
A partial differential equation (PDE) is a differential equation which has partial derivatives. Let's unpack that.
A differential equation describes a value (function) by how it changes. u' = f(u,p,t) gives you a solution u(t) by you inputing / describing its derivative. Scientific equations are encoded in differential equations since experiments uncovers laws about what happens when entities change.
A partial differential equation describes a value by how it changes in multiple directions: how it changes in the x vs y physical directions, and how it changes in time.
Thus spatial properties, like the heat in a 3D room at a given time, u(x,y,z,t) are described by physical equations which are PDEs. Values like how a drug is distributed throughout the liver, or how stress propogates through an airplane hull, are examples of phonomena described by PDEs.
You are interested in PDEs.
Part 1: Representations of PDEs as other mathematical problems: Learning by example: the Poisson Equation
This will either be an overview where I will reframe the most common method of solution, or your first walkthrough of a PDE solver!
The best way to solve a PDE is...
By converting it into another problem!
Generally, PDEs are converted into:
Linear systems: Ax = b find x.
Nonlinear systems: G(x) = 0 find x.
ODEs: u' = f(u,p,t), find u.
There are others:
S(tochastic) DEs: du = f(u,p,t)dt + g(u,p,t)dW, find mean(u).
... Yes experts, there are more but we will stick to the usual stuff.
Let's introduce some shorthand: ux=∂x∂u. The Poisson Equation is the PDE:
for x∈[0,1]. In one dimension:
b(x) is some known constant function (known as "the data"). Given the data (the second derivative), find u.
How do we solve this PDE?
First Choice: Computational Representation of a Function
First we have to choose how to computationally represent our continuous function u(x) by finitely many numbers. Let's start with the most basic way (and we'll revisit the others later!). Let Δx be some constant discretization size and let xi=iΔx. Then we represent our function u(x)≈ui=u(xi), i.e. we represent a continuous function by values on a grid (and we can assume some interpolation, like linear interpolation)
Central difference for 2nd derivative: uxx≈δux=Δxδ+ux−δ−ux
This gives the well-known central difference formula:
We just made two choices:
Represent our function by an evenly spaced grid of points
Represent the derivative by the central difference formula
Given these two choices, how can we re-write our PDE?
The Representation of Our PDE
Remember we want to solve Δu=bon x∈[0,1]
u(x)is now a vector of points ui=u(xi)
b(x)is now a vector of points bi=b(xi)$b_i = b(x_i)$
The second derivative is now the function ∂x∂2u(xi)=Δx2ui+1−2ui+ui−1
Thus we have a system of i equations:
What happens at i=0?
Translating back to ui=u(xi)=u(iΔx):
The last point is out of the domain! In order to solve this problem we have to impose boundary conditions. For example, let's add the following condition to our problem: u(0)=u(1)=0. Then the 0th point is determined: u0=0, and the 1st point is:
so we can solve it!
The Linear Representation of Our Derivative
Notice that if U=[u1u2⋮uN−1uN], then
This is a linear equation!
We know A, B is given to us, find U.
Let's walk through a concrete version with some Julia code now. Let's solve:
x=Δx:Δx:1-Δx# Solve only for the interior: the endpoints are known to be zero!
We solved the PDE Δu=b by transforming our functions into vectors of numbers U and B, transforming second derivative into a linear operator (a matrix) A, and solving AU=B using backslash.
Does this method generally apply? Pretty much.
Because derivatives are linear, when you discretize a function, the derivative operators become linear operators = matrices
Semilinear Poisson Equation
Now the right hand side is dependent on u! Let's choose the same discretization and the same representation of the derivatives. Then once again Δu=AU for the same matrix A. Now we get the equation:
Find the vector of U which satisfy this nonlinear system! If we redefine:
then we are looking for the vector U which causes G(U)=0.
tl;dr: Semilinear equations convert into nonlinear rootfinding problems
Semilinear Heat Equation
Discretize the function the same way as before. This once again makes uxx=AU. Thus letting Ut be the time derivative of each point in the vector, we get:
but since there's now only one coordinate, let the derivative be by time. Then we can write this as:
This is an ODE!
tl;dr: Time-dependent PDEs (can) convert into ODE problems!
Loose End: Higher Dimensions Do The Same Thing
Now let's look at Δu=uxx+uyy=b(x,y) on x∈[0,1],y∈[0,1].
In this case, you can let ui,j=u(xi,yi) where xi=iΔx and yi=iΔy. You can list out all of the ui,j into a vector U by lexicographic ordering:
for some A. Now solve AU=B
Part 1 Summary
To solve a PDE,
You choose a way to represent functions.
You choose a way to represent your derivative (and on the function representation, your derivative representation is a matrix!)
Then when you write out your PDE, you get one of the following problems:
Linear systems: Ax = b find x.
Nonlinear systems: G(x) = 0 find x.
ODEs: u' = f(u,p,t), find u.
Part 2: The Many Ways to Discretize
There are thus 4 types of packages in the PDE solver pipeline:
Packages with ways to represent functions as vectors of numbers and their derivatives as matrices
Packages which solve linear systems
Packages which solve nonlinear rootfinding problems
Packages which solve ODEs
In this part we will look at the many ways you can discretize a PDE.
Part 2.1: Packages to Represent Functions and Derivatives
There are four main ways to represet functions and derivatives as vectors:
Finite difference method (FDM): functions are represented on a grid. Packages: DiffEqOperators.jl (still developing)
Finite volume method (FVM): functions are represented by a discretization of its integral. Currently no strong generic package support.
Finite element method (FEM): functions are represented by a local basis. FEniCS.jl, JuliaFEM and JuAFEM.jl
Spectral methods: functions are represented by a global basis. FFTW.jl and ApproxFun.jl
Finite Difference method: DiffEqOperators.jl
DiffEqOperators.jl is part of the JuliaDiffEq ecosystem. It automatically develops lazy operators for finite difference discretizations (functions represented on a grid). For example, to represent uxx, we'd do:
This A is lazy: it acts A*u like it was a matrix but without ever building the matrix by overloading * and directly computing the coefficients. This makes it efficient, using O(1) memory while not having the overhead of sparse matrices!
This package also makes it easy to generate the matrices without much work. For example, let's get a 2nd order discretization of uxxxx:
Brief brief brief overview of finite element methods
Represent your function u(x)=∑iciφi(x) with some chosen basis φi(x).
"Matrix Assembly" = calculate the matrix representations of the derivatives in this function representation. The core of an FEM package is its matrix assembly tools.
Finite difference method is good if your domain is a square / hypercube. Finite element methods can solve PDEs on more complicated domains.
FEM Package 1: FEniCS.jl
FEniCS is a well-known finite element package for Python (fellow NumFOCUS project!). It lets you describe the kind of PDE you want to solve and what elements (basis functions) you want to discretize with in a DSL, and it handles the rest.
FEniCS.jl is a wrapper over FEniCS which is maintained by the JuliaDiffEq organization.
Pro: very full featured (since it's wrapping an existing package). Linear solvers are built in.
Con: not Julia-based, so it's missing a lot of the fancy Julia features (generic programming, arbitrary number types, etc.). Python loading is tricky.
FEM Package 2: JuliaFEM
JuliaFEM is an organization with a suite of packages for performing finite element discretizations. It focuses on FEM discretizations of physical PDEs and integrates with Julia's linear solver, nonlinear rootfinding, and DifferentialEquations.jl libraries to ease the full PDE solving process.
FEM Package 3: JuAFEM.jl
JuAFEM.jl is a FEM toolbox. It gives you functionality that makes it easier to write matrix assembly routines. The below is the heat equation example.
Like finite element methods, spectral methods represent a function in a basis:u(x)=∑iciφi(x) with some chosen basis φi(x)
"Spectral" usually refers to global basis functions. For example, the Fourier basis of sines and cosines.
Spectral Packages 1: FFTW.jl
Okay, this is not necessarily a spectral discretization package. However, it is a package to change from a pointwise representation of a function to a Fourier representation via a Fast Fourier Transform (FFT). Thus if you want to find the coefficients cki for
This corresponds to basically saying that sin(x) is represented in the Fourier basis as:
It almost got it, and you can see the slight discretization error. This goes away as you add more points. But now you can represent any periodic function!
Spectral Packages 2: ApproxFun.jl
ApproxFun.jl is a package for easily approximating functions and their derivatives in a given basis, making it an ideal package for building spectral discretizations of PDEs. It utilizes lazy representations of infinite matrices to be efficient and save memory. It uses a type system to make the same code easy to translate between different basis choices.
Representing a Function and its Derivative in the Fourier Basis
in the Fourier basis, and build a matrix representation of the second derivative in this basis:
Using these packages, you can easily translate your PDE functions to coefficient vectors and your derivatives to matrices:
Use spectral methods or finite difference methods for cases with "simple enough" boundary conditions and on simple (square) domains
Use finite element packages to discretize complex spatial domains
There are a lot more factors. Sometimes given discretizations are better/worse on given PDEs. There are mathematicians who spend their entire life researching the differences between discretization methods!
Our PDE functions are now vectors of coefficients. Our PDE derivatives are now matrices. We are left with one of the following to solve:
Linear systems: Ax = b find x.
Nonlinear systems: G(x) = 0 find x.
ODEs: u' = f(u,p,t), find u.
Each of these problems is a specific area of research in and of itself!
Part 3: Equation Solving
In order to do equation solving correctly, we must make note of one important fact:
The matrices which come out of a derivative discretization are often very sparse!
The finite difference matrix [1 -2 1] only has 3 non-zero values per row no matter what N is! This type of matrix is called tridiagonal.
For matrices which have more general semi-diagonal structures which are only non-zero near the center, they are called banded matrices.
Derivatives in Fourier space are diagonal!
Generic sparse matrices arise from FEM discretizations (since the derivatives only depend on nearby basis elements which are very few!).
Thus solving PDEs is more specifically equation solving with large numbers of equations and high sparsity with structure
Part 3.1: Solving Large Sparse Linear Systems
When solving AU=B where A arises from the derivative term in the PDE discretization, the structure of A determines what type of method and what pacakage to use.
If A is tridiagonal, then Julia's \ will use a specialized fast method.
If A is small enough, then \ is a multithreaded LU or QR factorization which is fine.
If A is a banded matrix (many spectral discertizations and finite difference methods), then uses BandedMatrix(A)\B from BandedMatrices.jl utilizes a fast LU/QR for this matrix type.
If A is a block banded matrix (blocking arises from lexicographical ordering when 2+ dimensional), then BlockBandedMatrix(A)\B from BlockBandedMatrices.jl utilizes a fast LU/QR for this matrix type.
If A is a small enough sparse matrix, then Julia's Base \ will use SuperLUMT, a multithreaded sparse LU which will be fast if it has enough memory (hence small enough).
If A is a large sparse matrix (or matrix-free), then iterative methods are required. For this, there is IterativeSolvers.jl and JuliaSmoothOptimizers/Krylov.jl
Iterative solvers repeatedly calculate A*U \ k in order to find a sequence such that Uk→UwhereAU=B.
Because iterative solvers only use multiplication, "matrix-free" operators, i.e. Julia types which just define *, can be used in these methods!
IterativeSolvers.jl has many methods which are specialized for different forms of sparsity structures. For example, cg is a fast method for symmetric positive definite matrices. The fallback method for a general A is gmres. Let's give it a try on some random AU=B problem.
Let's create a type that is the second order central difference derivative [1 -2 1] and solve the Poisson equation Δu=b with our matrix-free operator by using gmres instead of \. We could just use DiffEqOperators.jl's lazy matrix here, but let's build the full example from scratch:
LinearMaps.jl is a nice package for helping build matrix-free discretizations!
To solve the linear system AU=B faster, you can use a preconditioners to partial solve the linear problem and thereby present a simpler problem to the iterative method. This is a deep deep area of research and the best preconditioner for your problem is highly dependent on your PDE, so I'm just going to point to some of the best packages in the area:
These packages allow you to distibute a linear solve computation amongst a whole cluster mixed with GPUs! But fundamentally the techniques are the same.
Part 3.1 Recap
Linear solving requires specializing on the matrix type:
If small enough or the structure is known, special methods should be used
If large enough or no sparsity structure to explot, Krylov methods like gmres need to be used.
If iterative (Krylov) methods are used, then a good preconditioner can heavily speed up the solving.
\, Julia's LinearAlgebra special matrix types, BandedMatrices.jl, BlockBandedMatrices.jl, and IterativeSolvers.jl are your friends!
Part 3.2 Nonlinear Solvers
Recall that if your data is a nonlinear function of your unknown then your PDE discretization doesn't produce a linear system but instead a nonlinear system G(x)=0. There are a few packages for this case:
These have not been benchmarked against each other ( https://github.com/JuliaNLSolvers/NLsolve.jl/issues/159 go do it!) but have different characteristics.
But due to features, KINSOL.jl is the recommended one for now for PDEs, with NLsolve features coming soon
Quick Note: Don't use optimization routines to solve rootfinding problems
G(x)=0 is theoretically the same as finding the minimum of ∥G(x)∥ (which should be zero!). However, this is bad:
Optimization methods are generally rootfinding methods on the derivative, since an optima is found when the derivative is zero!
Optimization methods generally use higher order derivatives. For example, Newton's method for optimization uses the second derivative (the Hessian), instead of the first derivative (the Jacobian), so it's more error prone and more expensive.
Just don't do it.
where J(zn) is the Jacobian of G(zn). To solve this without needed inverses, you can rewrite this with dz=zn+1−zn and see that
Thus every step of Newton's method is actually a linear solving problem, where the matrix is the Jacobian of G. Since G is from our discretized PDE and all of the nonlinear behavior is local, the Jacobian of G has the same structure and sparsity pattern as the A from before! This means that:
Newton's method is essentially nonlinear solving via repeat linear solving. Almost all of the time is spent linear solving
All of the same tricks from before apply: we want to specialize on banded matrices, we want to use Krylov methods, and preconditioners, etc.
Part 3.2 Recap
To solve nonlinear systems G(x)=0, you either do:
Some simple iteration scheme (Picard iteration, Powell iterations, Anderson acceleration). These have a smaller area of convergence and converge slower, but do not require solving linear systems.
Some form of Newton's method. This requires solving linear systems.
Usually a Newton method is required. In this case, almost all of the cost comes from solving the linear systems. Right now, KINSOL in Sundials.jl is the best tool because of the flexibility it gives in the linear solver choices, though NLsolve.jl will be adding similar features shortly (PRs are in progress).
Part 3.3 ODE Solvers
Recall that if you leave a derivative un-discretized you get an ODE system. Example:
You could discretize this time derivative the same way as you did space, getting a two-dimensional system and solving a big linear/nonlinear equation. But there can be advantages to leaving it as an ODE:
The full linear/nonlinear system is much larger. If you have N points in space and M time points, then you have u as a size NM vector! This grows fast!
Newton's method requires a good starting point. For a huge system in time and space, initial values for u(xi,tj) can be hard to come by, and Newton's method can diverge.
Keeping one part undiscretized and solving the ODE can alieviate these problems.
Features of a PDE-derived ODE
A PDE-derived ODE tends to have the following features:
Stiffness. This is due to the CFL condition which does require ΔxkΔt<1for stability
Large sparse Jacobian. This is because of the matrices from the derivative terms!
What ODE methods are applicable to this case?
DifferentialEquations.jl has a wealth of unique options specifically for reducing the cost of integration on these types of problems.
Quick Note: Order of Integrator
FAQ: I did a second order discretization in space, does it make sense to use a higher than second order integrator in time?
Answer: Yes! The order of the integrator isn't just for reducing error. Higher order integrators can be more efficient! You can always incrase the error by taking larger time steps (increasing the tolernace) and thereby use less steps with a more efficient method.
Implicit and Rosenbrock ODE Integrators
These methods step by repeatedly solving nonlinear systems. Remember that solving nonlinear systems boils down to repeatedly solving linear systems. Thus for the ODE u′=f(u,p,t) with Jacobian J, an implicit ODE solver spends most of its time solving:
Since this J has the same characteristics as before, solving this efficiently requires specializing the linear solver to the sparsity pattern (bandedness, etc.) or using a good preconditioner in an iterative solver.
Good choices in DifferentialEquations.jl:
Example: Spectral time stepping for Heat Equation in Fourier Space
Pseudospectral Solving the Reaction-Diffusion Equation
Take the PDE
For some nonlinear u. In the Fourier basis, uxx=Au for a diagonal matrix u, so this is a nice way to solve the Heat Equation as above (by solving in Fourier space, and converting the solution back). We can solve a nonlinear equation by, whenever we need to apply the nonlinear f, transform u back to the origional coordinates and applying f on u before transforming back. This pseudospectral discretization is seen as:
gives an ODE for how Fu evolves, and from that we can recover the true solution via the inverse Fourier transform F−1(Fu)
Implicit-Explicit (IMEX) Integrators
But wait, we shouldn't solve this as one whole ODE. If we split the ODE into two parts:
then we notice that the first part is the linear part and has the stiff term, while the second part is nonlinear and (can be) non-stiff. If we only are implicit on the first part, we can use a linear solver instead of a nonlinear solver to significantly reduce the amount of work! Integrators which allow you to split and do two separate methods at the same time are called IMEX integrators.
In DifferentialEquations.jl, IMEX integrators to be aware of are SplitODEProblem
# By making it Diagonal, DiffEq internally specializes the linear solver
A more recent set of integrators are the exponential integrators. These methods do not require solving a linear system, thus avoiding the most costly calculation of the implicit methods. However, they must perform a Krylov version of matrix exponential times vector multiplications, w=exp(γA)v. Recent literature has been showing large performance gains of exponential integrators over traditional implicit methods. DifferentialEquations.jl is the first open source library in a high performance language to include exponential integrators, and a brand new (adaptive) Krylov-based approach for large sparse systems was just released as part of GSoC 2018. These methods will be continued to be improved but are ready for widespread use and benchmarking is to come.
Additional methods in DifferentialEquations.jl should be explored. The Runge-Kutta Chebyshev (RKC) methods are stabilized explicit methods which can solve stiff equations without linear solving. Options allow for swapping out the internal Newton method to Picard and Anderson iteration to not require linear solving. Passing a linsolve allows you to take control over the linear solver technique: make it an AMG-preconditioned GMRES. Please see the documentation for a full feature list.
Strong-Stability Presurving Runge-Kutta methods are "more stable" explicit methods for PDEs. Some PDEs, like hyperbolic PDEs, are not amenable to implicit solvers and thus require explicit solvers. SSPRK methods can increase the allowable stepsize to increase the efficiency!
Part 3.3 Recap
PDEs can also be solved by leaving an axis undiscretized and using an ODE solver. This can increase efficiency, stability, and reduce memory consumption. Classes of methods to utilize are:
Implicit and Rosenbrock(-W) Methods
Runge-Kutta Chebyshev Methods
IterativeSolvers.jl is compatible with matrix-free types via * overloads able to be used in the nonlinear and ODE solvers
BandedMatrices.jl and BlockBandedMatrices.jl are packages for specialization of linear solvers on common PDE matrix types
KINSOL from Sundials.jl, and soon NLsolve.jl.
Julia is a language which allows for zero-cost abstractions, letting one compile code specific to the application and compose packages without extra overhead
The native Julia methods of DifferentialEquations.jl also let you swap in all of the Julia tools for linear solvers and allows type-genericity (which allows the use of GPUs for example), allowing you to utilize the full arsenal of tools with these unique implementations.
Conclusion: Solving PDEs Takes Layers of Tools
You need discretization tooling, linear solvers, nonlinear solvers, and finally ODE solvers to build an efficient PDE solver.
We showed how you can use simple loops to write simple PDE solvers, but the efficient methods require using packages in order to get the latest and most efficient methods.
Many of the latest and most efficient methods only have implementations in Julia, and the JuliaDiffEq organization is committed to continuing the development of such toolling.
In the near future we with to provide a set of problem types and algorithms for common PDEs. Example: Reaction-Diffusion Equation
I am deeply indebted to every JuliaDiffEq contributor. I would like to especially those who have specifically been involved in developing and funding the large range of tools which have been demonstrated in this talk: