A Crash Course in HomotopyContinuation.jl
1. Introduction
HomotopyContinuation.jl is a Julia package for numerically solving systems of polynomial equations, such as the running example of this notebook,

f(x, y) = (x^2 + 2y, y^2 - 2) = 0.

In this Nextjournal notebook we walk you through the basics of the package. Nextjournal lets you try out and play with the code yourself: just navigate to REMIX and log in.
1.1. How does it work?
HomotopyContinuation.jl is a Julia implementation of the following computational paradigm, called homotopy continuation. The basic idea is as follows: suppose that F = (f_1, …, f_n) is a system of n polynomials in n variables. For computing the isolated solutions of F = 0, one takes a second system G = (g_1, …, g_n) with known zeros and deforms it into F, for instance along the straight-line homotopy

H(x, t) = t · G(x) + (1 − t) · F(x),

so that H(x, 1) = G(x) and H(x, 0) = F(x). Starting at t = 1 with a known zero of G, the equation H(x, t) = 0 implicitly defines a solution curve x(t), which is tracked numerically towards t = 0; its endpoint x(0) is a zero of F. In the following, a path will always mean a solution path in the above sense.
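The tracking loop behind this paradigm can be sketched in a few lines. The following is a deliberately naive illustration in Python (not the package's actual algorithm, which uses adaptive predictor–corrector schemes): it tracks the zeros of a start polynomial g(x) = x² − 1 to the zeros of the target f(x) = x² − 2 along the straight-line homotopy, using only a Newton corrector at each step.

```python
import math

def track(f, df, g, dg, x, steps=100, newton_iters=5):
    """Track a zero of g to a zero of f along H(x, t) = t*g(x) + (1 - t)*f(x)."""
    for k in range(steps):
        t = 1 - (k + 1) / steps                  # move t from 1 towards 0
        for _ in range(newton_iters):            # Newton corrector on H(., t)
            H = t * g(x) + (1 - t) * f(x)
            dH = t * dg(x) + (1 - t) * df(x)
            x -= H / dH
    return x

f, df = lambda x: x ** 2 - 2, lambda x: 2 * x    # target system
g, dg = lambda x: x ** 2 - 1, lambda x: 2 * x    # start system with zeros ±1
roots = [track(f, df, g, dg, x0) for x0 in (1.0, -1.0)]
print(roots)                                     # the two paths end near ±sqrt(2)
```

Each known zero of g is the start of one path; the endpoints are the zeros ±√2 of f.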
1.2. Installation
To use HomotopyContinuation.jl you have to install Julia 1.1. You can download Julia at julialang.org (please check the platform-specific instructions). Once Julia is installed, you can install HomotopyContinuation.jl through the package manager.
```julia
import Pkg
Pkg.add("HomotopyContinuation");
```
We recommend also installing the DynamicPolynomials package for handling multivariate polynomials.
```julia
Pkg.add("DynamicPolynomials");
```
1.3. A first example
The most important function of HomotopyContinuation.jl is the solve function. For solving a system of equations f = 0 one simply writes solve(f). For example, solving

f(x, y) = (x^2 + 2y, y^2 - 2) = 0

goes as follows:
```julia
# load the package into the current Julia session
using HomotopyContinuation
# declare the variables x and y
@polyvar x y
# define f
f = [x^2 + 2y, y^2 - 2]
# solve f
result = solve(f)
```
Note that the printed info tells us that 4 paths have been tracked. Those paths are the solution paths described above, which have been approximately followed.
1.4. Start systems
The code above started the following process in the background: first, the start system

g(x, y) = (x^2 - 1, y^2 - 1)

together with its four solutions (±1, ±1) was generated; then these solutions were tracked towards f along the homotopy H(x, t) = t · g(x) + (1 − t) · f(x). So, essentially, the code above was a shortcut for the following.
```julia
@polyvar x y
f = [x^2 + 2y, y^2 - 2]
g = [x^2 - 1, y^2 - 1]
start_sols = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
result = solve(g, f, start_sols)
```
As one can see, the command solve(g, f, start_sols) tracks the solutions in start_sols from g to f.
A theorem by Bézout says that the number of isolated (complex) solutions of a system of n polynomial equations in n variables is at most d_1 ⋯ d_n, where d_i is the degree of the i-th equation. In general, if no further information about f is given, solve(f) constructs the total-degree start system

g(x) = (x_1^{d_1} - 1, …, x_n^{d_n} - 1),

whose d_1 ⋯ d_n solutions are the tuples of roots of unity, and tracks them towards f.
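For the running example this bound explains the solver output: both polynomials have degree 2, so the total-degree homotopy tracks 2 · 2 = 4 paths. A one-line check (in Python, for illustration):

```python
from math import prod

degrees = [2, 2]      # deg(x^2 + 2y) = 2 and deg(y^2 - 2) = 2
paths = prod(degrees)
print(paths)          # → 4, the number of paths reported by solve(f)
```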
1.5. The output of solve(f)
The output of solve(f) stored in result is an array with four entries, one for each solution. Let us inspect the first entry of result:

```julia
result[1]
```
The meaning of those entries is as follows:
- return_code: success means that the computation was successful.
- solution is the solution that was computed.
- accuracy is an approximation of ‖x − x̃‖, where x is the computed solution and x̃ is the true solution.
- residual is the value of the euclidean norm of f(x), where x is the computed solution.
- condition_jacobian is the condition number of the Jacobian of f at the solution. A small value of this indicates that the numerical values are trustworthy.
- path_number: the number of the path in the enumeration of all paths.
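As a sanity check, the residual of a reported solution can always be recomputed by hand: substitute the solution into f and take the euclidean norm. A sketch in Python, using one real zero of the example system, (x, y) = (2^{3/4}, −√2):

```python
import math

def f(x, y):
    """the example system f = [x^2 + 2y, y^2 - 2]"""
    return (x ** 2 + 2 * y, y ** 2 - 2)

x, y = 2 ** 0.75, -2 ** 0.5      # an exact real zero of f
residual = math.hypot(*f(x, y))  # euclidean norm of f at the solution
print(residual)                  # ≈ 0 up to floating-point roundoff
```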
We can also get the solutions without any additional information:

```julia
solutions(result)
```
1.6. Real solutions
We get the real solutions from result by

```julia
realsolutions(result)
```
Observe that realsolutions(result) returns an array of real solutions, not the PathResults containing the additional information. We can get the PathResults of all real solutions by

```julia
real(result)
```
2. Groups of variables
Often, polynomial systems have structure that one can exploit. The most basic structure is when the variables appear in groups with different degrees. In this case the solution count from Bézout's theorem is too coarse, and by using the standard total-degree start system one tracks many more paths than there are solutions. Declaring variable groups lets solve use a multi-homogeneous start system that respects this structure and tracks fewer paths.
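The difference between the two counts can be made concrete. The multi-homogeneous Bézout number for variable groups of sizes k_1, …, k_m is the coefficient of z_1^{k_1} ⋯ z_m^{k_m} in ∏_i (d_{i1} z_1 + ⋯ + d_{im} z_m), where d_{ij} is the degree of the i-th equation in the j-th group. The sketch below (Python, for illustration) evaluates this for the Lagrange system of the next subsection, with groups (x, y, z) and (λ,):

```python
from itertools import product
from math import prod

def multi_bezout(deg_rows, group_sizes):
    """Coefficient of z1^k1 * ... * zm^km in prod_i (sum_j d_ij * z_j)."""
    m = len(group_sizes)
    total = 0
    # expand the product by choosing one variable group per factor
    for choice in product(range(m), repeat=len(deg_rows)):
        if all(choice.count(j) == k for j, k in enumerate(group_sizes)):
            total += prod(deg_rows[i][j] for i, j in enumerate(choice))
    return total

# degrees of the Lagrange system ∇L = 0 (L = J - λ*g with deg J = deg g = 4)
# with respect to the variable groups ((x, y, z), (λ,)):
rows = [(3, 1), (3, 1), (3, 1), (4, 0)]
classical = prod(sum(r) for r in rows)   # total-degree Bezout count
multihom = multi_bezout(rows, (3, 1))    # 2-homogeneous count
print(classical, multihom)               # → 256 108
```

The 2-homogeneous count, 108, matches the number of paths tracked with variable groups below, against 256 paths for the total-degree start system.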
2.1. Lagrange multipliers
As an example, consider the following optimization problem:

minimize 3x³y + y²z² − 2xy − 4xz³  subject to  x⁴ + y⁴ + z⁴ = 1.

The strategy to find the global optimum is to use the method of Lagrange multipliers to find all critical points of the objective function subject to the equality constraint. We start by defining our Lagrangian.
```julia
@polyvar x y z
J = 3x^3*y + y^2*z^2 - 2x*y - 4x*z^3
g = x^4 + y^4 + z^4 - 1
# introduce an auxiliary variable λ for the Lagrangian
@polyvar λ
# define the Lagrangian
L = J - λ * g;
```
In order to compute all critical points we have to solve the square system of equations ∇L(x, y, z, λ) = 0. For this we first compute the gradient of L and then use the solve routine to find all critical points.
```julia
import DynamicPolynomials: differentiate
∇L = differentiate(L, [x, y, z, λ])
# now we solve the polynomial system ∇L = 0
result = solve(∇L)
```
However, the system ∇L = 0 has a special structure: it is linear in the Lagrange multiplier λ. We can exploit this by declaring the variable groups (x, y, z) and (λ,):

```julia
result = solve(∇L, variable_groups = [(x,y,z), (λ,)], show_progress = false)
```

(show_progress = false suppresses the progress bar.)
We see that out of the 108 (complex) critical points only 22 are real. In order to find the global minimum we can now evaluate the objective at all real solutions and pick the one where the minimum is attained.
```julia
# now we simply evaluate the objective J and find the minimum
reals = realsolutions(result)
minval, minindex = findmin(map(s -> J(s[1:3]), reals))
minarg = reals[minindex][1:3]
```
3. Parameter homotopies
In practice one often encounters families of polynomial systems

F(x; u) = (f_1(x; u), …, f_n(x; u))

depending on a vector of parameters u. Instead of solving F(x; u) = 0 from scratch for every parameter value of interest, one can solve the system once for a random u₀ and then track the solutions towards the target value u₁ along

H(x, t) = F(x; t · u₀ + (1 − t) · u₁)

(this is called a parameter homotopy, because it is a straight-line homotopy in the parameters). The parameters are allowed to be complex; choosing the start parameters u₀ randomly in ℂᵐ guarantees, with probability one, that the paths stay well defined.
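A stripped-down illustration of this idea (in Python; not the package's tracker): for the one-parameter family F(x; u) = x² − u we solve at a complex start parameter u₀ and then track both solutions to the target u₁ = 4 along the straight line in parameter space.

```python
# parameter-homotopy sketch: track the zeros of F(x; u) = x^2 - u from the
# start parameter u0 to the target u1 along H(x, t) = F(x; t*u0 + (1-t)*u1)
def track_parameter(x, u0, u1, steps=200, newton_iters=5):
    for k in range(steps):
        t = 1 - (k + 1) / steps
        u = t * u0 + (1 - t) * u1      # current point on the parameter segment
        for _ in range(newton_iters):  # Newton corrector for x^2 - u = 0
            x = x - (x * x - u) / (2 * x)
    return x

u0 = 1.5 + 0.5j                        # complex start parameter
start_solutions = [u0 ** 0.5, -(u0 ** 0.5)]
u1 = 4.0                               # target parameter
targets = [track_parameter(x, u0, u1) for x in start_solutions]
print(targets)                         # the two square roots of 4: ≈ ±2
```

Solving at u₀ is the expensive step; tracking to any further u₁ only costs one path per solution.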
3.1. The 6R inverse problem
Here is an example of using parameter homotopies: consider a robot that consists of 7 links connected by 6 joints. The first link is fixed and the last link has a "hand".
The problem of determining the position of the hand from a given arrangement of the joints is called the forward problem.
The problem of determining an arrangement of the joints that realizes a fixed position of the hand is called the inverse problem.

Let us denote by z₁, …, z₆ the unit vectors pointing in the directions of the six joint axes. Consecutive axes zᵢ and zᵢ₊₁ span a fixed angle αᵢ, i.e. zᵢ ⋅ zᵢ₊₁ = cos αᵢ, and the position p of the hand is a polynomial expression in the zᵢ (built from cross and scalar products, with coefficients a determined by the geometry of the links). In this notation the forward problem consists of computing p from the zᵢ, and the inverse problem consists of computing vectors zᵢ that realize a given p. Assume that z₁ and z₆ are fixed. We set up the inverse problem for the position p = [1, 1, 0] with randomly chosen angles and link coefficients.
```julia
using HomotopyContinuation, LinearAlgebra, DynamicPolynomials
# initialize the variables
@polyvar z[1:6, 1:3]
@polyvar p[1:3]
α = randn(5)
a = randn(9)
# define the system of polynomials
f = [z[i,:] ⋅ z[i,:] for i = 2:5]
g = [z[i,:] ⋅ z[i+1,:] for i = 1:5]
h = sum(a[i] .* (z[i,:] × z[i+1,:]) for i = 1:3) +
    sum(a[i+4] .* z[i,:] for i = 2:5)
F′ = [f .- 1; g .- cos.(α); h .- p]
# assign values to z₁ and z₆
z₁ = normalize!(randn(3))
z₆ = normalize!(randn(3))
F = [subs(f, z[1,:] => z₁, z[6,:] => z₆, p => [1, 1, 0]) for f in F′];
```
Now we can just pass F to solve in order to compute all solutions.

```julia
solve(F, show_progress = false)
```
We find 16 solutions, which is indeed the correct count. However, a closer look reveals that the equations are bi-homogeneous with respect to the variable groups {z₂, z₄} and {z₃, z₅}. Declaring these variable groups reduces the number of tracked paths:

```julia
variable_groups = [[z[2,:]; z[4,:]], [z[3,:]; z[5,:]]];
solve(F; variable_groups = variable_groups, show_progress = false)
```
Now assume that we do not only want to solve the inverse problem for one value of p but for many of them. This is where parameter homotopies pay off: we solve the system once for a random complex parameter (the offline phase) and then track only the resulting 16 solutions to each parameter of interest (the online phase). Let's start the offline phase by computing a random complex instance.
```julia
p_rand = randn(ComplexF64, 3)
F_rand = [subs(f, z[1,:] => z₁, z[6,:] => z₆, p => p_rand) for f in F′]
R_rand = solve(F_rand, variable_groups = variable_groups, show_progress = false)
```
Now we can start the online phase and solve for a specific value of p, say q = [2, 3, 4]:
```julia
F̂ = [subs(f, z[1,:] => z₁, z[6,:] => z₆) for f in F′]
q = [2, 3, 4]
solve(F̂, solutions(R_rand); parameters = p, start_parameters = p_rand, target_parameters = q)
```
And we obtain 16 new solutions, but this time we only needed to track 16 paths. For even more performance improvements you can take a look at our guide on solving many systems in a loop.
4. The monodromy method
An alternative to using the solve function is solving a polynomial system F(x; u) = 0 with the monodromy method. This is often faster, but requires one solution as input. Suppose x₀ is a solution of F(x; u₀) = 0. The monodromy method moves the parameters in a loop that starts and ends at u₀, while tracking x₀ along that loop. The tracked path ends at a (possibly different) solution of F(x; u₀) = 0, and repeating this with new loops typically generates further solutions until the whole solution set has been found.
The general syntax for this is monodromy_solve(F, [x₀], u₀, parameters = u).
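The mechanism can be seen in miniature (a Python illustration, not the package's implementation): for F(x; u) = x² − u, moving u once around the origin and tracking a known solution along the loop lands on the other square root, so one solution generates the second.

```python
import cmath

def track_loop(x, u0, steps=400, newton_iters=5):
    """Track the solution x of x^2 = u as u travels the loop u0 * e^(2*pi*i*t)."""
    for k in range(steps):
        u = u0 * cmath.exp(2j * cmath.pi * (k + 1) / steps)
        for _ in range(newton_iters):    # Newton corrector for x^2 - u = 0
            x = x - (x * x - u) / (2 * x)
    return x

u0 = 2.0
x0 = cmath.sqrt(u0)        # one known solution of x^2 = u0
x1 = track_loop(x0, u0)    # tracking around the loop yields the other one
print(x0, x1)              # ≈ 1.41421 and -1.41421
```

The loop permutes the solution set; collecting the endpoints of enough loops recovers all solutions for the fixed parameter u₀.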
4.1. Method of Moments
Consider three Gaussian random variables X₁, X₂, X₃ with means μ₁, μ₂, μ₃ and variances σ²₁, σ²₂, σ²₃. A mixture of the three random variables is a random variable whose density is a₁φ₁ + a₂φ₂ + a₃φ₃, where the φᵢ are the densities of the Xᵢ and a₁ + a₂ + a₃ = 1. The method of moments recovers the parameters (a, μ, σ²) of the mixture from its moments mᵢ = E[Xⁱ]. Since we have 8 unknowns, we expect to need at least 8 moments to recover them; below we use m₀, …, m₈.
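The closed-form moment expressions entered below can be double-checked with the standard recursion E[Xⁿ] = μ·E[Xⁿ⁻¹] + (n − 1)·σ²·E[Xⁿ⁻²] for X ~ N(μ, σ²). A quick sanity check (Python, for illustration) of the fourth-moment formula μ⁴ + 6μ²σ² + 3σ⁴:

```python
def gaussian_moment(n, mu, sigma2):
    """E[X^n] for X ~ N(mu, sigma2), via the recursion
    E[X^n] = mu * E[X^(n-1)] + (n - 1) * sigma2 * E[X^(n-2)]."""
    if n == 0:
        return 1.0
    if n == 1:
        return mu
    return mu * gaussian_moment(n - 1, mu, sigma2) \
        + (n - 1) * sigma2 * gaussian_moment(n - 2, mu, sigma2)

mu, s2 = 0.7, 1.3
lhs = gaussian_moment(4, mu, s2)
rhs = mu ** 4 + 6 * mu ** 2 * s2 + 3 * s2 ** 2   # the m4 coefficients 1, 6, 3
print(lhs, rhs)                                  # both ≈ 9.1321
```

The moments of the mixture are then the a-weighted sums of these per-component moments, which is exactly how m0, …, m8 are built below.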
```julia
@polyvar a[1:3] μ[1:3] σ²[1:3]

m0 = a[1]+a[2]+a[3];
m1 = a[1]*μ[1]+a[2]*μ[2]+a[3]*μ[3];
m2 = a[1]*(μ[1]^2+σ²[1])+a[2]*(μ[2]^2+σ²[2])+a[3]*(μ[3]^2+σ²[3]);
m3 = a[1]*(μ[1]^3+3*σ²[1]*μ[1])+a[2]*(μ[2]^3+3*σ²[2]*μ[2])+
     a[3]*(μ[3]^3+3*σ²[3]*μ[3]);
m4 = a[1]*(μ[1]^4+6*σ²[1]*μ[1]^2+3*σ²[1]^2)+
     a[2]*(μ[2]^4+6*σ²[2]*μ[2]^2+3*σ²[2]^2)+
     a[3]*(μ[3]^4+6*σ²[3]*μ[3]^2+3*σ²[3]^2);
m5 = a[1]*(μ[1]^5+10*σ²[1]*μ[1]^3+15*μ[1]*σ²[1]^2)+
     a[2]*(μ[2]^5+10*σ²[2]*μ[2]^3+15*μ[2]*σ²[2]^2)+
     a[3]*(μ[3]^5+10*σ²[3]*μ[3]^3+15*μ[3]*σ²[3]^2);
m6 = a[1]*(μ[1]^6+15*σ²[1]*μ[1]^4+45*μ[1]^2*σ²[1]^2+15*σ²[1]^3)+
     a[2]*(μ[2]^6+15*σ²[2]*μ[2]^4+45*μ[2]^2*σ²[2]^2+15*σ²[2]^3)+
     a[3]*(μ[3]^6+15*σ²[3]*μ[3]^4+45*μ[3]^2*σ²[3]^2+15*σ²[3]^3);
m7 = a[1]*(μ[1]^7+21*σ²[1]*μ[1]^5+105*μ[1]^3*σ²[1]^2+105*μ[1]*σ²[1]^3)+
     a[2]*(μ[2]^7+21*σ²[2]*μ[2]^5+105*μ[2]^3*σ²[2]^2+105*μ[2]*σ²[2]^3)+
     a[3]*(μ[3]^7+21*σ²[3]*μ[3]^5+105*μ[3]^3*σ²[3]^2+105*μ[3]*σ²[3]^3);
m8 = a[1]*(μ[1]^8+28*σ²[1]*μ[1]^6+210*μ[1]^4*σ²[1]^2+420*μ[1]^2*σ²[1]^3+105*σ²[1]^4)+
     a[2]*(μ[2]^8+28*σ²[2]*μ[2]^6+210*μ[2]^4*σ²[2]^2+420*μ[2]^2*σ²[2]^3+105*σ²[2]^4)+
     a[3]*(μ[3]^8+28*σ²[3]*μ[3]^6+210*μ[3]^4*σ²[3]^2+420*μ[3]^2*σ²[3]^3+105*σ²[3]^4);

f = [m0, m1, m2, m3, m4, m5, m6, m7, m8]
```
Let us consider the following moments:

```julia
p₀ = [1; -1; 3; -5.5; 22.45; -50.75; 243.325; -635.725; 3420.7375]
```

Solving f − p₀ = 0 with a total-degree start system would mean tracking a very large number of paths; the Bézout number of the system is

```julia
bezout_number(f - p₀)
```

Yet Améndola, Faugère and Sturmfels showed that the number of (complex) solutions of the polynomial system is only 1350. Because of this discrepancy, we solve the system with the monodromy method instead.
First, we generate a start solution:
```julia
a₀ = randn(ComplexF64, 3)
a₀ = a₀ / sum(a₀)
μ₀ = rand(ComplexF64, 3)
σ²₀ = rand(ComplexF64, 3)
start_sol = [a₀; μ₀; σ²₀]
q₀ = [m([a; μ; σ²] => start_sol) for m in f]
```
which we can use as input data for the monodromy method, with q₀ being the parameters. Here it doesn't matter that the complex parameters have no statistical interpretation: they only serve to solve one specific instance of the polynomial system.
```julia
@polyvar p[1:9]
R = monodromy_solve(f - p, start_sol, q₀, parameters = p; target_solutions_count = 1350)
```
Now, we can track the solutions from q₀ to p₀:

```julia
R2 = solve(f - p, solutions(R), parameters = p, start_parameters = q₀,
           target_parameters = p₀, show_progress = false)
```
Let us keep the real solutions for which the variances are positive:
all_real_sols = realsolutions(R2) true_real_solutions = filter(s -> all(s[7:9] .> 0), all_real_sols);
There are 12 of those, which come in groups of 6, because the symmetric group S₃ acts on the solutions by permuting the three mixture components, i.e. by simultaneously relabeling the entries of a, μ and σ²:

```julia
S₃ = SymmetricGroup(3)
relabeling = GroupActions(v -> map(p -> (v[1:3][p]..., v[4:6][p]..., v[7:9][p]...), S₃))
mults = multiplicities(true_real_solutions, group_action = relabeling)
```
Each of the two vectors in mults represents a group orbit. For instance, the first vector in mults contains all the indices i such that true_real_solutions[i] is contained in the first orbit. Then,

```julia
i, j = mults[1][1], mults[2][1]
true_real_solutions[i], true_real_solutions[j]
```

are the parameters of the two Gaussian mixtures that give our moments q₀.
The group action can also be exploited in the monodromy method itself: we then track only one point per orbit, thus reducing the complexity of the computation:

```julia
R_with_group_action = monodromy_solve(f - p, start_sol, q₀, parameters = p,
                                      group_action = relabeling; target_solutions_count = 225)
```
The full 1350 solutions can be returned as

```julia
vcat([relabeling(s) for s in solutions(R_with_group_action)]...)
```
Then, we can proceed with the parameter homotopy as above.
5. What's more?
By now you should have a solid foundation for using HomotopyContinuation.jl. If you are eager to learn more, check out our website.