by Sascha TimmeMay 06 2019 UTC

A Crash Course in HomotopyContinuation.jl

1. Introduction

HomotopyContinuation.jl is a Julia package for numerically solving systems of polynomial equations like.

In this Nextjournal notebook we will walk you through the basics of the package. Nextjournal makes it possible that you can try and play with the code. Just navigate to REMIX and log in.

1.1. How does it work?

HomotopyContinuation.jl is a Julia implementation of the following computational paradigm, called homotopy continuation. The basic idea is as follows: suppose that

f(x1,,xn)=[f1(x1,,xn)fm(x1,,xn)].f(x_1,\ldots,x_n) = \begin{bmatrix} f_1(x_1,\ldots,x_n) \\ \vdots \\ f_m(x_1,\ldots,x_n) \end{bmatrix}.

is a system of polynomials. For computing the isolated solutions of one takes another system

g(x1,,xn)=[g1(x1,,xn)gm(x1,,xn)].g(x_1,\ldots,x_n) = \begin{bmatrix} g_1(x_1,\ldots,x_n) \\ \vdots \\ g_m(x_1,\ldots,x_n) \end{bmatrix}.

with a known zeroIn the space of polynomial systems and are connected by a path with and (also called a homotopy, hence the name). The idea in homotopy continuation is to approximately follow the solution path defined by For this, the path is discretized into time steps If the discretization is fine enough then the Newton operator of applied to converges to a zero of . Once this zero is approximated well enough, we may repeat the procedure for and so on. In the end, we will find an approximate zero for the system .

In the following, a path will always mean a solution path in the above sense.

1.2. Installation

For using HomotopyContinuation.jl you have to install Julia 1.1. You can download Julia at julialang.org (please check the platform specific instructions).

Once Julia is installed, one can install HomotopyContinuation.jl through the package manager.

import Pkg
Pkg.add("HomotopyContinuation");

We recommend to also install the DynamicPolynomials package for handling multivariate polynomials.

Pkg.add("DynamicPolynomials");

1.3. A first example

The most important function of HomotopyContinuation.jl is the solve function. For solving a system of equations f=0 one simply writes solve(f).

For example, solving with

f=[x2+2yy22]f=\begin{bmatrix}x^2+2y \\ y^2-2 \end{bmatrix}

goes as follows

# load the package into the current Julia session
using HomotopyContinuation 

# declare the variables x and y
@polyvar x y; 

# define f
f = [x^2 + 2y, y^2 - 2]

# solve f
result = solve(f) 
Result with 4 solutions ================================== • 4 non-singular solutions (2 real) • 0 singular solutions (0 real) • 4 paths tracked • random seed: 660479

Note that the printed info tells us that 4 paths have been tracked. Those paths are the solutions paths from above that have been approximately followed.

1.4. Start systems

The code above started the following process in the background: first the system

g(x,y)=[x21y21]g(x,y) = \begin{bmatrix} x^2 - 1\\ y^2 - 1\end{bmatrix}

together with its solutions was initialized. Then, each of the solutions was tracked towards along the homotopy

H((x,y),t)=γtg(x,y)+(1t)f(x,y),H((x,y),t) = \gamma \, t\, g(x,y) + (1-t) \,f(x,y),

where is a random complex number (such a homotopy is called a straight-line homotopy, and using increases the stability of the computation).

So, essentially the code above was a shortcut for the following.

@polyvar x y

f = [x^2 + 2y, y^2 - 2]
g = [x^2 - 1, y^2 - 1]

start_sols = [
  [1,1], [1,-1], [-1,1], [-1,-1]
]

result = solve(g, f, start_sols) 
Result with 4 solutions ================================== • 4 non-singular solutions (2 real) • 0 singular solutions (0 real) • 4 paths tracked • random seed: 282418

As one can see, the command solve(g, f, start_sols) tracks the solutions in start_sols from g to f.

A theorem by Bezout says that the number of (complex) solutions of is at most four. Hence, by tracking the four paths corresponding to the four start solutions, we get all solutions of .

In general, if is a system of equations, in which the -th equation has degree , then solve(f) constructs the start system .

1.5. The output of solve(f)

result is an array with four entries, one for each solution.

Let us inspect the first entry of result:

result[1]
PathResult ================= • return_code: success • solution: Complex{Float64}[-2.51712e-32-1.68179im, 1.41421-3.69779e-32im] • accuracy: 3.029e-16 • residual: 4.441e-16 • condition_jacobian: 4.655e+00 • path_number: 1

The meaning of those entries is as follows:

  • return_code: success means that the computation was successful.

  • solution is the solution that was computed.

  • accuracy is an approximation of where is the computed solution and is the true solution.
  • residual is the value of the euclidean norm of , where is the computed solution.
  • condition_jacobian is the condition number of the Jacobian of at the solution. A small value of this indicates that the numerical values are trustworthy.
  • path_number the number of the path in the enumeration of all paths.

We can also get the solutions without any additional information:

solutions(result)
4-element Array{Array{Complex{Float64},1},1}: [-2.51712e-32-1.68179im, 1.41421-3.69779e-32im] [1.68179+1.54074e-33im, -1.41421-1.54074e-33im] [2.51712e-32+1.68179im, 1.41421-3.69779e-32im] [-1.68179-1.54074e-33im, -1.41421-1.54074e-33im]

1.6. Real solutions

We get the real solutions from result by

realsolutions(result)
2-element Array{Array{Float64,1},1}: [1.68179, -1.41421] [-1.68179, -1.41421]

Observe that realsolutions(result) returns an array of real solutions, and not the PathResults containing the additional information.

We can get the PathResults of all real solutions by

real(result)
2-element Array{PathResult{Array{Complex{Float64},1}},1}: • return_code: success • solution: Complex{Float64}[1.68179+1.54074e-33im, -1.41421-1.54074e-33im] • accuracy: 1.464e-16 • path_number: 2 • return_code: success • solution: Complex{Float64}[-1.68179-1.54074e-33im, -1.41421-1.54074e-33im] • accuracy: 1.464e-16 • path_number: 4

2. Groups of variables

Often, polynomial system have structure that one can exploit. The most basic structure is when variables appear in groups of different degrees. In this case the solution count by Bezout's theorem is too coarse, and by using the standard start system one might end up tracking way to many paths. Instead, we can use a theorem by Shafarevich for constructing more efficient homotopies.

2.1. Lagrange multipliers

As an example, consider the following optimization problem

minimize  3x3y+y2z22xy4xz3s.t.x4+y4+z4=1\text{minimize} \; 3x^3y+y^2z^2-2xy-4xz^3 \quad \text{s.t.} \quad x^4+y^4+z^4=1

The strategy to find the global optimum is to use the method of Lagrange multipliers to find all critical points of the objective function such that the equality constraint is satisfied. We start with defining our Lagrangian.

@polyvar x y z
J = 3x^3*y+y^2*z^2-2x*y-x*4z^3
g = x^4+y^4+z^4-1

# Introduce auxillary variable for Lagrangian
@polyvar λ

# define Lagrangian
L = J - λ * g;

In order to compute all critical points we have to solve the square system of equations

(x,y.z,λ)L=0\nabla_{(x,y.z,\lambda)}L = 0

For this we first compute the gradient of and then use the solve routine to find all critical points.

import DynamicPolynomials: differentiate
∇L = differentiate(L, [x, y, z, λ])

# Now we solve the polynomial system ∇L = 0
result = solve(∇L)
Result with 108 solutions ================================== • 108 non-singular solutions (22 real) • 0 singular solutions (0 real) • 256 paths tracked • random seed: 39564

However, the system is of degree 4 in and degree 1 in . We can track fewer paths, if we declare this structure!

result = solve(∇L, variable_groups = [(x,y,z), (λ,)], show_progress = false)
Result with 108 solutions ================================== • 108 non-singular solutions (22 real) • 0 singular solutions (0 real) • 108 paths tracked • random seed: 582878

(show_progress = false is for not printing the progress bar).

We see that out of the 108 (complex) critical points there are only 22 which are real. In order to find the global minimum we can now evaluate all real solutions and find the value where the minimum is attained.

# Now we simply evaluate the objective J and find the minimum
reals = realsolutions(result)
minval, minindex = findmin(map(s -> J(s[1:3]), reals))
minarg = reals[minindex][1:3]
3-element Array{Float64,1}: 0.6893448348668392 0.19072105130305433 0.9376180557378104

3. Parameter homotopies

In practice one often encounters families of polynomial systems

P={f(x,p)=(f1(x,p),,fn(x,p))pCm}.\mathcal{P} = \{f(x,p) = (f_1(x,p), \ldots, f_n(x,p)) \mid p \in \mathbb{C}^m\}.

depending on parameters, and the task of solving many instances for different parameters .

Instead of solving for every parameter we are interested in from scratch, it is more efficient to solve first a fixed system and then use the solutions of this as start solutions in the homotopy

H(x,t):=F(x,(1t)p+tq)H(x,t) := F(x, (1-t)p + tq)

(this is called a parameter homotopy, because it is a straight-line homotopy in the parameters). The parameters should be chosen randomly from the complex numbers, even if complex parameters have no immediate interpretation for the problem we are considering. The just serve as starting values!

3.1. The 6R inverse problem

Here is an example for using parameter homotopies: consider a robot that consists of 7 links connected by 6 joints. The first link is fixed and the last link has a "hand".

The problem of determining the position of the hand when knowing the arrangement of the joints is called forward problem.

The problem of determining any arrangement of joints that realized a fixed position of the hand is called backward problem.

Let us denote by the unit vectors that point in the direction of the joint axes. They satisfy the following polynomial equations in the variables and and .

z1z1=1z6z6=1,z1z2=cosα1,z5z6=cosα6,p=a1(z1×z2)++a5(z5×z6)+a6z2++a9z5.\begin{aligned} z_1 \cdot z_1 &= 1\\ \vdots\\ z_6 \cdot z_6 &= 1,\\ z_1\cdot z_{2} &= \cos \alpha_1,\\ \vdots\\ z_5\cdot z_{6} &= \cos \alpha_6,\\ p & = a_1 (z_1 \times z_2) + \cdots + a_5 (z_5 \times z_6) + a_6 z_2 + \cdots + a_9 z_5. \end{aligned}

The are the "twist angle" between joints, the are the "link length" between joint axes and the encodes the position of the hand. is the cross product in .

In this notation the forward problem consists of computing given the and and the backward problem consists of computing that realize some fixed (knowing and means that the position where the robot is attached to the ground and the position where its hand should be are fixed).

Assume that and are random unit vectors, and some random and . We compute all backward solutions. We start with setting up the system.

using HomotopyContinuation, LinearAlgebra, DynamicPolynomials

# initialize the variables
@polyvar z[1:6,1:3] p[1:3]
α = randn(5)
a = randn(9)

# define the system of polynomials
f = [z[i,:]  z[i,:] for i = 2:5]
g = [z[i,:]  z[i+1,:] for i = 1:5]
h = sum(a[i] .* (z[i,:] × z[i+1,:]) for i=1:3) +
    sum(a[i+4] .* z[i,:] for i = 2:5)
F′ = [f .- 1; g .- cos.(α); h .- p]

# assign values to z₁ and z₆
z₁ = normalize!(randn(3))
z₆ = normalize!(randn(3))
F = [subs(f, z[1,:]=>z₁, z[6,:]=>z₆, p=>[1, 1, 0]) for f in F′];

Now we can just pass F to solve in order to compute all solutions.

solve(F, show_progress = false)
Result with 16 solutions ================================== • 16 non-singular solutions (2 real) • 0 singular solutions (0 real) • 1024 paths tracked • random seed: 747026

We find 16 solutions, which is the correct number of solutions.

However, a closer look reveals that the equations are bi-homogenous with respect to the variable groups and . As before, we can exploit this fact to solve the system more efficiently

variable_groups=[[z[2,:]; z[4,:]], [z[3,:]; z[5,:]]];
solve(F; variable_groups=variable_groups, show_progress = false)
Result with 16 solutions ================================== • 16 non-singular solutions (2 real) • 0 singular solutions (0 real) • 320 paths tracked • random seed: 560122

Now assume that we do not only want to know solve the inverse problem for one value of but rather for many different positions of the hand. Instead of solving the system from scratch every time (and tracking 320 paths) we can first compute a set of 16 solutions with respect to a random complex set of parameters and then use these start solutions to compute the solutions for the specific parameters we are interested in. The first step is what we call the offline phase. The second step is the online phase.

Let's start the offline phase by computing a random complex instance.

p_rand = randn(ComplexF64, 3)
F_rand = [subs(f, z[1,:]=>z₁, z[6,:]=>z₆, p=>p_rand) for f in F′]
R_rand = solve(F_rand, variable_groups=variable_groups, show_progress=false)
Result with 16 solutions ================================== • 16 non-singular solutions (0 real) • 0 singular solutions (0 real) • 320 paths tracked • random seed: 390076

Now we can start the online phase and solve for our specific value :

 = [subs(f, z[1,:]=>z₁, z[6,:]=>z₆) for f in F′]
q = [2,3,4]
solve(, solutions(R_rand); parameters=p, start_parameters=p_rand, target_parameters=q)
Result with 16 solutions ================================== • 16 non-singular solutions (0 real) • 0 singular solutions (0 real) • 16 paths tracked • random seed: 62156

And we obtain 16 new solutions, but this time we only needed to track 16 paths. For even more performance improvements you can take a look at our guide regarding the solution of many systems in a loop.

4. The monodromy method

An alternative to using the solve function is solving a polynomial system by monodromy. This approach requires the user to provide at least one solution of . Here is the basic idea:

Suppose is a solution and that is a point in a family of polynomial systems , which is defined with parameters. The monodromy method consists of moving around in a loop starting and ending at the parameter while tracking along that loop. After one iteration usually one has found a new solution . This process is then repeated until some stopping criterion is fulfilled.

The general syntax for this is monodromy_solve(F, [x], u₀, parameters = u)

4.1. Method of Moments

Consider three Gaussian random variables with means and variances . The density of is

ϕi(x)=12π  e(xμi)22σi2.\phi_i(x) = \frac{1}{\sqrt{2\pi}}\; e^{-\frac{(x-\mu_i)^2}{2\sigma_i^2}}.

A mixture of the three random variables is the random variable with density

ψ(x)=a1ϕ1(x)+a2ϕ2(x)+a3ϕ3(x), for a1+a2+a3=1.\psi(x) = a_1\, \phi_1(x)\, + \,a_2 \,\phi_2(x)\, + \,a_3 \,\phi_3(x), \quad\text{ for } \quad a_1+a_2+a_3 =1.

The method of moments recoversfrom the moments

mk=xkψ(x)dx.m_k = \int_{-\infty}^\infty x^k \, \psi(x)\, \mathrm{d}x.

Since we have 8 unknowns, we expect to need at least 8 moments to recover. Let us set up a system for this in Julia.

@polyvar a[1:3] μ[1:3] σ²[1:3]

m0 = a[1]+a[2]+a[3];
m1 = a[1]*μ[1]+a[2]*μ[2]+a[3]*μ[3];
m2 = a[1]*(μ[1]^2+σ²[1])+a[2]*(μ[2]^2+σ²[2])+a[3]*(μ[3]^2+σ²[3]);
m3 = a[1]*(μ[1]^3+3*σ²[1]*μ[1])+a[2]*(μ[2]^3+3*σ²[2]*μ[2])+a[3]*
			(μ[3]^3+3*σ²[3]*μ[3]);
m4 = a[1]*(μ[1]^4+6*σ²[1]*μ[1]^2+3*σ²[1]^2)+a[2]*
			(μ[2]^4+6*σ²[2]*μ[2]^2+3*σ²[2]^2)+
      a[3]*(μ[3]^4+6*σ²[3]*μ[3]^2+3*σ²[3]^2);
m5 = a[1]*(μ[1]^5+10*σ²[1]*μ[1]^3+15*μ[1]*σ²[1]^2)+
      a[2]*(μ[2]^5+10*σ²[2]*μ[2]^3+15*μ[2]*σ²[2]^2)+
      a[3]*(μ[3]^5+10*σ²[3]*μ[3]^3+15*μ[3]*σ²[3]^2);
m6 = a[1]*(μ[1]^6+15*σ²[1]*μ[1]^4+45*μ[1]^2*σ²[1]^2+15*σ²[1]^3)+
      a[2]*(μ[2]^6+15*σ²[2]*μ[2]^4+45*μ[2]^2*σ²[2]^2+15*σ²[2]^3)+
      a[3]*(μ[3]^6+15*σ²[3]*μ[3]^4+45*μ[3]^2*σ²[3]^2+15*σ²[3]^3);
m7 = a[1]*(μ[1]^7+21*σ²[1]*μ[1]^5+105*μ[1]^3*σ²[1]^2+105*μ[1]*σ²[1]^3)+
      a[2]*(μ[2]^7+21*σ²[2]*μ[2]^5+105*μ[2]^3*σ²[2]^2+105*μ[2]*σ²[2]^3)+
      a[3]*(μ[3]^7+21*σ²[3]*μ[3]^5+105*μ[3]^3*σ²[3]^2+105*μ[3]*σ²[3]^3);
m8 = a[1]*
			(μ[1]^8+28*σ²[1]*μ[1]^6+210*μ[1]^4*σ²[1]^2+420*μ[1]^2*σ²[1]^3+105*σ²[1]^4)+
      a[2]*
			(μ[2]^8+28*σ²[2]*μ[2]^6+210*μ[2]^4*σ²[2]^2+420*μ[2]^2*σ²[2]^3+105*σ²[2]^4)+
      a[3]*
			(μ[3]^8+28*σ²[3]*μ[3]^6+210*μ[3]^4*σ²[3]^2+420*μ[3]^2*σ²[3]^3+105*σ²[3]^4)

f = [m0, m1, m2, m3, m4, m5, m6, m7, m8]

Let us consider the following moments:

p₀ =  [1; -1; 3; -5.5; 22.45; -50.75; 243.325; -635.725; 3420.7375]
9-element Array{Float64,1}: 1.0 -1.0 3.0 -5.5 22.45 -50.75 243.325 -635.725 3420.7375

Solving directly is difficult. The total number of paths is

bezout_number(f - p₀)
362880

yet, Amendola, Faugere and Sturmfels showed that the number of (complex) solutions of the polynomial system is 1350 (much less than 362880!). Instead of solving directly, we use monodromy for solving another instance . Then we move the 1350 computed solutions from to .

First, we generate a start solution:

a₀ = randn(ComplexF64, 3); a₀ = a₀ / sum(a₀)
μ₀ = rand(ComplexF64, 3)
σ²₀ = rand(ComplexF64, 3)
start_sol = [a₀; μ₀; σ²₀]

q₀ = [m([a; μ; σ²] => start_sol) for m in f]
9-element Array{Complex{Float64},1}: 0.9999999999999998 + 2.7755575615628914e-17im -0.16651460606110507 + 0.6202123490265717im -0.6837422792748448 + 0.1576463596179007im -1.1620634373631347 - 1.2340206502702429im 0.9066660227341531 - 2.7713922760504213im 3.6729385200431617 - 2.9246877664207243im 10.160498898249722 - 0.23890403836098034im 13.644591697970252 + 10.0855790446672im 17.207333055258218 + 26.955317643042594im

which we can use as input data for the monodromy method for q₀ being the parameters. Here, it doesn't matter that complex solutions have no interpretation as parameters. They only serve to solve one specific instance of the polynomial system.

@polyvar p[1:9]
R = monodromy_solve(f - p, start_sol, q₀, parameters=p; 
  									target_solutions_count = 1350)
MonodromyResult ================================== • 1350 solutions (0 real) • return code → success • 8286 tracked paths

Now, we can track the solutions from q₀ to p₀.

R2 = solve(f - p, solutions(R), parameters = p, 
  				 start_parameters=q₀, target_parameters = p₀, show_progress = false)
Result with 1350 solutions ================================== • 1350 non-singular solutions (36 real) • 0 singular solutions (0 real) • 1350 paths tracked • random seed: 387468

Let us keep the real solutions for which the variances are positive:

all_real_sols = realsolutions(R2)
true_real_solutions  = filter(s -> all(s[7:9] .> 0), all_real_sols);

There are 12 of those, which come in groups of 6, because the symmetric group acts on the solutions. We can filter this as follows

S₃ = SymmetricGroup(3)
relabeling = GroupActions(v -> map(p -> (v[1:3][p]..., v[4:6][p]..., v[7:9][p]...), S₃))
mults = multiplicities(true_real_solutions, group_action = relabeling)
2-element Array{Array{Int64,1},1}: [7, 8, 9, 10, 11, 12] [1, 2, 3, 4, 5, 6]

Each of the two vectors represents a group orbit. For instance, the first vector in mults contains all the indices i, such that true_real_solutions[i] is contained in the first orbit. Then,

i, j = mults[1][1], mults[2][1]
true_real_solutions[i], true_real_solutions[j]
([-0.396056, 0.0926208, 1.30343, 0.755685, -2.68724, -0.346632, 0.553889, 0.392889, 1.98223], [-1.74012, -0.182136, 2.92226, 0.184633, -1.60721, -0.33243, 1.11301, 0.0927855, 1.76594])

are the parameters of the two mixtures Gaussian that give our moments q₀.

The group action can also be exploited in the monodromy method itself. We can only track one point per orbit this reducing the complexity of the computation:

R_with_group_action = monodromy_solve(f - p, start_sol, q₀, parameters=p, 		
  																		group_action = relabeling; 
  																		target_solutions_count = 225)
MonodromyResult ================================== • 225 classes of solutions (modulo group action) (0 real) • return code → success • 3590 tracked paths

The full 1350 solutions can be returned as

vcat([relabeling(s) for s in solutions(R_with_group_action)]...)

Then, we can proceed with the parameter homotopy as above.

5. What's more?

Now, you should have obtained a solid foundation for using HomotopyContinuation.jl. But if you are eager to learn more, check our website

where you can find many more examples and guides that let you explore the full power of our package.