Blog #1 : Julia Seasons of Contributions [JSoC 2019]
Introduction
I have been selected for Julia Seasons of Contributions 2019 based on my project proposal to Google Summer of Code 2019.
My project is to implement Blossom V matching algorithm for computing minimum cost perfect matching in a general graph (graph needs not be bipartite). My project is mentored by Mathieu Besançon and Simon Schoelly.
This algorithm was given by Vladimir Kolmogorov in 2009 and is the current state of the art algorithm for computing minimum cost perfect matching in a general graph. Blossom V algorithm makes use of 2 key ideas which help it in achieving the best average running time amongst all the available implementation of Edmonds matching algorithm for weighted graphs. These ideas are :
- Making use of variable δ approach for updating dual variables.
- Making use of priority queues for selecting the edge with minimum slack.
Idea #1 was shown to be effective by Cook and Rohe [1] in their famous Blossom IV paper which remained the fastest available implementation for a long time. Idea #2 was shown to be effective by Mehlhorn and Schafer [2]. They made the use of concatenable priority queues and improved the worst case time complexity of Blossom IV from
Blossom V makes use of both the ideas together and thus achieves faster running time on an average. The worst case time complexity of Blossom V is also
Community Bonding Period [7th-27th May]
As proposed in my proposal, the plan for community bonding period was to go through the literature of matching algorithms. It is usually tough to find detailed discussion on matching algorithms in any textbook but there are some books which do this task fairly well. One of these is Matching Theory by M.D. Plummer and L. Lovasz. I also went through the paper of Blossom IV and Blossom V for one final time before the start of the coding period.
I also tried to understand author's original implementation of this algorithm written in C and was fairly successful in doing so.
Coding Period [1st week, 27th May-3rd June]
Although it was not a part of the plan but after discussion with my mentors I decided that it is worth implementing Edmonds maximum cardinality algorithm for unweighted graphs. The basic ideas of this algorithm are extended for the case of finding minimum cost perfect matching in a weighted graphs. A linear programming formulation of the problem is written which is solved with the use of combinatorial optimizations.
I am providing links to some of the great resources I found apart from the original paper from Edmonds [3] which helped me in understanding the algorithm .
- Video lectures by Prof. Shashank K. Mehta, Indian Institute of Technology, Kanpur.
- A short report covering theoretical analysis of the algorithm by Stanford University.
- Matching Theory by M.D. Plummer and L. Lovasz.
- An implementation of the same algorithm.
I would say it takes some efforts in thinking how to implement the algorithm even after being able to work in out on paper.
I will explain my implementation of the algorithm in this blog. I assume that the reader is aware about the basic terminologies used in the algorithm like what is a blossom, flower, stem etc. Also, I am assume that the reader understands the basic operations like shrinking, expanding a flower etc.
Explanation of the code.
Adding the dependencies.
using Pkg Pkg.add("LightGraphs") Pkg.add("SimpleTraits")
using LightGraphs, SimpleTraits
I would start with the vectors which store auxiliary information attached with a node in the graph. In my implementation, they are part of a structure named AuxStruct
.
mutable struct AuxStruct{T} mate :: Vector{T} parent :: Vector{T} base :: Vector{T} queue :: Vector{T} used :: Vector{Bool} blossom :: Vector{Bool} end
mate[i]
stores the vertex which is matched to vertex i.parent[i]
stores the parent of a vertex in the tree.base[i]
stores the base of the flower to which vertex i belongs. It equals i if vertex i doesn't belong to any flower.queue
is an array used in traversing the tree breadth first.used
isblossom
is boolean array to mark vertices in a blossom.
function max_cardinality_matching end function max_cardinality_matching(g::AG::(!IsDirected)) where {T<:Integer, AG <:AbstractGraph{T}} nvg = nv(g) neg = ne(g) aux = AuxStruct{T}(T[],T[],T[],T[],Bool[],Bool[]) #Initializing vectors aux.mate = fill(zero(nvg), nvg) aux.parent = fill(zero(nvg), nvg) sizehint!(aux.base, nvg) for i in one(nvg):nvg push!(aux.base, i) end aux.queue = fill(zero(nvg), 2*nvg) aux.used = fill(zero(nvg), nvg) aux.blossom = fill(zero(nvg), nvg) # Using a simple greedy algorithm to mark the preliminary matching to begin with the algorithm. # This speeds up the matching algorithm by several times. # Better heuristic based algorithm can be used which start with near optimal matching thus reducing the number of augmentating paths and thus speeding up the algorithm. for v in one(nvg):nvg if aux.mate[v]==zero(nvg) for v_neighbor in neighbors(g, v) if aux.mate[v_neighbor]==zero(nvg) aux.mate[v_neighbor] = v aux.mate[v] = v_neighbor break end end end end #Iteratively going through all the vertices to find an unmarked vertex for u in one(nvg):nvg if aux.mate[u]==zero(nvg) # If vertex u is not in matching. v = find_path!(aux, u, g) # Find augmenting path starting with u as one of end points. while v!=zero(nvg) # Alternate along the path from i to last_v (whole while loop is for that). pv = aux.parent[v] # Increasing the number of matched edges by alternating the edges in ppv = aux.mate[pv] # augmenting path. aux.mate[v] = pv aux.mate[pv] = v v = ppv end end end matched_edges = Vector{Vector{T}}() # Result vector containing end points of matched edges. for u in one(nvg):nvg if u<aux.mate[u] temp = Vector{T}() push!(temp,u) push!(temp,aux.mate[u]) push!(matched_edges,temp) end end result = (mate = aux.mate, matched_edges = matched_edges) return result end
So, we start by initializing all the vectors of the structure AuxStruct
.
Next I've implemented a very simple greedy algorithm to start with an initial matching to feed to the algorithm. Edmonds algorithm finds augmenting paths in a graph and flips the edges in the augmenting path to increase the number of matched edges. Now, the use of such greedy algorithm is that it is initially finding some matching in the graph and then feeding to the main algorithm. This matching will not be the best possible matching with maximum cardinality in general case but it will reduce the number of augmenting paths that the original Edmonds algorithm finds and hence it reduces the number of flipping operations that the algorithm has to do. Infact for a CompleteGraph(even number of vertices), the initialization is the correct final result and hence the main algorithm has to do no work at all. Suppose, we don't do this step of initialization, then there are no edges in matching initially and each edge will be added to the matching by the main algorithm which is very costly in terms of time complexity with a worst case complexity of
Better initialization algorithms making use of heuristics can be used which will be even more efficient and will start with a near optimal matching. I'll implement them post JSoC once I am finished implementing Blossom V.
Then, we iteratively go through all the vertices of the matching and find an unmarked vertex on which any of the incident edges is not part of the matching. We call the find_path!()
function which returns an augmenting path(if there is one) staring at that vertex and then flips the edges of the augmenting paths to increase the cardinality of the matching.
find_path!()
function is defined as follows:
function find_path!(aux, root, g) nvg = nv(g) fill!(aux.used, 0) fill!(aux.parent, zero(nvg)) for i in one(nvg):nvg aux.base[i] = i end aux.used[root] = one(nvg) qh = one(nvg) qt = one(nvg) aux.queue[qt] = root qt = qt + one(nvg) while qh<qt v = aux.queue[qh] qh = qh + one(nvg) for v_neighbor in neighbors(g, v) if (aux.base[v] == aux.base[v_neighbor]) || (aux.mate[v] == v_neighbor) continue end # The edge closes the cycle of odd length, i.e. a flower is found. # An odd-length cycle is detected when the following condition is met. # In this case, we need to compress the flower. if (v_neighbor == root) || (aux.mate[v_neighbor] != zero(nvg)) && (aux.parent[aux.mate[v_neighbor]] != zero(nvg)) #######Code for compressing the flower###### curbase = lowest_common_ancestor!(aux, v, v_neighbor, nvg) for i in one(nvg):nvg aux.blossom[i] = 0 end mark_path!(aux, v, curbase, v_neighbor, nvg) mark_path!(aux, v_neighbor, curbase, v, nvg) for i in one(nvg):nvg if aux.blossom[aux.base[i]] == one(nvg) aux.base[i] = curbase if aux.used[i] == 0 aux.used[i] = one(nvg) aux.queue[qt] = i qt = qt + one(nvg) end end end ############################################ # Otherwise, it is an "usual" edge, we act as in a normal breadth search. # The only subtlety - when checking that we have not visited this vertex yet, we must not look # into the array aux.used, but instead into the array aux.parent as it is filled for the odd visited vertices. # If we did not go to the top yet, and it turned out to be unsaturated, then we found an # augmenting path ending at the top v_neighbor, return it. elseif aux.parent[v_neighbor] == zero(nvg) aux.parent[v_neighbor] = v if aux.mate[v_neighbor] == zero(nvg) return v_neighbor end v_neighbor = aux.mate[v_neighbor] aux.used[v_neighbor] = one(nvg) aux.queue[qt] = v_neighbor qt = qt + one(nvg) end end end return zero(nvg) end
This function basically does the task of building a breadth first search tree with the help of which we find augmenting paths and a blossom.
Once we find a blossom, we need to shrink the blossom into a single pseudo vertex and then we continue building the breadth first tree. But if we find an augmenting path, we just return it instantly. Otherwise, even after making the breadth first tree, if we don't find any augmenting path then we just return. However, please note that if we found any blossom, then we are still compressing the blossom.
The code for compressing the flower is a bit involved and hence I would suggest that you draw some cases on paper to see how it is happening. The code is commented for better understanding.
Compression of flower uses two other functions. One is a function named mark_path!
which is as follows:
function mark_path!(aux, v, b, child, nvg) while aux.base[v] != b aux.blossom[aux.base[v]] = one(nvg) aux.blossom[aux.base[aux.mate[v]]] = one(nvg) aux.parent[v] = child child = aux.mate[v] v = aux.parent[aux.mate[v]] end return nothing end
This function on the way from the top to the base of the flower, marks true for the vertices in the array aux.blossom[] and stores the ancestors for the even nodes in the tree. The parameter child is the son for the vertex v itself (with the help of this parameter we close the loop in the ancestors).
The other function is lowest_common_ancestor!() which is
function lowest_common_ancestor!(aux, a, b, nvg) fill!(aux.used,false) ##Rise from vertex a to root, marking all even vertices while true a = aux.base[a] aux.used[a] = one(nvg) if aux.mate[a] == zero(nvg) break end a = aux.parent[aux.mate[a]] end ##Rise from the vertex b until we find the labeled vertex while true b = aux.base[b] if aux.used[b] == one(nvg) return b end b = aux.parent[aux.mate[b]] end end
Finally, time to see some results :
using LightGraphs g = SimpleGraph(11) add_edge!(g,1,2) add_edge!(g,2,3) add_edge!(g,3,4) add_edge!(g,4,1) add_edge!(g,3,5) add_edge!(g,5,6) add_edge!(g,6,7) add_edge!(g,7,5) add_edge!(g,5,8) add_edge!(g,8,9) add_edge!(g,10,11)
max_cardinality_matching(g)
g = SimpleGraph(12) add_edge!(g,1,2) add_edge!(g,1,3) add_edge!(g,3,2) add_edge!(g,5,2) add_edge!(g,6,2) add_edge!(g,4,2) add_edge!(g,9,7) add_edge!(g,8,7) add_edge!(g,8,11) add_edge!(g,9,10) add_edge!(g,9,8) add_edge!(g,10,11) add_edge!(g,10,12)
max_cardinality_matching(g)
Coding Period [2nd, 3rd, 4th week]
I have spent these weeks into experimenting with various parts of the Blossom V algorithm mainly fractional initialization and variable delta approach. After struggling a lot, I have been able to at least figure out pseudo code of various parts of the algorithm. I will write my next blog once I reach a logical end. I am currently working on finally unifying various parts and writing my final implementation.
This work is under development and I think my next blog will be on various initialization techniques for Blossom V namely greedy initialization and fractional initialization. Stay tuned for this.
References
- W. Cook and A. Rohe. Computing minimum-weight perfect matchings.INFORMS Journal on Computing, 11(2):138–148,February 1999. Computer code available at http://www2.isye.gatech.edu/ ̃ wcook/blossom4/.
- K. Mehlhorn and G. Sch ̈afer. Implementation of O(nmlogn) weighted matchings in general graphs: the power of data struc-tures.Journal of Experimental Algorithmics (JEA), 7:4, 2002