Tushar Sinha / Jun 26 2019
Remix of Julia by Nextjournal

Blog #1 : Julia Seasons of Contributions [JSoC 2019]

Introduction

I have been selected for Julia Seasons of Contributions 2019 based on my project proposal to Google Summer of Code 2019.

My project is to implement Blossom V matching algorithm for computing minimum cost perfect matching in a general graph (graph needs not be bipartite). My project is mentored by Mathieu Besançon and Simon Schoelly.

This algorithm was given by Vladimir Kolmogorov in 2009 and is the current state of the art algorithm for computing minimum cost perfect matching in a general graph. Blossom V algorithm makes use of 2 key ideas which help it in achieving the best average running time amongst all the available implementation of Edmonds matching algorithm for weighted graphs. These ideas are :

  1. Making use of variable δ approach for updating dual variables.
  2. Making use of priority queues for selecting the edge with minimum slack.

Idea #1 was shown to be effective by Cook and Rohe [1] in their famous Blossom IV paper which remained the fastest available implementation for a long time. Idea #2 was shown to be effective by Mehlhorn and Schafer [2]. They made the use of concatenable priority queues and improved the worst case time complexity of Blossom IV fromto. Also, on an average, [2] turned out to be faster than [1].

Blossom V makes use of both the ideas together and thus achieves faster running time on an average. The worst case time complexity of Blossom V is alsojust like Blossom IV but it is still faster than [2] on an average.

Community Bonding Period [7th-27th May]

As proposed in my proposal, the plan for community bonding period was to go through the literature of matching algorithms. It is usually tough to find detailed discussion on matching algorithms in any textbook but there are some books which do this task fairly well. One of these is Matching Theory by M.D. Plummer and L. Lovasz. I also went through the paper of Blossom IV and Blossom V for one final time before the start of the coding period.

I also tried to understand author's original implementation of this algorithm written in C and was fairly successful in doing so.

Coding Period [1st week, 27th May-3rd June]

Although it was not a part of the plan but after discussion with my mentors I decided that it is worth implementing Edmonds maximum cardinality algorithm for unweighted graphs. The basic ideas of this algorithm are extended for the case of finding minimum cost perfect matching in a weighted graphs. A linear programming formulation of the problem is written which is solved with the use of combinatorial optimizations.

I am providing links to some of the great resources I found apart from the original paper from Edmonds [3] which helped me in understanding the algorithm .

  1. Video lectures by Prof. Shashank K. Mehta, Indian Institute of Technology, Kanpur.
  2. A short report covering theoretical analysis of the algorithm by Stanford University.
  3. Matching Theory by M.D. Plummer and L. Lovasz.
  4. An implementation of the same algorithm.

I would say it takes some efforts in thinking how to implement the algorithm even after being able to work in out on paper.

I will explain my implementation of the algorithm in this blog. I assume that the reader is aware about the basic terminologies used in the algorithm like what is a blossom, flower, stem etc. Also, I am assume that the reader understands the basic operations like shrinking, expanding a flower etc.

Explanation of the code.

Adding the dependencies.

using Pkg
Pkg.add("LightGraphs")
Pkg.add("SimpleTraits")
using LightGraphs, SimpleTraits

I would start with the vectors which store auxiliary information attached with a node in the graph. In my implementation, they are part of a structure named AuxStruct .

mutable struct AuxStruct{T} 
    mate :: Vector{T}         
    parent :: Vector{T}      
    base :: Vector{T}        
    queue :: Vector{T}       
    used :: Vector{Bool}     
    blossom :: Vector{Bool}  
end
  1. mate[i] stores the vertex which is matched to vertex i.
  2. parent[i] stores the parent of a vertex in the tree.
  3. base[i] stores the base of the flower to which vertex i belongs. It equals i if vertex i doesn't belong to any flower.
  4. queue is an array used in traversing the tree breadth first.
  5. used is boolean array to mark used vertex.
  6. blossom is boolean array to mark vertices in a blossom.
2.7s
function max_cardinality_matching end
@traitfn function max_cardinality_matching(g::AG::(!IsDirected)) where {T<:Integer, AG <:AbstractGraph{T}}
    
    nvg = nv(g)
    neg = ne(g)
    
    aux = AuxStruct{T}(T[],T[],T[],T[],Bool[],Bool[])

    #Initializing vectors
    aux.mate = fill(zero(nvg), nvg)
                                
    aux.parent = fill(zero(nvg), nvg)
                                
    sizehint!(aux.base, nvg)
    @inbounds for i in one(nvg):nvg
        push!(aux.base, i) 
    end
                                
    aux.queue = fill(zero(nvg), 2*nvg)
                                
    aux.used = fill(zero(nvg), nvg)
                                                                         
    aux.blossom = fill(zero(nvg), nvg)                                        

                            
		# Using a simple greedy algorithm to mark the preliminary matching to begin with 			 the  algorithm. 
		# This speeds up the matching algorithm by several times. 
		# Better heuristic based algorithm can be used which start with near optimal 		       matching thus reducing the number of augmentating paths and thus speeding up         the algorithm. 
    
    @inbounds for v in one(nvg):nvg
        if aux.mate[v]==zero(nvg)
            for v_neighbor in neighbors(g, v)
                if aux.mate[v_neighbor]==zero(nvg)
                    aux.mate[v_neighbor] = v
                    aux.mate[v] = v_neighbor
                    break
                end
            end
        end
    end
    
    #Iteratively going through all the vertices to find an unmarked vertex                                    
    @inbounds for u in one(nvg):nvg
        if aux.mate[u]==zero(nvg)            # If vertex u is not in matching.
            v = find_path!(aux, u, g)        # Find augmenting path starting with u as one of end points.
            while v!=zero(nvg)               # Alternate along the path from i to last_v (whole while loop is for                   that).
                pv = aux.parent[v]           # Increasing the number of matched edges by alternating the edges in 
                ppv = aux.mate[pv]           # augmenting path.
                aux.mate[v] = pv
                aux.mate[pv] = v
                v = ppv
            end
        end
    end
       
    matched_edges = Vector{Vector{T}}()      # Result vector containing end points of matched edges.

    @inbounds for u in one(nvg):nvg
        if u<aux.mate[u]
            temp = Vector{T}()
            push!(temp,u)
            push!(temp,aux.mate[u])
            push!(matched_edges,temp)       
        end
    end

    result = (mate = aux.mate, matched_edges = matched_edges)
    
    return result
end
max_cardinality_matching (generic function with 2 methods)

So, we start by initializing all the vectors of the structure AuxStruct .

Next I've implemented a very simple greedy algorithm to start with an initial matching to feed to the algorithm. Edmonds algorithm finds augmenting paths in a graph and flips the edges in the augmenting path to increase the number of matched edges. Now, the use of such greedy algorithm is that it is initially finding some matching in the graph and then feeding to the main algorithm. This matching will not be the best possible matching with maximum cardinality in general case but it will reduce the number of augmenting paths that the original Edmonds algorithm finds and hence it reduces the number of flipping operations that the algorithm has to do. Infact for a CompleteGraph(even number of vertices), the initialization is the correct final result and hence the main algorithm has to do no work at all. Suppose, we don't do this step of initialization, then there are no edges in matching initially and each edge will be added to the matching by the main algorithm which is very costly in terms of time complexity with a worst case complexity of .

Better initialization algorithms making use of heuristics can be used which will be even more efficient and will start with a near optimal matching. I'll implement them post JSoC once I am finished implementing Blossom V.

Then, we iteratively go through all the vertices of the matching and find an unmarked vertex on which any of the incident edges is not part of the matching. We call the find_path!() function which returns an augmenting path(if there is one) staring at that vertex and then flips the edges of the augmenting paths to increase the cardinality of the matching.

find_path!() function is defined as follows:

1.0s
function find_path!(aux, root, g)
    nvg = nv(g)   

    fill!(aux.used, 0)   
    
    fill!(aux.parent, zero(nvg))  
    
    @inbounds for i in one(nvg):nvg
        aux.base[i] = i 
    end
    
    aux.used[root] = one(nvg)
            
    qh = one(nvg) 
    qt = one(nvg)  
            
    aux.queue[qt] = root
    qt = qt + one(nvg)
    
    while qh<qt
        v = aux.queue[qh]
        qh = qh + one(nvg) 
        for v_neighbor in neighbors(g, v)
            if (aux.base[v] == aux.base[v_neighbor]) || (aux.mate[v] == v_neighbor)
                continue
            end
            
            # The edge closes the cycle of odd length, i.e. a flower is found. 
            # An odd-length cycle is detected when the following condition is met.
            # In this case, we need to compress the flower.
            if (v_neighbor == root) || (aux.mate[v_neighbor] != zero(nvg)) && (aux.parent[aux.mate[v_neighbor]] != zero(nvg))
                #######Code for compressing the flower######
                curbase = lowest_common_ancestor!(aux, v, v_neighbor, nvg)
                for i in one(nvg):nvg
                    aux.blossom[i] = 0
                end
                
                mark_path!(aux, v, curbase, v_neighbor, nvg)
                mark_path!(aux, v_neighbor, curbase, v, nvg)
                        
                for i in one(nvg):nvg
                    if aux.blossom[aux.base[i]] == one(nvg) 
                        aux.base[i] = curbase
                        if  aux.used[i] == 0
                            aux.used[i] = one(nvg)
                            aux.queue[qt] = i
                            qt = qt + one(nvg) 
                        end
                    end
                end
                ############################################
            
            # Otherwise, it is an "usual" edge, we act as in a normal breadth search. 
            # The only subtlety - when checking that we have not visited this vertex                yet, we must not look 
            # into the array aux.used, but instead into the array aux.parent as it is                filled for the odd visited vertices.
            # If we did not go to the top yet, and it turned out to be unsaturated,                  then we found an 
            # augmenting path ending at the top v_neighbor, return it.
                
            elseif aux.parent[v_neighbor] == zero(nvg)
                aux.parent[v_neighbor] = v
                if aux.mate[v_neighbor] == zero(nvg) 
                    return v_neighbor
                end
                v_neighbor = aux.mate[v_neighbor]
                aux.used[v_neighbor] = one(nvg)   
                aux.queue[qt] = v_neighbor
                qt = qt + one(nvg)
            end
        end
    end
                                    
    return zero(nvg)
end 
find_path! (generic function with 1 method)

This function basically does the task of building a breadth first search tree with the help of which we find augmenting paths and a blossom.

Once we find a blossom, we need to shrink the blossom into a single pseudo vertex and then we continue building the breadth first tree. But if we find an augmenting path, we just return it instantly. Otherwise, even after making the breadth first tree, if we don't find any augmenting path then we just return. However, please note that if we found any blossom, then we are still compressing the blossom.

The code for compressing the flower is a bit involved and hence I would suggest that you draw some cases on paper to see how it is happening. The code is commented for better understanding.

Compression of flower uses two other functions. One is a function named mark_path! which is as follows:

function mark_path!(aux, v, b, child, nvg) 
    while aux.base[v] != b
        aux.blossom[aux.base[v]] = one(nvg)
        aux.blossom[aux.base[aux.mate[v]]] = one(nvg)
        aux.parent[v] = child
        child = aux.mate[v]
        v = aux.parent[aux.mate[v]]
    end
    return nothing 
end
mark_path! (generic function with 1 method)

This function on the way from the top to the base of the flower, marks true for the vertices in the array aux.blossom[] and stores the ancestors for the even nodes in the tree. The parameter child is the son for the vertex v itself (with the help of this parameter we close the loop in the ancestors).

The other function is lowest_common_ancestor!() which is function to calculate the lowest common ancestor of 2 nodes in a tree. The function is as defined :

function lowest_common_ancestor!(aux, a, b, nvg)
    fill!(aux.used,false)
    
    ##Rise from vertex a to root, marking all even vertices
    while true
        a = aux.base[a]
        aux.used[a] = one(nvg)
        if aux.mate[a] == zero(nvg) 
            break
        end
        a = aux.parent[aux.mate[a]]
    end
      
    ##Rise from the vertex b until we find the labeled vertex
    while true
        b = aux.base[b]
        if aux.used[b] == one(nvg)
            return b
        end
        b = aux.parent[aux.mate[b]]
    end
end
lowest_common_ancestor! (generic function with 1 method)

Finally, time to see some results :

using LightGraphs
g = SimpleGraph(11)
add_edge!(g,1,2)
add_edge!(g,2,3)
add_edge!(g,3,4)
add_edge!(g,4,1)   
add_edge!(g,3,5)
add_edge!(g,5,6)
add_edge!(g,6,7)
add_edge!(g,7,5)
add_edge!(g,5,8)
add_edge!(g,8,9)
add_edge!(g,10,11)
true
max_cardinality_matching(g)
(mate = [2, 1, 4, 3, 6, 5, 0, 9, 8, 11, 10], matched_edges = Array{Int64,1}[[1, 2], [3, 4], [5, 6], [8, 9], [10, 11]])
g = SimpleGraph(12)
add_edge!(g,1,2)
add_edge!(g,1,3)
add_edge!(g,3,2)
add_edge!(g,5,2)
add_edge!(g,6,2)
add_edge!(g,4,2)
add_edge!(g,9,7)
add_edge!(g,8,7)
add_edge!(g,8,11)
add_edge!(g,9,10)
add_edge!(g,9,8)
add_edge!(g,10,11)
add_edge!(g,10,12)
true
max_cardinality_matching(g)
(mate = [3, 4, 1, 2, 0, 0, 9, 11, 7, 12, 8, 10], matched_edges = Array{Int64,1}[[1, 3], [2, 4], [7, 9], [8, 11], [10, 12]])

Coding Period [2nd, 3rd, 4th week]

I have spent these weeks into experimenting with various parts of the Blossom V algorithm mainly fractional initialization and variable delta approach. After struggling a lot, I have been able to at least figure out pseudo code of various parts of the algorithm. I will write my next blog once I reach a logical end. I am currently working on finally unifying various parts and writing my final implementation.

This work is under development and I think my next blog will be on various initialization techniques for Blossom V namely greedy initialization and fractional initialization. Stay tuned for this.

References

  1. W. Cook and A. Rohe. Computing minimum-weight perfect matchings.INFORMS Journal on Computing, 11(2):138–148,February 1999. Computer code available at http://www2.isye.gatech.edu/ ̃ wcook/blossom4/.
  2. K. Mehlhorn and G. Sch ̈afer. Implementation of O(nmlogn) weighted matchings in general graphs: the power of data struc-tures.Journal of Experimental Algorithmics (JEA), 7:4, 2002