Julia Summer of Code '19: Automatic Computation of Sparse Jacobians, Blog #1

Introduction

The Julia Summer of Code has officially kicked off from 27th May 2019 and so has this blog which I intend to use as a platform to share my experiences, both positive and negative, technical and non-technical that I gain throughout the course of this project.

Also, since this blog is primarily targeted for the people who do not or cannot follow with our conversations on #sparsediff at slack, I'll be repeating some of the more basic things just to be thorough and more complete. Before we jump into the technical discussions, it'd be a good idea to introduce myself.

I am Pankaj Mishra, an undergraduate student at IIT Kharagpur, India in the Department of Electronics and Electrical Communication. I will start my junior year from July 2019. This is both my first Julia and open source project and thus I am incredibly excited to be a part of it. My project, as the title says, is about the efficient and quick computation of sparse Jacobians, which feature quite commonly in real life differential equation problems.

The main keyword here is 'sparse'. A sparse matrix or sparse array is one in which most of the elements are zero. If we know the sparsity structure of a differential matrix already, it can be exploited to make the computational process a lot more efficient. This is done by smartly compacting the sparse matrix into a more dense and compact matrix and then by processing all the non-zero elements in a fewer number of function calls than what was earlier required.

The above image shows how the compression process works to convert a sparse matrix into a relatively dense one, and the process is implemented through realization of this problem as a famous graph theory problem: Graph Coloring.

Our main objective here is to be able to group together all columns which do not share any nonzero elements at a common row, more technically known as orthogonal columns, into a single set, while minimizing the total number of sets formed at the end of grouping process. Once a set of columns has been established, we can replace all the columns in that set with a single column containing the matrix sum of all the columns in that set. The columns which belong to the same set in the figure above are represented by the same color, which comes from the graph coloring.

What is Graph Coloring, and why do we care?

In order to model our problem, we first need to translate it from a matrix to a graph. And while doing so, we can either model it as a simple graph, or as a bipartite graph. Both the models will give us the results we are looking for using different algorithms and by using different amount of resources depending on the type of input we pass. However, for our project right now, we shall be focusing only on the simple graph model. The graph representation of our matrix is rather simple, it contains `n` vertices, each representing the n columns in our differential matrix. 2 vertices and have an edge between them if the and columns are non-orthogonal, i.e there exists some row such that and elements of our matrix are both non zero.

Thus, running a simple nested loop on our columns and checking for this condition on every iteration gives us our graph representation from the sparse matrix. Once the graph has been obtained, we need to color it following distance-1-coloring rules. What that essentially means is all vertex pairs with a distance of 1 (which is equivalent to all vertex pairs connected by an edge directly) will have different colors, and since 2 vertices or columns are connected if and only if they are non-orthogonal, we can be sure that all the vertices or columns painted with the same color are orthogonal. Once we compress the matrix using these groups of orthogonal columns, we employ forward difference method or Automatic Differentiation to calculate the directional derivatives along the compressed matrix direction. These steps, excluding the last, can be summarized using the following picture:

I plan on going over each of the individual steps involved in the whole project a lot more exhaustively later in the blog series as I start working on them.

After discussing with my mentors, we have decided to break down the entire project into few finite portions:

i. Convert sparse matrix to graph: The project involves handling and returning matrices, however, in the intermediary steps we process our matrices as graphs, and thus being able to convert a matrix to a graph is our first step.

ii. Coloring the graph: As explained in the blog, coloring the graph is necessary to efficiently group columns into sets of orthogonal vectors, and return the coloring vectors

iii. Use finite difference method along the direction returned by the coloring vectors to compute the value of the compressed Jacobian matrix.

iv. Employ dual numbers to allow calculation of the matrix elements using Automatic Differentiation.

v. Decompress the partials into the original Jacobian matrix.

What's next?

I intend to have the entire pipeline set up and working as expected by phase one evaluations (3rd week of June), after which I plan on focusing more and building up on the coloring module, potentially employing bipartite graphs and bicoloring heuristics to make the module more robust. I wish to dedicate the first two weeks of the coding period to the coloring algorithms and for handling matrix to graph conversions.

Additionally, I also want to look into LightGraphs as an alternative to the graph library that I have been using till now, which is a simple and lightweight graph library that I built from scratch. But until then, it's important to ensure that we stay on track and get the basic structure of the project up and running as soon as possible!

References

[1]https://www.researchgate.net/publication/51992342_What_Color_Is_Your_Jacobian_Graph_Coloring_for_Computing_Derivative