Andreas / Jul 24 2019
Remix of Julia by Nextjournal

Julia Packages or: How I Learned to Stop Worrying and Love the Black Box

In the broadest of brush-strokes, I have met two kinds of coding scientists in physics:

  1. The low-level, roll-your-own tinkerer that implements algorithms either from primitive linear algebra or from existing snippets of code lying around via copy-paste.
  2. The more results-oriented user of libraries and high-level solutions, satisfied with knowing that there's a function - that works most of the time - to achieve their goal of spitting out some numbers.

Especially in academia, both approaches have their merit. Straw-scientist number two might be more efficient in producing results - as long as they are comfortable in the restricted space of possibilities the libraries provide. Straw-scientist number one will come out with a deeper understanding, which can be both intellectually satisfying (an important point in academia) and enable the scientist to go off the beaten path, free from the confines of a library author's mind.

Doing things oneself takes longer, though, and the scientist will have to tackle all the problems the original implementers already tackled - from bugs to performance pitfalls to undocumented but implied conditions the algorithms need in order to work.

My personal coding journey started during my bachelor's in physics, where I learned the most basic C and applied it to, for example, implementing some simple solvers for differential equations. Nothing that anyone should ever use, but it was fun to see a many-body problem solved with code I had written. Coding up simple problems afterwards also helped me grasp some areas of physics better than I could have with pen and paper alone.

But quite soon after, I was introduced to Mathematica, perhaps the embodiment of what straw-scientist 2 - lover of black boxes - craves. Mathematica just does things. You tell it what to do, and it automagically happens. No fussing over which algorithms to use, where data comes from or how exactly an eigenvalue is found. It was comfortable and I stuck with Mathematica for a long time, using it as my go-to even for numerical work, sacrificing a bit of speed for peace of mind.

Mathematica has its dark sides though. There are performance issues, and it is difficult to reason about why some code is fast while other code is slow. Version control becomes cumbersome with the custom format Mathematica uses, and it is harder to share code if only because not everyone has a license. Then there's the black box, which I enjoyed for a long time but became uncomfortable with. I really wanted to look at the code, see how it is implemented and figure out why. Not only as a learning experience but also because sometimes you run into issues which require looking at code (or filing a bug report).

At the beginning of my PhD, I set my eyes on Julia: Julia had the simplicity and speed to let me do everything myself, and a young enough package ecosystem that I might need to.

I swung back into old habits, building systems where I've touched all the code involved, or at least most of it, leaving only the more basic linear algebra to libraries. This works well enough if I set my own schedule, but with my summer of code project, deadlines came into my life. It was simply not feasible to implement most things myself anymore.

So I picked suitable packages (in my case Optim.jl and Zygote.jl), read the documentation, and am now using them productively in my project.

And if something does not work or I wonder how it's done?

I can simply read the source! My black box can actually be opened and peered into. While the code is complex - especially in the macro-heavy Zygote - I can pinpoint most problems I encounter in the source and, with little effort, actually understand what's happening. Because at the end of the day, it is still mostly just Julia. No new syntax, no new idioms. From the high to the low level. On top of that, the packages don't restrict me. I can use my custom constructs, mix the packages' functionalities (e.g. provide a gradient from Zygote to Optim), and extend them myself, either locally or by submitting a PR directly to the source.
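To give a flavour of what mixing the two packages can look like, here is a minimal sketch of handing a Zygote gradient to Optim. The objective (the Rosenbrock function), the starting point and the names `rosenbrock` and `g!` are stand-ins chosen for illustration, not code from my project; the sketch assumes Optim.jl and Zygote.jl are installed.

```julia
using Optim, Zygote

# Hypothetical objective for illustration: the 2D Rosenbrock function,
# whose minimum lies at [1, 1].
rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

# Optim expects an in-place gradient g!(G, x).
# Zygote.gradient returns a tuple with one entry per argument,
# so we take the first entry and copy it into G.
function g!(G, x)
    G .= Zygote.gradient(rosenbrock, x)[1]
    return G
end

x0 = [-1.2, 1.0]                                 # arbitrary starting point
result = optimize(rosenbrock, g!, x0, LBFGS())   # gradient-based solver from Optim
xmin = Optim.minimizer(result)
```

Neither package needs to know about the other: Optim only asks for a function that fills in a gradient, and it does not care that Zygote computed it.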

This is amazing for a workflow where you get far with existing tools and then focus your effort on the new aspects of a problem - the thing that has not been done before.

So let the black boxes take you to the edge of what's been done before, and then open them up, break things and push further.