Pitch: ClojureScript support for lambdaisland/deep-diff

This document follows the structure outlined in Shape Up, chapter 6: write the Pitch.

Problem

lambdaisland/deep-diff is a Clojure library that is currently not compatible with ClojureScript. This is a problem for deep-diff users who want to use it from ClojureScript. It is also a problem for us, because we currently can't offer deep diffs in kaocha-cljs.

Appetite

We are not using the actual Shape Up cycles, so we use this section differently. The appetite is about drawing the line, about enabling developers to decide when to cut or hammer scope. It is about finding the balance between time invested and returned value.

We have a strong appetite to get an initial ClojureScript version out there, even if imperfect. Experience with the Clojure version has shown that once we put a working version in people's hands, there is a much higher chance that people will contribute incremental improvements back. We expect to ship something in a number of days, or at most a week or two.

We'll expense this feature for €1024 from our OpenCollective once it is shipped.

Current Situation

deep-diff itself is a small code base. We have about 200 lines of diffing code, 160 lines of printing code, and a few dozen lines for a top-level API. However, quite a bit of that code is platform-specific (i.e. tied to Java and the JVM).

In the diffing code we have cases to deal with Java collections, like java.util.List, java.util.Set, and java.util.Map. We can't just drop these, because we rely on Clojure collections implementing these interfaces. In other words, we deal with Clojure sequences and Java lists uniformly through the java.util.List interface, and the same goes for sets and maps.
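
The reliance on the java.util interfaces can be checked at a JVM Clojure REPL. The snippet below only illustrates that point and is not code from deep-diff; the helper name java-coll-kind is hypothetical.

```clojure
;; JVM Clojure only. `java-coll-kind` is a hypothetical helper used for
;; illustration; it is not part of deep-diff.
(defn java-coll-kind
  "Classify x by the java.util interface it implements, or return nil."
  [x]
  (cond
    (instance? java.util.Map x)  :map
    (instance? java.util.Set x)  :set
    (instance? java.util.List x) :list
    :else                        nil))

;; Clojure's own collections and plain Java collections take the same branch.
(java-coll-kind [1 2 3])                       ;=> :list (vector)
(java-coll-kind (map inc [1 2 3]))             ;=> :list (lazy seq)
(java-coll-kind (java.util.ArrayList. [1 2]))  ;=> :list (Java list)
(java-coll-kind #{1 2})                        ;=> :set
(java-coll-kind (java.util.HashMap. {:a 1}))   ;=> :map
(java-coll-kind 42)                            ;=> nil
```

None of these interfaces exist in ClojureScript, which is why this dispatch cannot simply be carried over to cljc.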

In the printing code we also reference Java-specific types, but here it's not the collection types but the value types, like dates and UUIDs, which need a specific printed representation.

When converting to cljc we will need to implement handlers for the equivalent ClojureScript/JavaScript types.
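
As a rough sketch of what such handlers could dispatch on, the value types can be selected per platform with reader conditionals. This assumes the code lives in a .cljc file. The namespace and the names uuid-type, inst-type, and printed-tag are hypothetical and are not taken from deep-diff or Puget.

```clojure
;; Save as a .cljc file; reader conditionals are not allowed in plain .clj files.
;; All names here are hypothetical and for illustration only.
(ns example.print-tags)

;; The same logical value type maps to a different host type per platform.
(def uuid-type #?(:clj java.util.UUID :cljs cljs.core/UUID))
(def inst-type #?(:clj java.util.Date :cljs js/Date))

(defn printed-tag
  "Return the EDN tag used to print x, or nil if x needs no special handling."
  [x]
  (condp instance? x
    uuid-type "#uuid"
    inst-type "#inst"
    nil))

(printed-tag #uuid "00000000-0000-0000-0000-000000000000") ;=> "#uuid"
(printed-tag #inst "1970-01-01T00:00:00.000-00:00")        ;=> "#inst"
(printed-tag 42)                                           ;=> nil
```

A real port would attach a formatting function to each branch. The point here is only that the JVM classes and their ClojureScript counterparts can sit side by side in one file.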

However, before we get there, we need to make sure our dependencies are available as platform-independent cljc code.

What about other dependencies?

We have four external dependencies: fipp, clj-arrangement, puget, and clj-diff.

Fipp and clj-arrangement are already available as cljc, so we expect to be able to use those without problems. ClojureScript support was added to Fipp in 2015, and several commits and releases since then have further improved it. clj-arrangement is a small library which received ClojureScript support in 2016.

This leaves Puget and clj-diff.

Puget, which is used for colorized pretty-printing, has a PR from 2015/2016 which was never finished and eventually rejected. Looking at the discussion, it seems Puget has issues similar to those of deep-diff itself: it registers handlers for specific classes or interfaces, which bumps into the fact that classes and inheritance work quite differently in JavaScript. So porting Puget appears to be our biggest initial blocker. This work should also give us some inspiration for porting the deep-diff printing code.

clj-diff does have a port to cljc. It was not merged upstream (see the still open PR), but it has been released as a separate Clojars artifact. This might be good enough for us.

Solution

We will start by porting Puget, working from the current master branch. We'll rename the files to cljc, set up a ClojureScript REPL, and try to get to the point where the namespaces compile and load. The existing PR might provide some inspiration. We might take some shortcuts, like commenting out problematic parts that need more elaborate porting work.

Once the code compiles and loads we'll have to make sure it works. We'll create some example EDN values and try to print them; the test suite can be an inspiration here. Eventually we can look into porting the test suite using kaocha-cljs, which can help us find edge cases.

Once Puget is ported we can apply the same strategy to deep-diff: upgrade or switch the dependencies so all upstream code is cljc, then convert our own files to cljc, make sure they compile and load, and then incrementally flesh out the ClojureScript behavior.

Finally we'll want to make sure the tests run for both Clojure and ClojureScript. We have a fairly elaborate test suite, and at this point will be content if the existing cases work on both platforms. We don't rule out regressions in areas not covered by the tests, but we'll take that risk, hoping that at that point users can help identify and possibly address problematic cases.

Rabbit Holes

We might not need the whole API surface of Puget. If it makes a big difference in time investment, we will skip the parts that are not relevant to our use case.

At this point we only care about types that are present in the EDN spec (i.e. plain Clojure values). We can already provide support for native collections if that turns out to be an easy addition, but we don't want to invest too much time in supporting JavaScript objects, arrays, or other JavaScript types.

No-gos

If it makes sense, we will submit changes upstream after this project. However, to move fast we will take ownership of these projects first, running forks or bundling dependencies if necessary. Lobbying for upstream acceptance of changes is outside the scope of this proposal.

Resources