Fusion Property of Fold

An Explanation for Composability of Clojure Transducers

In A Tutorial on the Universality and Expressiveness of Fold, Graham Hutton presents in a clear and understandable way the advantages of programming by folds. Most of the core concepts exposed there, among which the universal and fusion property of folds are to be found in Bird's book. In this short article we'll apply some of those ideas to Clojure transducers, and we'll show how the fusion property implies the efficiency and composability of transducers. It's beyond of this writing to compare left and right folds for Clojure and their lazyness, since Clojure reduce is (notationally) a left fold and we'll assume all lists are left lists (say Clojure vectors). Note also that what follows is not about Clojure's fold as found in clojure.core.reducers.

The Universal Property of Fold

Following Hutton, we can define a fold function on lists by means of the following properties. For sets α\alphaand β\beta, defineAβ,αA_{\beta,\alpha}the set of all functions f:β×αβf\colon\beta\times\alpha\rightarrow\betawhich we'll call right actions ofα\alphaonβ\betaand when the functionffis clear from the context, we'll just writebab\cdot aforf(b,a)f(b, a). Thefold\mathsf{fold}function can be defined as

fold:Aβ,α×β×[α]\xrightarrowβ\mathsf{fold}\colon A_{\beta,\,\alpha}\times\beta\times[\alpha]\xrightarrow{}\beta

such that, for a given bβb\in\betathe following properties hold:

fold(f,b,[])=b(u1)fold(f,b,x¯:x)=fold(f,b,x¯)x(u2)\begin{aligned} &\mathsf{fold}(f,b,[\,]) = b \quad & (\mathsf{u1})\\ &\mathsf{fold}(f, b, \bar{x}\colon\!x) = \mathsf{fold}(f, b, \bar{x})\cdot x\quad & (\mathsf{u2}) \end{aligned}

wherex¯:x\bar{x}\colon\!xis the list obtained as conjunction of a listx¯\bar{x}with an elementxxofα\alpha(in Clojure(conjx¯x)(\mathsf{conj}\,\bar{x}\,x)).

Now for fixedbb, (u1) and (u2) form indeed a universal property, i.e. if such a function exists then it's unique and fold\mathsf{fold} is fully caracterized by these properties. Unicity is proven by means of induction: assume there's functions f\mathsf{f} and f\mathsf{f}^{'}which satisfy (u1) and (u2) but which disagree i.e. f(f,b,x^)f(f,b,x^)\mathsf{f}(f, b, \hat{x})\neq\mathsf{f}'(f, b, \hat{x})on some list x^=x¯:x\hat{x}=\bar{x}\colon\!x. Now by (u2) they also need to disagree onx¯\bar{x}, hencex^\hat{x}must have length zero contradicting (u1).

We can also restate (u2) by saying that for allf,bf,\,bthen as a function of [α][\alpha]inβ\betathe partial applicationfoldfb\mathsf{fold}_{f\,b}commutes with:x\colon\!xandx\cdot x, that is:

[α]\xrightarrowfoldfbβ:xx[α]\xrightarrowfoldfbβ\begin{matrix} \left[\alpha\right] & \xrightarrow{\mathsf{fold}_{f\,b}} & \beta \\ \mid & & \mid \\ \colon\!x & &\cdot x\\ \downarrow & & \downarrow \\ \left[\alpha\right] & \xrightarrow{\mathsf{fold}_{f\,b}} & \beta \end{matrix}

Existence of a fold function is proved by implementation in the context of the majority of programming languages. In Clojure, fold is the reduce function, but the universal property (u1) and (u2) can actually provide a constructive definition:

(defn fold [f i l]
  (if-some [last (peek (vec l))]
		(f (fold f i (butlast l)) last)
    i))

Since\ttfold\tt{fold}above satisfies (u1) and (u2) by definition, and we can easily prove it for\ttreduce\tt{reduce}, then they have to be the same function, but we can build some cheap function-equality check on integer vectors

(require '[clojure.test.check :as c])
(require '[clojure.test.check.generators :as gen])
(require '[clojure.test.check.properties :as p])

(defn =' [& fns]
 (let [pr (p/for-all [v (gen/vector gen/int)]
           (apply = (map #(% v) fns)))]
   (c/quick-check 100 pr)))

to get a hint our definition is sound, trying out some concrete example

(=' (partial fold + 0)
    (partial reduce + 0))
(=' (partial fold str "")
    (partial reduce str ""))

Expressing List Operations in Terms of Fold: Transducers

It is possible to express a lot of functions on lists in terms of fold, amongst the most popular are filter and map:

(defn filter' [pred]
  (fn [xs x] (if (pred x) (conj xs x) xs)))

(fold (filter' odd?) [] (range 10))
(defn map' [phi]
  (fn [xs x] (conj xs (phi x))))

(fold (map' inc) [] (range 9))

where(\ttfilterπ)(\tt{filter}'\,\pi)and(\ttmapϕ)(\tt{map}'\,\phi)are actions of natural numbers on lists of natural numbers. Clojure transducers bring this pattern one step further: they incapsulate list-like operations independently of the reducing function. Formally, transducers are transformations of actions i.e. functions

Aβ,α\xrightarrowAβ,αA_{\beta,\alpha}\xrightarrow{} A_{\beta,\alpha}

which behaves functorially with respect of folds, this will be explained later.

Functions like\ttfilter\tt{filter}and\ttmap\tt{map}in Clojure, when given a single argument, return a transducer. For instance, loot at

(let [s (with-out-str (clojure.repl/source filter))]
   (println (clojure.string/join (take 420 s))))

and let's see it applied to the\ttconj\tt{conj}function at first

(fold ((filter odd?) conj) [] (range 6))
(fold ((map inc) conj) [] (range 9))

and later to the\tt+\tt{+} function

(fold ((filter odd?) +) 0 (range 6))

The strong point for using transducers in practice is that they offer stack reducing operations in a composable way in which the input list will be visited just once. Take for instance:

(def coll [{:a 1} {:a 2} {:a 3} {:a 4}])

(->> coll
     (map :a)
     (filter odd?)
     (map inc)
     (reduce + 0))

At each step above a whole list is returned and fed the next computation which iterates through it again and again. With transducers this won't happen, the following snippet of code reads the input collection just once, encoding the transformations in a single action:

(def xf (comp (map :a)
              (filter odd?)
              (map inc)))

(reduce (xf +) 0 coll)             

which in clojure is (almost) the same of the simpler form

(transduce xf + 0 coll)

Later you'll also see the reason for this contravariant behaviour in the order of the function composition which is not the natural right-to-left order.

Fusion Property and the Composition of Folds

Having shown that many functions on lists can be expressed in terms of fold, when can we actually assert that a composition of folds is expressible in a fold of a single action? One step in this direction is given by the fusion property.

Given right α\alpha-actionsf:βαβf\colon\beta\rightarrow\alpha\rightarrow\beta and g:βαβg\colon\beta'\rightarrow\alpha\rightarrow\beta'we we call a function ϕ:ββ\phi:\beta\rightarrow\beta'a morphism from f to g if ϕ(f(b,x))=g(ϕ(b),x)\phi(f(b, x)) = g(\phi(b), x) holds for every bβ,xαb\in\beta, x\in\alpha.

We can prove thatfold\mathsf{fold}is stable under the application of morphisms, i.e. given a morphismϕ\phi of actions like the one above, then we have:

ϕfoldfb=foldgϕ(b)(fusion)\phi\circ\mathsf{fold}_{\,f\,b} = \mathsf{fold}_{\,g\,\phi(b)}\quad \quad\quad(\mathsf{fusion})

To prove the above equality we appeal to the universal property: if we can prove (u1) and (u2) of ϕfoldfb\phi\circ\mathsf{fold}_{\,f\,b}, then the equality above must hold for every list in[α][\alpha]. While it's trivial to see (u1), (u2) follows by combining commutative diagrams:

Invalid Formula: \begin{matrix} [\alpha] & \xrightarrow{\mathsf{fold}_{f\,b}} & \beta & \xrightarrow{\phi}&\beta'\\ \mid & & \mid & &\mid \\ \colon\!x & &\cdot x && \cdot x \\ \downarrow & & \downarrow & &\downarrow \\ [\alpha] & \xrightarrow{\mathsf{fold}_{f\,b}} & \beta & \xrightarrow{\phi}&\beta' \end{matrix} Error: ParseError: KaTeX parse error: Invalid size: '\alpha' at position 211: …ownarrow \\ [\̲a̲l̲p̲h̲a̲] & \xrightarro…

Now, if we want to compose folds as functions of lists we have to restrict to some specific class of actions. We say that an action on lists f:[α]×α[α]f\colon[\alpha]\times\alpha\rightarrow[\alpha] splits if there exists some functionf:ααf'\colon\alpha\rightarrow\alpha, such thatx¯a=x¯:f(a)\bar{x}\cdot a = \bar{x}\colon f'(a)for all aαa\in\alpha. Note that the actions defined in the examples above all split (with some formal imagination in the filter case). Given a splitting α\alpha-actionffand an action ggof α\alphaonβ\betawe define a function of actions tf:Aβ,αAβ,α\mathsf{t}_f\colon A_{\beta,\alpha} \rightarrow A_{\beta, \alpha}defined by

tf(g)(b,a)=g(b,f(x))\mathsf{t}_f(g)(b, a) = g(b, f'(x))

We can now state a transducing property for folds of splitting actions in terms of: for every splittingffand g we have:

foldgbfoldfl=foldtf(g)l(transducingproperty)\mathsf{fold}_{g\,b}\circ\mathsf{fold}_{f\,l} = \mathsf{fold}_{\,\mathsf{t}_f(g)\,l'}\quad\quad\mathsf{(transducing\, property)}

wherell'isfold(g,b,l)\mathsf{fold}(g, b, l).

To prove the transducer lemma it's enough to show that foldgb\mathsf{fold}_{g\,b} is a morphism between the actionsffandtf(g)\mathsf{t}_f(g):

Invalid Formula: \begin{alignedat}{2} \mathsf{fold}(g,b,\bar{x}\cdot a) &=&&\\ &\small{\mathit{splitting}}&&\\ &\quad\quad=\mathsf{fold}(g,b,\bar{x}\colon f'(a))=&&\\ &&\small{\mathit{fold}}&\\ &&&=g(\mathsf{fold}(g,b,\bar{x}), f'(a)) =\\ &&&=\mathsf{t}_f(g)(\mathsf{fold}(g,b,\bar{x}), a) \end{alignedat} Error: ParseError: KaTeX parse error: No such environment: alignedat at position 7: \begin{̲a̲l̲i̲g̲n̲e̲d̲a̲t̲}̲{2} \mathsf{fol…

Let's apply the lemma above to a simple Clojure case whereggis++andffis\tt((mapinc)conj)\tt{((map\,inc)\,conj)}, thenffsplits (by definition) andtf(g)\mathsf{t}_f(g)is\tt((mapinc)+)\tt{((map\,inc)\, +)}:

(=' (comp (partial reduce + 0)
          (partial map inc))
  
    (comp (partial reduce + 0)
          (partial reduce ((map inc) conj) []))

    (partial reduce ((map inc) +) 0))          

Now since function composition is associative, repeating the step above we also get for instance

(=' (comp (partial reduce + 0)
          (partial map inc)
          (partial filter odd?))

    (comp (partial reduce ((map inc) +) 0)
          (partial reduce ((filter odd?) conj) []))

    (partial reduce ((filter odd?) ((map inc) +)) 0)
    
    (partial reduce ((comp (filter odd?) (map inc)) +) 0))

which explains the countravariant behaviour in the composition of transducers, with respect to the composition of the non-transduced form.

Stateful transducers and cat

There's some transducers which escape the pure form of splitting actions as defined above, most notably

(clojure.repl/source cat)

to flatten list outputs on the fly:

(let [coll [{:a [1]} {:a [2]} {:a [3]}]
      xf (comp (map :a) cat)]

 (reduce (xf +) 0 coll))

and stateful transducers, like say the 0-ary form of\ttdistinct\tt{distinct}

(clojure.repl/source distinct)

or the 1-ary form of take-while

(clojure.repl/source take-while)

which uses the reduced trick to short-circuit the fold, allowing for very nice stuff like

(def terms [{:do true :val 1} 
            {:do true :val 2}
            {:do true :val 2}
            {:do true :val 1}
            {:do true :val 3}
            {:do false :val 4}
            {:do true :val 5}
            {:do true :val 6}])

(let [xf (comp (take-while :do)
               (map :val)
               (distinct))]

	(reduce (xf +) 0 terms))

If you'd like to discuss this, find me at @lo_zampino on Twitter. Or remix this article to explore transducers yourself!

© 2018 Nextjournal GmbH