Fold

In A Tutorial on the Universality and Expressiveness of Fold, Graham Hutton presents in a clear and understandable way the advantages of programming by folds. Most of the core concepts exposed there, among which the universal and fusion properties of folds, are to be found in Bird's book. In this short article we'll apply some of those ideas to Clojure transducers, and we'll show how the fusion property implies the efficiency and composability of transducers. It's beyond the scope of this writing to compare left and right folds for Clojure and their laziness: since Clojure's reduce is (notationally) a left fold, we'll assume all lists are left lists (say Clojure vectors). Note also that what follows is not about Clojure's fold as found in clojure.core.reducers.

The Universal Property of Fold

Following Hutton, we can define a fold function on lists by means of the following properties. For sets $\alpha$ and $\beta$, define $A_{\beta,\alpha}$ as the set of all functions $f\colon\beta\times\alpha\to\beta$, which we'll call right actions of $\alpha$ on $\beta$; when the function $f$ is clear from the context, we'll just write $b\cdot a$ for $f(b,a)$. The $\mathsf{fold}$ function can be defined as

$$\mathsf{fold}\colon A_{\beta,\,\alpha}\times\beta\times[\alpha]\xrightarrow{}\beta$$

such that, for a given $f$ in $A_{\beta,\alpha}$ and $b$ in $\beta$, the following properties hold:

$$\begin{aligned} &\mathsf{fold}(f,b,[\,]) = b \quad & (\mathsf{u1})\\ &\mathsf{fold}(f, b, \bar{x}\colon\!x) = \mathsf{fold}(f, b, \bar{x})\cdot x\quad & (\mathsf{u2}) \end{aligned}$$

where $\bar{x}\colon\!x$ is the list obtained by conjoining a list $\bar{x}$ in $[\alpha]$ with an element $x$ of $\alpha$ (in Clojure `(conj xs x)`).

Now for fixed $f$ and $b$, (u1) and (u2) indeed form a universal property, i.e. if such a function exists then it's unique and is fully characterized by these properties. Uniqueness is proven by induction: assume there are functions $h$ and $h'$ which satisfy (u1) and (u2) but which disagree, i.e. $h(l)\neq h'(l)$, on some list $l$ of minimal length. If $l = \bar{x}\colon\!x$ then by (u2) they also need to disagree on the shorter list $\bar{x}$, hence $l$ must have length zero, contradicting (u1).

We can also restate (u2) by saying that for all $x$ in $\alpha$, $\mathsf{fold}$ as a function of $[\alpha]$ in the partial application $\mathsf{fold}_{f\,b}$ commutes with $\colon\!x$ and $\cdot x$, that is:

$$\begin{matrix} \left[\alpha\right] & \xrightarrow{\mathsf{fold}_{f\,b}} & \beta \\ \mid & & \mid \\ \colon\!x & &\cdot x\\ \downarrow & & \downarrow \\ \left[\alpha\right] & \xrightarrow{\mathsf{fold}_{f\,b}} & \beta \end{matrix}$$
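As a quick sanity check of the diagram, here is a minimal sketch that uses Clojure's built-in `reduce` in the role of the fold. The helper name `commutes?` is made up for this illustration, and the lists are assumed to be vectors so that `conj` appends on the right.

```clojure
;; Hypothetical helper (not part of the original article): checks (u2) for one
;; list xs and one element x, i.e. folding (conj xs x) gives the same result
;; as folding xs first and then acting with x.
(defn commutes? [f b xs x]
  (= (reduce f b (conj xs x))
     (f (reduce f b xs) x)))

(commutes? + 0 [1 2 3] 4)        ;; => true
(commutes? str "" ["a" "b"] "c") ;; => true
(commutes? conj [] [1 2] 3)      ;; => true
```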

Existence of a fold function is proved by implementation in the context of the majority of programming languages. In Clojure, fold is the reduce function, but the universal property (u1) and (u2) can actually provide a constructive definition:

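(defn fold [f i l]
  ;; (u1): the fold of the empty list is the initial value
  (if (empty? l)
    i
    ;; (u2): fold the initial segment, then act with the last element
    (f (fold f i (butlast l)) (last l))))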
user/fold

Since `fold` above satisfies (u1) and (u2) by definition, and we can easily prove the same for `reduce`, they have to be the same function. Still, we can build a cheap function-equality check on integer vectors

(require '[clojure.test.check :as c])
(require '[clojure.test.check.generators :as gen])
(require '[clojure.test.check.properties :as p])

(defn =' [& fns]
 (let [pr (p/for-all [v (gen/vector gen/int)]
           (apply = (map #(% v) fns)))]
   (c/quick-check 100 pr)))
user/='

to get a hint that our definition is sound, trying out some concrete examples:

(=' (partial fold + 0)
    (partial reduce + 0))
Map {:result: true, :num-tests: 100, :seed: 1560424113394}
(=' (partial fold str "")
    (partial reduce str ""))
Map {:result: true, :num-tests: 100, :seed: 1560424113644}

Expressing List Operations in Terms of Fold: Transducers

It is possible to express a lot of functions on lists in terms of fold, amongst the most popular are filter and map:

(defn filter' [pred]
  (fn [xs x] (if (pred x) (conj xs x) xs)))

(fold (filter' odd?) [] (range 10))
Vector(5) [1, 3, 5, 7, 9]
(defn map' [phi]
  (fn [xs x] (conj xs (phi x))))

(fold (map' inc) [] (range 9))
Vector(9) [1, 2, 3, 4, 5, 6, 7, 8, 9]

where `(filter' odd?)` and `(map' inc)` are actions of natural numbers on lists of natural numbers. Clojure transducers take this pattern one step further: they encapsulate list-like operations independently of the reducing function. Formally, transducers are transformations of actions, i.e. functions

$$A_{\beta,\alpha}\xrightarrow{} A_{\beta,\alpha}$$

which behave functorially with respect to folds; this will be explained later.

Functions like `filter` and `map` in Clojure, when given a single argument, return a transducer. For instance, look at

(let [s (with-out-str (clojure.repl/source filter))]
   (println (clojure.string/join (take 420 s))))

and let's see it applied to the function `conj` at first

(fold ((filter odd?) conj) [] (range 6))
Vector(3) [1, 3, 5]
(fold ((map inc) conj) [] (range 9))
Vector(9) [1, 2, 3, 4, 5, 6, 7, 8, 9]

and later to the function `+`

(fold ((filter odd?) +) 0 (range 6))
9

The strong point of using transducers in practice is that they let us stack reducing operations in a composable way, such that the input list is visited just once. Take for instance:

(def coll [{:a 1} {:a 2} {:a 3} {:a 4}])

(->> coll
     (map :a)
     (filter odd?)
     (map inc)
     (reduce + 0))
6

At each step above an intermediate (lazy) sequence is built and fed to the next computation, so the elements are walked through again at every step. With transducers this won't happen: the following snippet of code reads the input collection just once, encoding the transformations in a single action:

(def xf (comp (map :a)
              (filter odd?)
              (map inc)))

(reduce (xf +) 0 coll)             
6

which in Clojure is (almost) the same as the simpler form

(transduce xf + 0 coll)
6
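One place where the "almost" shows up, as far as we can tell, is completion: `reduce` only ever calls the two-argument (step) arity of the transformed action, while `transduce` also calls its one-argument (completion) arity at the end. A minimal sketch with the buffering transducer `partition-all`, which is not used elsewhere in this article:

```clojure
;; reduce never calls the completion arity, so the last, incomplete
;; partition stays in the transducer's buffer and is lost
(reduce ((partition-all 2) conj) [] [1 2 3])
;; => [[1 2]]

;; transduce calls the completion arity once at the end, flushing the buffer
(transduce (partition-all 2) conj [] [1 2 3])
;; => [[1 2] [3]]
```

For transducers without internal state to flush, such as `map` and `filter`, the two forms give the same result.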

Later you'll also see the reason for the contravariant behaviour in the order of function composition: the transformations in xf are applied left to right, which is not the natural right-to-left order of comp.

Fusion Property and the Composition of Folds

Having shown that many functions on lists can be expressed in terms of fold, when can we actually assert that a composition of folds is expressible as a fold of a single action? One step in this direction is given by the fusion property.

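Given right $\alpha$-actions $f\colon\beta\times\alpha\to\beta$ and $g\colon\beta'\times\alpha\to\beta'$ we call a function $\phi\colon\beta\to\beta'$ a morphism from $f$ to $g$ if $\phi(f(b,a)) = g(\phi(b),a)$ holds for every $a$ in $\alpha$ and $b$ in $\beta$.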

We can prove that $\mathsf{fold}$ is stable under the application of morphisms, i.e. given a morphism of actions like the one above, then we have:

$$\phi\circ\mathsf{fold}_{\,f\,b} = \mathsf{fold}_{\,g\,\phi(b)}\quad \quad\quad(\mathsf{fusion})$$

To prove the above equality we appeal to the universal property: if we can prove (u1) and (u2) for $\phi\circ\mathsf{fold}_{f\,b}$, with respect to the action $g$ and the initial value $\phi(b)$, then the equality above must hold for every list in $[\alpha]$. While (u1) is trivial to see, (u2) follows by combining commutative diagrams:

$$\begin{matrix} [\alpha] & \xrightarrow{\mathsf{fold}_{f\,b}} & \beta & \xrightarrow{\phi}&\beta'\\ \mid & & \mid & &\mid \\ \colon\!x & &\cdot x && \cdot x \\ \downarrow & & \downarrow & &\downarrow \\ [\alpha] & \xrightarrow{\mathsf{fold}_{f\,b}} & \beta & \xrightarrow{\phi}&\beta' \end{matrix}$$
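To make fusion concrete, here is a small sketch in which the morphism is `count`, the first action is `conj` on vectors and the second action just increments a counter. The name `count-step` is hypothetical and only introduced for this check.

```clojure
;; Hypothetical action: ignores the element and increments the counter
(defn count-step [n _] (inc n))

;; count is a morphism from conj to count-step:
;; (count (conj xs x)) equals (count-step (count xs) x)
(let [xs [:a :b] x :c]
  (= (count (conj xs x))
     (count-step (count xs) x)))
;; => true

;; fusion: counting after folding conj from [] is the same as
;; folding count-step from (count [])
(let [l (range 10)]
  (= (count (reduce conj [] l))
     (reduce count-step (count []) l)))
;; => true
```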

Now, if we want to compose folds as functions of lists we have to restrict to some specific class of actions. We say that an action $f$ on lists, i.e. $f$ in $A_{[\alpha],\alpha}$, splits if there exists some function $f'\colon\alpha\to\alpha$ such that $\bar{x}\cdot a = \bar{x}\colon\!f'(a)$ for all $\bar{x}$ in $[\alpha]$ and $a$ in $\alpha$. Note that the actions defined in the examples above all split (with some formal imagination in the filter case). Given a splitting $\alpha$-action $f$ and an action $g$ of $\alpha$ on $\beta$, we define a function of actions $\mathsf{t}_f$ by

$$\mathsf{t}_f(g)(b, a) = g(b, f'(a))$$
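In Clojure terms this definition can be sketched as follows. The name `t` is hypothetical, and the function it takes plays the role of $f'$; on the two-argument arity the result behaves like the transducer returned by the one-argument `map`.

```clojure
;; Hypothetical name `t`: given f' (the function that a splitting action
;; applies to each element), turn an action g into the action t_f(g)
(defn t [f']
  (fn [g]
    (fn [b a] (g b (f' a)))))

(reduce ((t inc) +) 0 (range 9))     ;; => 45
(reduce ((map inc) +) 0 (range 9))   ;; => 45
(reduce ((t inc) conj) [] (range 3)) ;; => [1 2 3]
```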

We can now state a transducing property for folds of splitting actions in terms of $\mathsf{t}_f$: for every splitting $f$ and every $g$ we have:

$$\mathsf{fold}_{g\,b}\circ\mathsf{fold}_{f\,l} = \mathsf{fold}_{\,\mathsf{t}_f(g)\,l'}\quad\quad\mathsf{(transducing\, property)}$$

where $l'$ is $\mathsf{fold}(g,b,l)$.

To prove the transducer lemma it's enough, by fusion, to show that $\mathsf{fold}_{g\,b}$ is a morphism between the actions $f$ and $\mathsf{t}_f(g)$:

$$\begin{alignedat}{2} \mathsf{fold}(g,b,\bar{x}\cdot a) &=&&\\ &\small{\mathit{splitting}}&&\\ &\quad\quad=\mathsf{fold}(g,b,\bar{x}\colon f'(a))=&&\\ &&\small{\mathit{fold}}&\\ &&&=g(\mathsf{fold}(g,b,\bar{x}), f'(a)) =\\ &&&=\mathsf{t}_f(g)(\mathsf{fold}(g,b,\bar{x}), a) \end{alignedat}$$

Let's apply the lemma above to a simple Clojure case where $f$ is `((map inc) conj)` and $g$ is `+`, then $f$ splits (by definition) and $\mathsf{t}_f(g)$ is `((map inc) +)`:

(=' (comp (partial reduce + 0)
          (partial map inc))
  
    (comp (partial reduce + 0)
          (partial reduce ((map inc) conj) []))

    (partial reduce ((map inc) +) 0))          
Map {:result: true, :num-tests: 100, :seed: 1560424115164}

Now since function composition is associative, repeating the step above we also get for instance

(=' (comp (partial reduce + 0)
          (partial map inc)
          (partial filter odd?))

    (comp (partial reduce ((map inc) +) 0)
          (partial reduce ((filter odd?) conj) []))

    (partial reduce ((filter odd?) ((map inc) +)) 0)
    
    (partial reduce ((comp (filter odd?) (map inc)) +) 0))
Map {:result: true, :num-tests: 100, :seed: 1560424115463}

which explains the contravariant behaviour in the composition of transducers, with respect to the composition of the non-transduced form.
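A small sketch makes the order visible: swapping the two transducers inside `comp` changes the result, and each composed transducer matches the sequence pipeline read in the opposite direction.

```clojure
;; filter first, then map: the odd numbers 1 3 5 are incremented to 2 4 6
(transduce (comp (filter odd?) (map inc)) + 0 (range 6))
;; => 12
(reduce + 0 (map inc (filter odd? (range 6))))
;; => 12

;; map first, then filter: 1..6 are filtered down to 1 3 5
(transduce (comp (map inc) (filter odd?)) + 0 (range 6))
;; => 9
(reduce + 0 (filter odd? (map inc (range 6))))
;; => 9
```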

Stateful transducers and cat

There are some transducers which escape the pure form of splitting actions as defined above, most notably

(clojure.repl/source cat)

to flatten list outputs on the fly:

(let [coll [{:a [1]} {:a [2]} {:a [3]}]
      xf (comp (map :a) cat)]

 (reduce (xf +) 0 coll))
6
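To see why `cat` does not fit the splitting form, here is a simplified sketch under the name `cat'` (hypothetical). It only covers the two-argument arity and skips the handling of `reduced` values that the real `cat` takes care of. The point is that the wrapped action is called once per element of each input, so zero, one or many times, rather than exactly once.

```clojure
;; Hypothetical, simplified cat': every input xs is itself a collection and
;; the wrapped action rf is folded over all of its elements
(defn cat' [rf]
  (fn [acc xs] (reduce rf acc xs)))

(reduce (cat' +) 0 [[1 2] [] [3]])
;; => 6
(reduce ((comp (map :a) cat') +) 0 [{:a [1]} {:a [2]} {:a [3]}])
;; => 6
```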

and stateful transducers, like say the 0-ary form of

(clojure.repl/source distinct)

or the 1-ary form of take-while

(clojure.repl/source take-while)

which uses the reduced trick to short-circuit the fold, allowing for very nice stuff like

(def terms [{:do true :val 1} 
            {:do true :val 2}
            {:do true :val 2}
            {:do true :val 1}
            {:do true :val 3}
            {:do false :val 4}
            {:do true :val 5}
            {:do true :val 6}])

(let [xf (comp (take-while :do)
               (map :val)
               (distinct))]

  (reduce (xf +) 0 terms))
6
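The short-circuiting can be sketched with a simplified, two-arity-only version of the transducer, here called `take-while'` (a hypothetical name). It relies on `reduce` stopping as soon as the accumulator is wrapped in `reduced`; the hand-written `fold` from the beginning of this article has no such check and would not stop.

```clojure
;; Hypothetical, simplified take-while': as soon as the predicate fails, the
;; accumulator is wrapped in `reduced`, which tells reduce to stop
(defn take-while' [pred]
  (fn [rf]
    (fn [acc x]
      (if (pred x)
        (rf acc x)
        (reduced acc)))))

;; reduce honours `reduced`, so even an infinite input terminates
(reduce ((take-while' #(< % 4)) +) 0 (range))
;; => 6
```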

If you'd like to discuss this, find me at @lo_zampino on Twitter.