Table of contents

*This post grew out of the big monoid discussion on the haskell-cafe mailing list.*

A very powerful application of monoids are 2-3 finger trees, first described by Ralf Hinze and Ross Patterson.

Basically, they allow you to write fast implementations for pretty much every abstract data type mentioned in Okasaki’s book on purely functional data structures. For example, you can do sequences, priority queues, search trees and priority search queues. Moreover, any fancy and custom data structures like interval trees or something for stock trading are likely to be implementable in this framework as well.

How can one tree be useful for so many different data structures? The answer: *monoids*! Namely, the finger tree works with elements that are related to a monoid, and all the different data structures mentioned above arise by different choices for this monoid.

Let me explain how this monoid magic works.

We begin with the simplest of all data structures, the linked list. As you well know, retrieving the `head`

is fast but random access is much slower:

`xs !! n`

needs O(n) i.e. linear time to retrieve the n-th element of the list. We would like to create a faster list-like data structure that reduces this to O(log n) i.e. logarithmic time.

For that, we use a binary tree that stores the elements `a`

at the leaves. Furthermore, every node is annotated with a value of type `v`

```
data Tree v a = Leaf v a
| Branch v (Tree v a) (Tree v a)
```

In other words, our trees look like this

```
v
/ \
v v
/ \ / \
v v v v
a a a / \
v v
a a
```

The leaves store the elements of our list from left to right.

```
toList :: Tree v a -> [a]
toList (Leaf _ a) = [a]
toList (Branch _ x y) = toList x ++ toList y
```

Annotations are fetched by

```
tag :: Tree v a -> v
tag (Leaf v _) = v
tag (Branch v _ _) = v
```

We can also implement the `head`

operation which retrieves the leftmost element

```
head :: Tree v a -> a
head (Leaf _ a) = a
head (Branch _ x _) = head x
```

Ok, so accessing the 1st leaf was easy, how about the 2nd, 3rd, the n-th leaf?

The solution is to annotate each subtree with its size.

`type Size = Int`

Our example tree has 5 leaves in total and the subtree on the right contains 3 leaves.

```
5
/ \
2 3
/ \ / \
1 1 1 2
a a a / \
1 1
a a
```

Thus, we set `v = Size`

and we want the annotations to fulfill

```
tag (Leaf ..) = 1
tag (Branch .. x y) = tag x + tag y
```

We can make sure that they are always correct by using *smart constructors*: instead of using `Leaf`

and `Branch`

to create a tree, we use custom functions

```
leaf :: a -> Tree Size a
leaf a = Leaf 1 a
branch :: Tree Size a -> Tree Size a -> Tree Size a
branch x y = Branch (tag x + tag y) x y
```

which automatically annotate the right sizes.

Given size annotations, we can now find the n-th leaf:

```
(!!) :: Tree Size a -> Int -> a
(Leaf _ a) !! 0 = a
(Branch _ x y) !! n
| n < tag x = x !! n
| otherwise = y !! (n - tag x)
```

And assuming that our tree is balanced, this will run in O(log n) time. But for now, let’s ignore balancing which would become relevant when implementing `cons`

or `tail`

.

Let’s consider a different data structure, the priority queue. It stores items that have different “priorities” and always returns the most urgent one first. We represent priorities as integers and imagine them as points in time so the smallest ones are more urgent.

`type Priority = Int`

Once again, we use a binary tree. This time, we imagine it as a tournament tree, so that every subtree is annotated with the smallest priority it contains

```
2
/ \
4 2
/ \ / \
16 4 2 8
a a a / \
32 8
a a
```

In other words, our annotations are to fulfill

```
tag (Leaf .. a) = priority a
tag (Branch .. x y) = tag x `min` tag y
```

with corresponding smart constructors. Given the tournament table, we can reconstruct the element that has the smallest priority in O(log n) time

```
winner :: Tree Priority a -> a
winner t = go t
where
go (Leaf _ a) = a
go (Branch _ x y)
| tag x == tag t = go x -- winner on left
| tag y == tag t = go y -- winner on right
```

Again, we forgo balancing and thus insertion or deletion.

As we can see, one and the same tree structure can be used for two quite different purposes, just by using different annotations. And by recognizing that the tags form a *monoid*, we can completely unify both implementations. Moreover, the retrieval operations `(!!)`

and `winner`

are actually special cases of **one and the same** function!

For brevity, we will denote the associative operation of a monoid with `<>`

`(<>) = mappend`

Think of the `<>`

as a small diamond symbol.

The observation is that we obtain the tag of a branch by combining its children with the monoid operation

`tag (Branch .. x y) = tag x <> tag y`

of the following monoid instances

```
instance Monoid Size where
mempty = 0
mappend = (+)
instance Monoid Priority where
mempty = maxBound
mappend = min
```

Hence, a unified smart constructor reads

```
branch :: Monoid v => Tree v a -> Tree v a -> Tree v a
branch x y = Branch (x <> y) x y
```

For leaves, the tag is obtained from the element. We can capture this in a type class

```
class Monoid v => Measured a v where
measure :: a -> v
```

so that the smart constructor reads

```
leaf :: Measured a v => a -> Tree v a
leaf a = Leaf (measure a) a
```

For our examples, the instances would be

```
instance Measured a Size where
measure _ = 1 -- one element = size 1
instance Measured Foo Priority where
measure a = priority a -- urgency of the element
```

How does the annotation at the top of a tree relate to the elements at the leaves? In our two examples, it was the total number of leaves and the least priority respectively. These values are *independent* of the actual shape of the tree. Thanks to the *associativity* of `<>`

, this is true for any monoid. For instance, the two trees

```
(v1<>v2) <> (v3<>v4) v1 <> (v2<>(v3<>v4))
/ \ / \
/ \ v1 v2 <> (v3<>v4)
/ \ a1 / \
v1 <> v2 v3 <> v4 v2 v3 <> v4
/ \ / \ a2 / \
v1 v2 v3 v4 v3 v4
a1 a2 a3 a4 a3 a4
```

have the same annotations

`(v1<>v2) <> (v3<>v4) = v1 <> (v2<>(v3<>v4)) = v1 <> v2 <> v3 <> v4`

as long as the sequences of leaves are the same. In general, the tag at the root of a tree with `n`

elements is

`measure a1 <> measure a2 <> measure a3 <> ... <> measure an`

While independent of the shape of the branching, i.e. on the placement of parenthesis, this may of course depend on the order of elements.

It makes sense to refer to this combination of measures of all elements as the `measure`

of the tree

```
instance Measured a v => Measured (Tree a v) v where
measure = tag
```

Thus, every tree is annotated with its `measure`

.

Our efforts culminate in the unification of the two search algorithms `(!!)`

and `winner`

. They are certainly similar; at each node, they descend into one of the subtrees which is chosen depending on the annotations. But to see their exact equivalence, we have to ignore branches and grouping for now because this is exactly what associativity “abstracts away”.

In a sequence of elements

`a1 , a2 , a3 , a4 , ... , an`

how to find say the 3rd one? Well, we scan the list from left to right and add 1 for each element encountered. As soon as the count exceeds 3, we have found the 3rd element.

```
1 -- is not > 3
1 + 1 -- is not > 3
1 + 1 + 1 -- is not > 3
1 + 1 + 1 + 1 -- is > 3
...
```

Similarly, how to find the element of a least priority say `v`

? Well, we can scan the list from left to right and keep track of the minimum priority so far. We have completed our search once it becomes equal to `v`

.

```
v1 -- still bigger than v
v1 `min` v2 -- still bigger than v
v1 `min` v2 `min` v3 -- still bigger than v
v1 `min` v2 `min` v3 `min` v4 -- equal to v!
...
```

In general terms, we are looking for the position where a predicate `p`

switches from `False`

to `True`

.

```
measure a1 -- not p
measure a1 <> measure a2 -- not p
measure a1 <> measure a2 <> measure a3 -- not p
measure a1 <> measure a2 <> measure a3 <> measure a4 -- p
... -- p
```

In other words, we are looking for the position `k`

where

```
p (measure a1 <> ... <> measure ak) is False
p (measure a1 <> ... <> measure ak <> measure a(k+1)) is True
```

The key point is that `p`

does not test single elements but combinations of them, and this allows us to do binary search! Namely, how to find the element where `p`

flips? Answer: divide the total `measure`

into two halves

```
x <> y
x = measure a1 <> ... <> measure a(n/2)
y = measure a(n/2+1) <> ... <> measure an
```

If `p`

is `True`

on the first half, then we have to look there for the flip, otherwise we have to search the second half. In the latter case, we would have to split `y = y1 <> y2`

and test `p (x <> y1)`

.

In the case of our data structures, the tree shape determines how the `measure`

is split into parts at each step. Here is the full procedure

```
search :: Measured a v => (v -> Bool) -> Tree v a -> Maybe a
search p t
| p (measure t) = Just (go mempty p t)
| otherwise = Nothing
where
go i p (Leaf _ a) = a
go i p (Branch _ l r)
| p (i <> measure l) = go i p l
| otherwise = go (i <> measure l) p r
```

Since we have annotated each branch with its `measure`

, testing `p`

takes no time at all.

Of course, this algorithm only works if `p`

really does flip from `False`

to `True`

exactly once. This is the case if `p`

fulfills

`p (x) implies p (x <> y) for all y`

and we say that `p`

is a *monotonic* predicate. Our two examples `(> 3)`

and `(== minimum)`

have this property and thus, we can finally conclude with

```
t !! k = search (> k)
winner t = search (== measure t)
```

I hope you have enjoyed this excursion into the land of trees and monoids. If you want to stay a bit longer, implement a data structure that do both look up the `k`

-th element and retrieve the element with the least priority at the same time. This is also known as *priority search queue*.

If you still long for more, the finger tree paper knows the way; I have tried to closely match their notation. In particular, they solve the balancing issue which turns the binary search on monoids into a truly powerful tool to construct about any fancy data structure with logarithmic access times you can imagine.