On types and tries

In Haskell, as in linguistic semantics, the universe is partitioned into types.

In semantics, DPs have type (et)t, transitive verbs type eet, nouns et, and adjectives (et)et.1 These types dictate what can combine with what. The Haskell function (++), pronounced ‘append’, which combines two lists (so that ['a','b','c'] ++ ['d','e'] = ['a','b','c','d','e']), is restricted to work only on lists. A program which tries to append non-lists will be rejected by ghc before even running.2 Dictating what can combine with what is not only a restriction on what is permitted, but it is also an assurance: a function of type a -> b makes a promise that whenever you give it an a, you will get a b back.

Haskell’s type system is more sophisticated than what we typically use in linguistic semantics, as it makes use of polymorphism. For example, the function head takes a list as argument, and returns its first element. It will work on any list, regardless of the type of things it contains. We might describe its type in words in the following way:

For any type a, head takes as its input a list of elements of type a, and returns an element of type a

The above description of the type of the head function is rendered into the Haskell language as follows:

head :: [a] -> a

Here the explicit quantification over types in the English description (‘for any type a') has been made implicit in Haskell.3 The fact that the a is being quantified over is syntactically explicit because it begins with a lower case character. Type names beginning in lower case are variables. We could have chosen any other variable name, for example typeOfElement:

head :: [typeOfElement] -> typeOfElement

All types in Haskell either begin with capital letters (Int, Char, String, etc) or are non-character symbols (such as the unit type, ()). This holds true for type functions as well, such as Trie and Tree, which begin with a capital letter, or the list type function [], which consists of square brackets. When you define a new type, you must adhere to this convention as well.

Defining new types in Haskell

There are two main ways to define new types in Haskell. Using the keyword type, you can define a nickname for a previously existing type. For example, the type String is just a nickname for lists of characters. We could have defined it ourselves in the following way:

type String = [Char]

If we query ghc about this type, we get the following information:

> :info String
type String :: *
type String = [Char]
        -- Defined in 'GHC.Base'

This tells us that Haskell views strings as a type (by writing type String :: *), and also shows its definition, along with where it was defined (in this case in the module GHC.Base).

In the previous post, we defined tries as lists of trees with nodes being pairs of some type with a boolean value:

type Trie a = [Tree (a,Bool)]

If we query ghc about this type, we get the following information:

> :info Trie
type Trie :: * -> *
type Trie a = [Tree (a,Bool)]
        -- Defined at trie.hs:22:1

This tells us that Haskell views tries as a type function (by writing type Trie :: * -> *), and also shows its definition, along with where it was defined (in this case in the program file I wrote in line 22, starting at character 1).

Note that a trie as here defined is a type function, as it requires a type a of contents. We can imagine rewriting the above so as to make the abstraction over the argument explicit:

type Trie = \\a -> [Tree (a,Bool)] -- NOT VALID HASKELL!!!

Here, the double slash ‘\\’ is supposed to be the type level version of the single slash ‘\’ which is used at the term level to represent lambda abstraction. Unfortunately, there is no ‘type level’ lambda abstraction in Haskell, and so the above formulation of tries is only good for our imagination.

The entire point of defining nicknames (using the type keyword) is that they can make things easier for us to read, especially if the types we are using get very complicated.

More useful is populating the universe with new things! This is done with the keyword data. We defined trees as a wholly new kind of thing using this keyword:

data Tree a = Node a [Tree a]

If we ask ghc about this type, we get the following information:

> :i Tree
type Tree :: * -> *
data Tree a = Node a [Tree a]
       -- Defined at trie.hs:2:1

This tells us again that the type Tree is a function from types to types, along with how and where it was defined.

When we define new kinds of things, we must also define how such things are created. We do this by providing an exhaustive list of cases on the right hand side of a data declaration. In the case of Tree, there is only one way to create a tree containing elements of some type a, and that is by applying the type constructor Node first to some object of type a, and then to a list of trees (containing elements of that type). We can ask ghc what it knows about Node:

> :t Node
Node :: a -> [Tree a] -> Tree a

Haskell thinks that Node, the constructor for the type Tree, is a function from entities and lists of trees to a tree. This can be represented more perspicuously via an alternative syntax4

data Tree a where
  Node :: a -> [Tree a] -> Tree a

Here we see that we are defining a new type (Tree), along with ways of creating such objects (here only using Node).

We are not limited to a single way of constructing a new type. We can define a type of (card) suits:

data Suit = Clubs | Diamonds | Hearts | Spades

Here we have four ways of constructing an object of type Suit. In the GADT notation, this becomes:

data Suit where
  Clubs :: Suit
  Diamonds :: Suit
  Hearts :: Suit
  Spades :: Suit

Here we see explicitly that our four type constructors are simply the four ways of being a suit.

Although type constructors look like functions (or terms), they must begin with a capital letter (or in some cases a symbol). This is because they are treated differently than normal functions. Whereas normal functions perform computations (manipulating their inputs to construct a result) type constructors already are a result. There is no more computation to be done. Thus, Clubs already is a term of type Suit, and Node "hello" [] already is a term of type Tree String.

We can look at more complicated data types, which make more interesting use of type constructors. We can imagine a tiny programming language:

data Program a where
  Value :: a -> Program a
  Add :: Program Int -> Program Int -> Program Int
  Equals :: Program a -> Program a -> Program Bool
  IfThenElse :: Program Bool -> Program a -> Program a -> Program a
  Print :: Program a -> Program ()
  Sequence :: Program () -> Program a -> Program a

We have defined the type function Program, the objects inhabiting which can be constructed in the six ways given above. An object of type Program a for any type a we can think of intuitively as a program for computing a value of type a. We can turn any object of any type into a program for computing itself using the constructor Value. The constructor Add lets us combine two programs for computing integers into a larger one for computing the same. The constructor Equals combines two programs computing objects of the same type into a program for constructing a boolean value. The constructor IfThenElse implements a conditional branching operator. The program Print prints the value computed by its input program to the screen, and then does nothing (()). The program Sequence combines a program which computes nothing with another that computes something. One object of type Program () is given below:

Sequence
  (IfThenElse (Equals
                (Add (Value 3) (Value 4))
                (Value 7))
    (Print (Value "yes, 3 + 4 = 7"))
    (Print (Value "no, 3 + 4 /= 7")))
  (Print (Value "all done"))

This object intuitively represents the following program:

if [ $((3 + 4)) -eq 7 ]; then
    printf "yes, 3 + 4 = 7"
else
    printf "no, 3 + 4 /= 7"
fi
printf "all done"

Properties of types

Typeclasses describe properties of types, or type functions. Having properties available in our syntax allows us to restrict quantification over types; we can write things like forall a. P a => Q a (for all a, if P a then Q a) instead of just forall a. Q a (for all a, Q a). Here, I use the ‘double arrow’ => as implication to avoid confusion with the single arrow -> which is used in Haskell for function spaces. As a concrete example, consider a function of type [a] -> a. Such a function will take a list of elements of any type, and return a single element of that type. Can such a function exist? You might look above in this post, and answer ‘yes’, as head has just this type! However, head does not actually fulfill the promise that its type makes; there is one list that it fails to provide an output for!

> head []
 *** Exception: Prelude.head: empty list

Given a non-empty list, there are many ways of returning some element it might contain; you might take the first element, the second element, … the last element. However, given the empty list, you must conjure an element of an arbitrary type out of thin air! While we might be prepared to do this in certain circumstances, perhaps with Char we can choose 'a', with Int we can choose 7, with String we can choose "a black winter day", and with a -> a we can choose \x -> x. But how should we choose an element of Void (the type with no inhabitants)?!

One natural solution is to say that we do not have a function of type [a] -> a (forall a. [a] -> a), but rather one of type Pointed a => [a] -> a (forall a. Pointed a => [a] -> a), where Pointed a restricts the quantification to just those types which are pointed, in the sense that they have a predetermined element which you can create at will, via a function point. Haskell lets us define what it means for a type to have a property by means of the typeclass mechanism.

class Pointed a where
  point :: a

A type has the property of being pointed just in case it has a designated point. We can specify that a particular type is pointed by defining a typeclass instance for it:

instance Pointed Char where
  point = 'a'
instance Pointed Int where
  point = 7
instance Pointed String where
  point = "a black winter day"
instance Pointed (a -> a) where
  point = \x -> x

Then we can define a function which will give us a value of type a for any list of elements of that type provided that a is a pointed type.

headPointed :: Pointed a => [a] -> a
headPointed [] = point
headPointed (c:cs) = c

While this example is somewhat contrived, there are more relevant properties that types could have. In particular, if a type has the structure of a monoid, then it not only has a designated element e, but also an associative binary operation + which satisfies the identity laws below:

  1. e + x = x
  2. x + e = x

A type is monoidal if it is an instance of the typeclass Monoid:5

class Monoid a where
  mempty :: a
  mappend :: a -> a -> a

If we want to create a function which will return an element of type a given a list of such, provided a is monoidal, then we might wish to combine the elements of the input list using mappend:

mconcat :: Monoid a => [a] -> a
mconcat [] = mempty
mconcat (x:xs) = mappend x (mconcat xs)

Lists are monoidal, where the mempty list is the empty list, and mappend is list concatenation:

instance Monoid [a] where
  mempty = []
  mappend = (++)

Our function mconcat would take a list of lists and concatenate them all together from left to right:

> mconcat ["abc","de","fghijk"]
"abcdefghijk"

Some more built-in typeclasses are Eq (for the types of things which can be tested to see whether they are equal), Ord (for the types of things which can be compared to see which is bigger), and Show (for the types of things which can be represented as strings). Whenever we write thing1 == thing2, we are assuming that thing1 and thing2 belong to a type which is an instance of the class Eq.

Making tries

Although we have defined tries, and built functions to search through them, we still have no practical way of constructing them (other than by hand). We walk though the definition of a function mkTrie, which takes a list of strings as input, and assembles a character trie.

We will define this function in terms of a simpler function addToTrie, which takes a single string and a trie as input, and adds that string to the trie. Then mkTrie is just the result of applying this function over and over to each element of its input list, starting from the empty trie.

-- mkTrie [a1,a2,...,ak] = addToTrie a1 (addToTrie a2 ... (addToTrie ak []))
mkTrie [] = []
mkTrie (s:ss) = addToTrie s (mkTrie ss)

This kind of pattern occurs over and over when defining functions over lists; when the input list is empty, do one thing, and when the input list has at least one element, combine that element somehow with the result of performing the function to the remainder of the list:

f [] = whatToDoWhenEmpty
f (s:ss) = howToCombine s (f ss)

Such repeated patterns should be reified into a single generic function, so as to avoid code duplication. This generic function must be told what to do when the input list is empty, and how to combine list elements with the result of the computation in the other case.

foldr howToCombine whatToDoWhenEmpty = f
  where
    f [] = whatToDoWhenEmpty
    f (s:ss) = howToCombine s (f ss)

Having reified this pattern, we can reimplement mkTrie in terms of it:

mkTrie = foldr addToTrie []

Now how should we define addToTrie? We can think of some simple cases first. When the string we want to add is empty, we simply give back the trie we started with.

addToTrie [] trie = trie

Similarly, if the trie is empty, and we are adding a single letter, we just add a leaf.

addToTrie (c:[]) [] = leaf : []
  where
    leaf = Node (c,True) []

If the trie is empty, and we are adding a longer word, we create a single, unary branching tree from this word, and add it to the trie.

addToTrie (c:cs) [] = tree : []
  where
    tree = Node (c,False) (addToTrie cs [])

If the trie is not empty, and we are adding a single letter, we check to see whether that letter is at the root of the first tree in the trie. If it is, then we can add this letter to our trie by marking the root of the first tree as one at which we can stop (by changing its boolean part to True). If it is not, we ignore it and look deeper into the trie.

addToTrie w@(c:[]) (t@(Node (b,bool) bs) : ts)
  | b == c = (Node (b,True) bs): ts
  | otherwise = t : addToTrie w ts

If the trie is not empty, and we are adding a longer word, we check to see whether that letter is at the root of the first tree in the trie. If it is, then we try to add the rest of the word to this tree. If it is not, we ignore it and look deeper into the trie.

addToTrie w@(c:cs) (t@(Node (b,bool) bs) : ts)
  | b == c =  (Node (b,bool) (addToTrie cs bs)): ts
  | otherwise = t : addToTrie w ts

This function will do what we want! However, it seems as though there is a lot of shared code in the definitions above. We can try to make things more concise. First, let us collapse the two cases where the trie is empty:

addToTrie (c:cs) [] = if null cs
                      then Node (c,True) [] : []
                      else Node (c,False) (addToTrie cs []) : []

We were able to collapse the two case distinctions into one by only checking to see whether the rest of the list was empty on the right hand side. Now we can see more clearly the code duplication. Note that we can make the first case even more similar to the second by rewriting [] as the equivalent addToTrie cs [] (here cs is known to be empty)!

addToTrie (c:cs) [] = if null cs
                      then Node (c,True) (addToTrie cs []) : []
                      else Node (c,False) (addToTrie cs []) : []

Now we see that the only difference between the cases in which the rest of the list is empty or not lies in the boolean value attached to the root, which is True just in case null cs is, and False otherwise. We can thus unify these two cases by making the boolean value at the root the result of the emptiness check.

addToTrie (c:cs) [] =  Node (c,null cs) (addToTrie cs []) : []

This single case replaces the original two cases.

We can now return to the two cases for non-empty tries. We first try to collapse them into one by making the emptiness check for the rest of the input list on the right hand side of the equation. As both cases do the same thing in case the first letter of the input list is different from the node at the root of the first tree in the trie, we only need to test for emptiness in one case.

addToTrie w@(c:cs) (t@(Node (b,bool) bs) : ts)
  | b == c = if null cs
             then (Node (b,True) bs): ts
             else (Node (b,bool) (addToTrie cs bs)): ts
  | otherwise = t : addToTrie w ts

As before, we can make the two cases more similar to each other by replacing bs in the first case of the condition with addToTrie cs bs (where here cs is known to be empty)

addToTrie w@(c:cs) (t@(Node (b,bool) bs) : ts)
  | b == c = if null cs
             then Node (b,True) (addToTrie cs bs) : ts
             else Node (b,bool) (addToTrie cs bs) : ts
  | otherwise = t : addToTrie w ts

Now we see that the only difference between the original two cases lies in whether the boolean value of the root is set to True or continues to be the same as before. We would like to localize the difference to just this point.

addToTrie w@(c:cs) (t@(Node (b,bool) bs) : ts)
  | b == c = Node (b,if null cs then True else bool) (addToTrie cs bs) : ts
  | otherwise = t : addToTrie w ts

We can simplify statements of the form if test then True else b using the boolean connective or: b || test:

addToTrie w@(c:cs) (t@(Node (b,bool) bs) : ts)
  | b == c = Node (b, (bool || null cs)) (addToTrie cs bs) : ts
  | otherwise = t : addToTrie w ts

The full code for the function mkTrie is summarized below:

mkTrie :: Eq a => [[a]] -> Trie a
mkTrie = foldr addToTrie []

addToTrie :: Eq a => [[a]] -> Trie a -> Trie a
addToTrie (c:cs) [] =  Node (c,null cs) (addToTrie cs []) : []
addToTrie w@(c:cs) (t@(Node (b,bool) bs) : ts)
  | b == c = Node (b, (bool || null cs)) (addToTrie cs bs) : ts
  | otherwise = t : addToTrie w ts

  1. You might be more familiar with a notation that places arrows between types. In this notation, the DP type (et)t would be written (e→t)→t. ↩︎

  2. For example, trying to run ['a','b','c'] ++ 'd' will get you the error message Couldn't match expected type '[Char]' with actual type 'Char'. This reports that ‘d’ is a character (of type Char), but the function append only works over lists (of characters in this case). ↩︎

  3. There is a language extension you can set, called ExplicitForAll, which allows you to make the quantification over types explicit. You can then write head :: forall a. [a] -> a. ↩︎

  4. This syntax extension must be enabled via the GADTs language option. ↩︎

  5. The monoid typeclass pre-built into Haskell has an additional operation, mconcat :: [a] -> a, which I have not included below. ↩︎