Finite State Tree Transducers
1 Tree Transducers
A Tree transducer implements a function (or relation) from trees to trees. In contrast to strings, which have a left/right symmetry, trees have a very different structure depending on whether one navigates them top-down or bottom-up. We will focus on bottom-up transductions here.1
Recall that we are using unranked trees; trees where the number of daughters that a node may have is not determined by its label. The reason for this is simply because it accords with linguistic practice; a VP node might have one daughter (in case of an intransitive V) or two daughters (in case it has a specifier), or three daughters (in some older analyses prior to the widespread adoption of a binary branching hypothesis). A simple inductive definition of such objects can be given as follows: a tree consists of a node label followed by a list of trees.
data Tree a = Node a [Tree a] deriving (Show,Eq)
According to this definition, every node has a list of daughter
subtrees. A leaf is a Node
with an empty list of daughter
subtrees.
We begin with the concept of a tree homomorphism. A homomorphism can be thought of as a transducer with just one state. Thus, if a transducer is homomorphic, it cannot use the context of a node in any way to influence its translation. A homomorphism is therefore fully determined by its kernel – a function which specifies how to interpret each node in isolation. Note that a node is interpreted as a way of combining the interpretations of its daughters. So if a subtree is interpreted as a tree, a node must be interpreted as a way of combining a list of trees into another tree.
type Kernel a b = a -> [b] -> Maybe b type Hom a b = Tree a -> Maybe b extend :: Kernel a b -> Hom a b extend ker (Node a ts) = traverse (extend ker) ts >>= ker a {- traverse :: (a -> Maybe b) -> [a] -> Maybe [b] if any element of the input list is mapped to @Nothing@, then the output is @Nothing@. otherwise, behaves as @map@, ignoring the @Just@ wrapped around the outputs: traverse f [] = Just [] traverse f (b:bs) = case f b of Nothing -> Nothing Just x -> case traverse f bs of Nothing -> Nothing Just xs -> Just (x : xs) (>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b If the input is @Nothing@, so is the output. Otherwise, applies the function inside of the @Just@: Nothing >>= f = Nothing Just x >>= f = f x -}
Note that each node in isolation must be interpreted as an operation that combines the translations of its daughters into some new output. Specifying (and then varying) these operations will give us a wide variety of different tree transducer models.
1.1 Bottom up tree acceptors
We begin however by looking not at transducers, but instead at acceptors! Acceptors must assign a tree to one of some finite number of categories, and on that basis decide to accept or reject the tree. Thus if subtrees are 'translated' into categories (or states), nodes must be functions from lists of categories to categories. We derive this function from a list of 'rules'.
type AccRule a b = ((a,[b]),b) mkAccKernel :: (Eq a, Eq b) => [AccRule a b] -> Kernel a b mkAccKernel rs a bs = lookup (a,bs) rs {- lookup :: a -> [(a,b)] -> Maybe b The function @lookup@ takes an object and a list of pairs, and searches for a pair whose first component is the object we were given. In case one is found, the second component is returned (wrapped in a @Just@). If no such pair is present, @Nothing@ is returned. -}
While we could simply define a kernel directly (instead of via a list of rules), deriving them from rules restricts the kinds of homomorphisms we can define, to just those definable as an automaton. Functions in Haskell can have recursion, can pattern match, can do conditional branching, etc. None of these are part of how we want to define automata to work.
Given a designated subset of 'final' states, a state homomorphism can be used to accept or reject trees.
mkAcceptingHom :: (Eq a, Eq states) => [AccRule a states] -> [states] -> Hom a Bool mkAcceptingHom rs finals = fmap (`elem` finals) . extend (mkAccKernel rs) {- The function @fmap@ applies a function to a @Maybe@ argument: fmap f Nothing = Nothing fmap f (Just x) = Just (f x) -}
It is inconvenient, however, to have two ways of rejecting an input: Nothing
means that the input was not successfully parsed, while Just False
means that it was parsed, but rejected. We can turn a Hom a Bool
into a function delivering a truth value (i.e. unwrapping the Maybe
) by using the function or
:
acceptOrReject :: Hom a Bool -> Tree a -> Bool acceptOrReject h = or . h {- or :: Maybe Bool -> Bool or (Just b) = b or Nothing = False -} mkAcceptor :: (Eq a, Eq states) => [AccRule a states] -> [states] -> Tree a -> Bool mkAcceptor rs fins = acceptOrReject $ mkAcceptingHom rs fins
Of course, a (bottom up) tree automata can also be characterized as a machine which moves from leaves to root.
data BUTA states alph = BUTA {stateSet :: [states] , finals :: [states] , alphabet :: [alph] , transitions :: alph -> [states] -> Maybe states}
The transition function of such a machine can be extended so that it operates over entire trees. The somewhat too verbose code is intentionally so-written to highlight similarities with later, necessarily verbose code.
deltaBUTA :: (alph -> [states] -> Maybe states) -> Tree alph -> Maybe states deltaBUTA d (Node a ts) = do -- first, run the machine on the daughters states <- traverse (deltaBUTA d) ts -- then, determine the state of the tree based on the states of its daughters s <- d a states return s
The behaviour of such a machine (accepting or rejecting an input) depends on whether the state the machine associates with the input is a final state.
acceptBUTA :: Eq states => BUTA states a -> Tree a -> Bool acceptBUTA bu = any (`elem` finals bu) . deltaBUTA (transitions bu) {- @any@ applies a property to each element of a structure, and returns @True@ just in case the property was true of some element. In our case, we have an object of type @Maybe states@, which may or may not contain an object of type @states@. So @any@ will be true just in case the object is of the form @Just s@, and @s@ is an element of the final state set. @any f@ is equivalent to @or . fmap f@ -}
We can construct a tree automaton from a set of acceptor rules. When so doing, the transition function is simply the kernel so derived.
mkBUTA :: (Eq states, Eq alph) => [AccRule alph states] -> [states] -> BUTA states alph mkBUTA rs fins = BUTA allStates fins letters (mkAccKernel rs) where (allLettersDups,allStatesDups) = unzip $ fmap (\((a,bs),c) -> (a,c:bs)) rs letters = L.nub allLettersDups allStates = L.nub $ concat (fins : allStatesDups)
1.2 Bottom up tree transducers
Next we look at functions that translate each subtree of a tree as a tree. Then each node of a tree must be a function from (a list of) trees to a tree. However, not just any function will do! The functions we wish to use should not inspect their inputs – this would be smuggling in the ability to be influenced by context! Instead, they should simply arrange their input trees in a uniform way into a larger tree. This amounts to associating with each node a tree with holes at the leaves, and plugging the inputs into the holes. This kind of object, a tree with holes at the leaves, is called a tree context.
data Context a = NodeC a [Context a] | Hole Int deriving (Show,Eq)
A tree context is implemented must as a tree was, except that there is
the possibility to be a named Hole
; the name is given as a number.
Holes can never have any children beneath them, and so they are de
facto leaves. The point of a context is to derive an operation over
trees from it: put the input trees into the appropriate holes. We
define a function, tree_substitution
which turns a context into an
operation on lists of trees.
tree_substitution :: Context a -> [Tree a] -> Maybe (Tree a) -- when you have a hole with name i, put the ith element from the list in the hole, if one exists tree_substitution (Hole i) ts = get ts i -- if you have a tree, keep the node label, and substitute the holes in its daughters tree_substitution (NodeC a cs) ts = do dtrs <- flip tree_substitution ts `traverse` cs return (Node a dtrs)
It will be convenient to have a way of extracting the contents of a tree context.
getNodes :: Context a -> [a] getNodes (Hole _) = [] getNodes (NodeC a cs) = a : concat (getNodes <$> cs)
1.2.1 Tree Homomorphisms
A tree homomorphism is a finite association of tree contexts with
node/number of daughter pairs. The tree context associated with a
node/number of daughters pair should have the same number of holes as
that node has daughters. This invariant is not easily enforced at the
type level in Haskell
(although it would be in Coq
or Agda
), and
so we will not enforce it. We define a rule as a pair of a node
label/integer pair and a tree context; it is well-formed iff the
number of distinct holes in the tree context is less or equal to the
integer.
type HomRule a b = ((a,Int),Context b) -- intended interpretation of a rule (a,i) -> c is that c is a context -- whose holes' names are all less than i.
Given a set of such rules, we can create a kernel.
mkHomKernel :: Eq a => [HomRule a b] -> Kernel a (Tree b) mkHomKernel rules a bs = do t <- lookup (a,len) rules t' <- tree_substitution t bs return t' where len = length bs
In contrast to the case of accepting homomorphisms, of type Hom a
Bool
, we can create a homomorphism of type Hom a (Tree b)
by extending such a
kernel.
mkBUHom :: (Eq a, Eq b) => [HomRule a b] -> Hom a (Tree b) mkBUHom = extend . mkHomKernel
1.2.2 Tree Transducers
A (bottom-up) transduction can be viewed simply as a homomorphism which can take a limited (i.e. finite) amount of contextual information into account when replacing node labels with contexts. Such contextual information is called a state. We can define bottom up transducers given a set of modified rules, which instead of caring about the number of daughters had by a node, cares rather about the states of the machine after processing each of the daughter, and a list of final states.
type BURule inputNode state outputNode = ((inputNode,[state]),(state,Context outputNode))
Another way of thinking about this is that, if we modify our data structures from trees to pairs of states and trees, then our homomorphisms can implement more kinds of functions over trees.
mkBUKernel :: (Eq i,Eq s) => [BURule i s o] -> Kernel i (s,Tree o) mkBUKernel rules a bs = do (s,c) <- lookup (a,states) rules t <- tree_substitution c trees return (s,t) where (states,trees) = unzip bs
Our homomorphisms can be seen as the special case where the type of
states is ()
.2 If we have a set fs
of final
states, we can define a function which implements the transduction
from trees to trees of a set of rules.
transduce :: Eq s => [s] -> Hom a (s,b) -> Hom a b transduce fs h t = do (s,t') <- h t guard (s `elem` fs) return t' {- guard :: Bool -> Maybe () @guard@ takes an expression and evaluates its truth. If it is @True@, computation proceedes. If it is @False@, then @Nothing@ is returned. -}
Given a kernel, we can use transduce
to create a transducer. We
wrap this up into a single function, which takes a set of rules and a
final state, and produces a tree transduction
mkBUTrans :: (Eq a,Eq s) => [BURule a s b] -> [s] -> Hom a (Tree b) mkBUTrans rs fins = transduce fins $ extend $ mkBUKernel rs
It is however convenient to have a data structure more canonically representing the machine-perspective on transductions.
data BU s i o = BU {states :: [s] , iAlph :: [i] , oAlph :: [o] , finalSts :: [s] , trans :: i -> [s] -> Maybe (s,Context o)}
We see that we can define a function to obtain a bottom up machine
from a BURule
.
mkBU :: (Eq i,Eq s,Eq o) => [BURule i s o] -> [s] -> BU s i o mkBU rs fins = BU sts iAl oAl fins tr where (stsDups,iAlDups,oAlDups) = unzip3 $ map (\((i,ds),(s,c)) -> (s:ds,i, getNodes c)) rs sts = L.nub $ concat $ stsDups iAl = L.nub iAlDups oAl = L.nub $ concat oAlDups tr = curry (flip lookup rs)
Of course, what is of import is whether the bottom-up machine so
obtained is related in some way to the homomorphism derived from the
list of BURule
's. To make sense of that we need to create some
notion of what a bottom-up machine does. We can define functions that
map bottom-up machines to transductions from trees to trees.
deltaBU :: (i -> [s] -> Maybe (s,Context o)) -> Hom i (s,Tree o) deltaBU d (Node a ts) = do (states,trees) <- unzip <$> traverse (deltaBU d) ts (s,ctxt) <- d a states t <- tree_substitution ctxt trees return (s,t)
The claim is that, for any list rs
of rules, any tree t
, and a
list fs
of final states, transduce fs $ deltaBU $ d $ mkBU rs fs == transduce fs $ mkBUkernel rs
.
2 Multi transducers
As noted previously, the kind of objects that a transducer associates with subtrees can be changed. Here we explore allowing not just trees, but lists of trees to be the output of a transduction.3 (Multi-homomorphisms will not be treated separately, we can obtain them as the special case of a multi-transducer where states are of unit type (and thus record no information).) A multi-rule is much the same as before, except that it outputs a list of contexts. Another 'difference' (really, making explicit what was implicit before) is that the previous assumption that each state corresponds to just one translated tree no longer holds. Thus we need to indicate along with each state the number of translated trees it corresponds to.
type MultRule i s o = ((i,[(s,Int)]),(s,[Context o]))
A multirule will have the form:
\(f(q_{1}(t_{1,1},\ldots,t_{1,k_{1}}),\ldots,q_{n}(t_{n,1},\ldots,t_{n,k_{n}}))
\rightarrow q(c_{1},\ldots,c_{k})\) where f is the node of the tree
being processed, the \(q_{i}\) are the states of the transducer after
processing the \(i^{\mbox{th}}\) daughter, the \(t_{i,j}\)s are the
trees associated with the \(i^{\mbox{th}}\) state, and the \(c_{i}\)
are the contexts associated with the output state. Such a rule would
be rendered in the following way as our Haskell type MultRule
:
((f,[(q1,k1)..(qn,kn)]),(q,[c1..ck]))
Given a list of multi-rules, we can construct a Kernel
.
mkMultKernel :: (Eq i,Eq s,Eq o) => [MultRule i s o] -> Kernel i (s,[Tree o]) mkMultKernel rules i bs = do (sn,cs) <- lookup (i,ss) rules ts <- flip tree_substitution (concat trees) `traverse` cs return (sn,ts) where (states,trees) = unzip bs ss = zip states $ fmap length trees
We can again use transduce
to allow us to filter out those tree tuples which are associated with the wrong states
mkMultTrans :: (Eq a,Eq s,Eq b) => [MultRule a s b] -> [s] -> Hom a ([Tree b]) mkMultTrans rs fins = transduce fins $ extend $ mkMultKernel rs
A multi-transducer is also much the same as our previous transducer. The difference is of course the type of the transition function, which is of course quite similar to the type of a multi rule.
data MultBU s i o = MultBU {statesMult :: [s] , iAlphMult :: [i] , oAlphMult :: [o] , finalStsMult :: [s] , transMult :: i -> [(s,Int)] -> Maybe (s,[Context o])}
As expected, we can construct a multi-transducer from a specification of the rules, and the final states.
mkMultBU :: (Eq i,Eq s,Eq o) => [MultRule i s o] -> [s] -> MultBU s i o mkMultBU rs fins = MultBU sts iAl oAl fins tr where (iAlDups,stsDups,oAlDups) = unzip3 $ fmap (\((i,ss),(s,ctxs)) -> (i,s:(fmap fst ss),ctxs >>= getNodes)) rs sts = L.nub $ concat $ stsDups iAl = L.nub iAlDups oAl = L.nub $ concat oAlDups tr = curry (flip lookup rs)
3 Macro transducers
Using trees (or lists thereof) as the output data structure entails
that nodes in the translations of different daughters must be
independent (i.e. they must not dominate each other) in the final
output structure. This can sometimes be too restrictive. For
example, if we are transducing a tree into a string (qua monadic
tree), we would like to be able to have nodes (i.e. letters) from one
subtree dominate nodes from another one. The obvious solution to this
problem is to change the output data structure. We would like a data
structure which can be modified not only at the root (like a tree),
but also at its frontier. Luckily, we have been using one such:
Context
. If our trees should be
transduced into tree contexts, then each node must be an operation
which takes contexts as arguments, and produces contexts; a context
context.
data CContext a = NodeCC a [CContext a] | CHole0 Int | CHole1 Int [CContext a] deriving (Show,Eq)
If we are going to be manipulating contexts, we will need to replace
the holes of our contexts not with trees, but with other contexts.
This operation is called context composition. It is, as code,
identical to tree_substitution
, but for the constructors on the
right hand sides; in tree_substitution
we return a Node a dtrs
whereas here we return a NodeC a dtrs
.
ctxt_composition :: Context a -> [Context a] -> Maybe (Context a) ctxt_composition (Hole i) g = get g i ctxt_composition (NodeC a cs) g = do dtrs <- flip ctxt_composition g `traverse` cs return (NodeC a dtrs)
We will also need to perform the analogue of tree_substitution
, but
where we are substituting contexts for the internal holes of a
context context! This operation makes use of context composition,
so as to put the contexts which are the daughters of an internal hole
inside of the context we are filling the hole in with.
cCtxt_substitution :: CContext a -> [Context a] -> Maybe (Context a) -- if you are a leaf hole, that stays the same cCtxt_substitution (CHole0 i) _ = Just (Hole i) -- if you are a node, just do the substitution on all daughters cCtxt_substitution (NodeCC a ccs) cs = do dtrs <- flip cCtxt_substitution cs `traverse` ccs -- doing the substitution on all daughters return (NodeC a dtrs) -- if you are an internal hole, replace it with the appropriate -- context (obtained via @get cs i@). Because the daughters of this -- guy are still @CContext@s, we need to turn them into @Contexts@ by -- substituting into the open @CHole1@s the appropriate contexts from -- @cs@. Then we can compose the contexts. cCtxt_substitution (CHole1 i ccs) cs = do ctx <- get cs i -- getting the appropriate context to replace the internal hole ctxts <- flip cCtxt_substitution cs `traverse` ccs -- turning the daughters of the internal hole into normal contexts ctxt_composition ctx ctxts
A Macro Tree Transducer rule is much the same as our previous transducer models (in particular, the bottom-up one); this time however nodes are associated with tree context contexts.
type MacroRule i s o = ((i,[s]),(s,CContext o))
Given a set of macro rules, we can construct a Kernel in the same way we always do.
mkMacroKernel :: (Eq i,Eq s) => [MacroRule i s o] -> Kernel i (s,Context o) mkMacroKernel rules a bs = do (s,cctx) <- lookup (a,states) rules ctx <- cCtxt_substitution cctx ctxts return (s,ctx) where (states,ctxts) = unzip bs
This allows us to construct a macro transduction using transduce
:
mkMacroTrans :: (Eq i,Eq s) => [MacroRule i s o] -> [s] -> Hom i (Context o) mkMacroTrans rs fins = transduce fins $ extend $ mkMacroKernel rs
Footnotes:
There is a free ebook 'Tree Automata Techniques and Applications' which covers tree automata (and transducers) in wonderful detail.
Note that the integer
component of the type HomRule
is obtainable as the length of the
list of units in the type BURule
.
The resulting tree transducer model is called multi-bottom-up tree transducers.