Subjects and predicates

2020-05-18

We began without analysing the internal structure of sentences at all. We observed that our semantic analysis was insufficient, as there were semantic relationships between sentences that were not reflected in our analysis; [john_sings_and_john_dances] :- john_dances was not a valid entailment according to our analysis; it was analyzed as the claim that one atomic proposition entailed another, different one. To get a better fit between theory and data, we observed that some sentences were syntactically related to others in that they were boolean (and, or and not) combinations thereof.

Now our analysis does correctly predict that [john_sings `and` john_dances ] :- john_dances is valid. It does this by linking the truth of compound sentences to the truth of their components. We still undergenerate, as we are unable to connect the truths of sentences not involving sentence-level boolean connectives with others; [john_dances_and_sings] :- john_dances is not predicted to be valid. Likewise, [john_doesnt_dance] :- not (john_dances) is not a predicted entailment of our theory.

Here the right observation is that sentences are just not the right things to take as primitives - sentences are built up out of arguments and relations. These mispredicted entailments revolve around there being regularities in the internal structure of sentences, which we are currently blind to.

TL/DR

We are now making reference to the internal structure of sentences when deriving predictions about meaning.

You can look at the code here.
You can listen to my comments on this post here.
Our validity predictor is the function isValidSubjPred, which generates our theory’s prediction about whether a given sentence is a truism.
What are some sentences (in English) that we are still unable to make correct predictions about?

The logic of subjects and predicates

The utility of propositional logic is that everything has a single type; all expressions are truth value denoting. Once we begin decomposing sentences into their meaningful parts, we see that these parts mean different kinds of things - that is, they contribute in different ways to the truth value of the sentences in which they appear. We begin by decomposing sentences into subjects and predicates. We introduce a typeclass (called SubjPred, to remind us of subjects and predicates) that allows for unary predicates, and their subjects.¹

class SubjPred exp where
  type Entity exp
  type Truth exp
  pred :: Int -> exp (Entity exp) -> exp (Truth exp)
  subj :: Int -> exp (Entity exp)

The SubjPred typeclass is predicated of a type function exp (for expression). The intent here is that for any type a, the expressions which denote objects of type a are themselves of type exp a. These classes of expressions have the structure of the SubjPred class just in case we can

define a type of entities associated with exp
define a type of truth values associated with exp
define a family of functions pred i from expressions denoting entities to expressions denoting truth values
define a family subj i of expressions denoting entities

We can give examples of generic entity and predicate denoting expressions. We decide to give helpful names to two entity denoting expressions.

john, mary ::
  (SubjPred exp
  , entity ~ Entity exp)
  => exp entity
john = subj 0
mary = subj 1

The type signatures of these Haskell expressions is informative. The Haskell expressions john and mary are of type exp entity, where exp is an instance of the SubjPred typeclass and the type entity is defined to be equal² to the type Entity exp - the type of entities associated with the type function exp. The equality constraint is unnecessary, but I find it makes things look tidier.

We can give helpful names to two predicates as well.

sings, dances ::
  (SubjPred exp
  , entity ~ Entity exp
  , truth_value ~ Truth exp)
  => exp entity -> exp truth_value
sings = pred 0
dances = pred 1

Again, the type signature is informative. The Haskell expressions sings and dances are functions from exp entity to exp truth_value, where exp is an instance of the SubjPred class, entity is equal to Entity exp, and truth_value is equal to Truth exp. Again the equality constraints are unnecessary, but it makes things look nicer.

With these helpful names in hand, we can construct and name other expressions in our semantic language.

john_sings, john_dances, mary_sings, mary_dances, john_doesnt_sing, john_sings_and_dances, john_sings_and_john_dances, john_sings_or_john_doesnt_sing, john_sings_or_doesnt_sing ::
  (SubjPred exp
  , truth_value ~ Truth exp
  , entity ~ Entity exp
  , SentConn (exp truth_value)
  , SentConn (exp entity -> exp truth_value))
  => exp truth_value
john_sings = sings john
john_dances = dances john
mary_sings = sings mary
mary_dances = dances mary
john_doesnt_sing = (not sings) john
john_sings_and_dances = (sings `and` dances) john
john_sings_and_john_dances = john_sings `and` john_dances
john_sings_or_john_doesnt_sing = john_sings `or` john_doesnt_sing
john_sings_or_doesnt_sing = (sings `or` not sings) john

Here the type signature is very informative, reminding us that we are making promises that we (as of this point in the blog post) do not know how to keep! These Haskell abbreviations are of type exp truth_value, where exp is again an instance of the SubjPred class, and the types truth_value and entity are again equal to Entity exp and Truth exp. In order to make use of our boolean connectives at the sentence level, we have also required that the type exp truth_value be an instance of the SentConn class. And in order to use boolean connectives at the predicate level (which was what motivated this post) we require that the type of functions from exp entity to exp truth_value) is also an instance of the SentConn class.

What is important to see here is that, although we have no instances of the SubjPred class, it provides us with a language that we want to combine with the language of the SentConn class so as to give formal versions of the English language sentences that we want as semanticists to analyze. In order to write down these sentences at all, we must tell Haskell what types they must have, and more importantly here what constraints these types must satisfy. This gives us as semanticists a road-map for developing our own analysis. We see that we have exactly three things that we must do:

provide an instance, call it exp, of the SubjPred class
make sure that exp (Truth exp) is an instance of the SentConn class
make sure that exp (Entity exp) -> exp (Truth exp) is an instance of the SentConn class

We turn to this in the next section.

Truth in a situation

In order to evaluate the truth of a sentence in a situation, we need to know what its subjects mean, and what its predicates mean. We intend a subject to denote an individual, and so a situation must provide us with a way of identifying the individual denoted by a particular subject. A subject interpretation is then a function from identifiers to individuals, where our identifiers are as usual integers. A predicate should be true of some (possibly all) individuals, and so will denote a function from individuals to truth values. A situation must provide us with a way of identifying the function denoted by a particular predicate. A predicate interpretation is then a function from predicate identifiers (integers) to functions from individuals to truth values. We represent a situation as a pair of subject and predicate interpretation functions. Because the identity of the set of individuals may vary from one situation to another, we leave it as a parameter of the type.³

-- a name denotes an individual (of type 'e'),
-- and a predicate a set of individuals (of type 'e -> Bool')
data SubjPredSit e = SubjPredSit
  {subjInt :: Int -> e         -- interpreting subjects
  , predInt :: Int -> (e -> Bool)}   -- interpreting predicates

As an expression’s meaning is determined by the situation, our formal reconstruction of the interpretation of an expression will be as a function from a situation to its denotation in that situation. As before, we leave the type of individuals e as a parameter. As expressions are no longer semantically homogenous (some denote individuals, others predicates, still others propositions), we include a parameter ty indicating the type of thing the expression we are looking at denotes.

newtype SubjPredInterp e ty = SubjPredInterp (SubjPredSit e -> ty)

We can interpret subjects and predicates in this meaning space. The \(i^{th}\) predicate will denote whatever it should in the model, as determined by the predicate interpretation function. Similarly for the \(i^{th}\) subject. Formally, we do this by showing that SubjPredInterp e t is an instance of SubjPred, our language for talking about meanings.

instance SubjPred (SubjPredInterp e) where
  type Entity (SubjPredInterp e) = e
  type Truth (SubjPredInterp e) = Bool
  pred i (SubjPredInterp subj) = SubjPredInterp $
    \sit -> predInt sit i (subj sit)
  subj i = SubjPredInterp $
    \sit -> subjInt sit i

We take the type of entities as used in the typeclass instance to be e, the type of entities used in the interpretation, and the type of truth values to be Bool. The interpretation of a predicate pred i is as a function from a subject denotation to a function from a situation to a truth value, determined by seeing whether the function the predicate denotes in that situation holds of the individual the subject denotes.

Now we can compute the interpretation of a term over this vocabulary by providing a situation within which to interpret it.

interpretSubjPred (SubjPredInterp phi) sit = phi sit

This completes the first task we set ourselves previously. Our next tasks are to provide SentConn instance definitions for SubjPredInterp e in both a sentential, and in a predicate domain, so as to be able to use the logical connectives at both the sentence and the VP level. We begin with some special cases. Our sentential domain is Bool, the type of truth values. It is a simple matter to interpret the sentential connectives in the boolean world; we can simply write out the truth tables for the connective.

instance SentConn Bool where
  not True = False
  not False = True

  True `and` True = True
  _ `and` _ = False

  False `or` False = False
  _ `or` _ = True

Similarly, if we think of predicates as denoting sets, we know that we can interpret the logical connectives as the set theoretic operations.⁴

instance SentConn (Set a) where
  not s = complement s

  s `and` t = s `intersect` t

  s `or` t = s `union` t

We discussed briefly in a previous post how to view sets as functions⁵. In the world of predicates, a set of entities can be viewed as a function mapping entities to truth values. Taking the complement of a set would correpond to making a function that returned the negation of the original truth value. Taking the intersection of two sets would correspond to making a function that returned True on an entity just in case it was contained in both sets.

instance SentConn (a -> Bool) where
  not f = \x -> not (f x)
  f `and` g = \x -> f x `and` g x
  f `or` g = \x -> f x `or` g x

Because we have already defined interpretations for our sentential connectives in the domain Bool, we can use them with this interpretation on the right hand sides of the above definitions!

The type SubjPredInterp e ty is essentially a wrapper for a function from a situation to the type ty. If we know how to interpret the logical operators in the domain ty, we can interpret them in the domain SubjPredInterp e ty by waiting until the situation argument has been given, and then performing the relevant action in ty.

instance SentConn ty => SentConn (SubjPredInterp e ty) where
  not (SubjPredInterp phi) =
    SubjPredInterp $ \sit ->
                   not (phi sit)
  SubjPredInterp phi `and` SubjPredInterp psi =
    SubjPredInterp $ \sit ->
                   phi sit `and` psi sit
  SubjPredInterp phi `or` SubjPredInterp psi =
    SubjPredInterp $ \sit ->
                   phi sit `or` psi sit

As we have that both Bool and a -> Bool allow us to interpret the logical connectives, we can interpret the logical connectives in both the SubjPredInterp e Bool and the SubjPredInterp e (e -> Bool) domains.

Validity

Entailment (a special case of which is validity), is defined in terms of quantification over every possible situation. Counting situations, we see that we again have uncountably many, and thus it is impossible to enumerate them and check them all one by one. If we are to have a hope of mechanically verifying entailment, we must exploit the special structure inherent to the problem. When looking at the case of propositional logic, we saw that the structure of that problem was such that a finite number of situations were representatives of the (uncountably) infinite space of all situations. This was because any two situations that differed with each other only on propositional variables not in a particular formula were mapped to the same truth value by the formula.

In the present case of decomposing atomic sentences into subjects and predicates, we see that the same general property permitting us to substitute checking a finite number of situations for the infinite set thereof reemerges. Although we have both subjects and predicates that a situation must determine the meaning of, all that is relevant to the truth of a sentence is the meanings of the subjects and predicates which occur in it.⁶ So we will need to be able to construct all representative situations from a single term. To do this, we will need to determine which subjects and predicates are present in the term. We define a new type, which we will use to record this information.

data WhichSubjPred a =
  WhichSubjPred {whichSubj :: [Int]
                , whichPred :: [Int]}

The intuition behind an object WhichSubjPred subjs preds is that the first list, subjs is the list of subjects in the sentence, and the second list, preds the list of predicates. We need to show how a term can be interpreted as an element of this type.

instance SubjPred WhichSubjPred where
  pred i (WhichSubjPred subs _) =
    WhichSubjPred subs [i]
  subj i = WhichSubjPred [i] []

  type Entity WhichSubjPred = ()
  type Truth WhichSubjPred = ()

Using WhichSubjPred we can compute representatives of all possible situations. A situation is a function assigning an individual to each subject, and a set of individuals to each predicate. We have previously left the type of individuals as a parameter to the situation data type, precisely because we wanted to allow individuals to be different things in different contexts. One might reasonably ask how we expect to compute a ‘generic’ situation which is somehow independent of one’s choice of individuals. The answer is that the nature of individuals does not actually play a role in our semantic theory (at the moment). All that matters is which individuals each predicate is true of. We will take individuals (whatever we may choose them to be) to be represented by numbers. We can imagine that the individuals are arranged in a list, and we refer to them by means of their order in the list. This is particularly convenient, because the names for individuals subj i are also individuated by numbers. We can simply take the \(i^{th}\) name to refer to the \(i^{th}\) individual! In our formal theory, this amounts to taking the subject interpretation part of a situation to be the identity function. A situation must also assign a set of individuals to a predicate. Unfortunately, there is no ‘cheat’ available to us here; we must simply create one situation for every possible way of assigning a set of individuals to a predicate.

mkSituations phi = [mkSit predPairing | predPairing <- allPairings preds (powerSet subs)]
  where
    mkSit predInterp = SubjPredSit id (mkFunctionFromPairs predInterp)
    WhichSubjPred subs preds = phi

In order to determine the validity of a formula, we want to verify whether it is true in all situations. In the previous section we saw how to interpret a formula as a function from a situation to its truth value in that situation, and now we have seen how to interpret a formula as a way to compute (a finite representation of) all possible situations. What remains is to combine these two interpretations.

We first define an instance for the product of two SubjPred typeclass instances.

instance (SubjPred exp1, SubjPred exp2) => SubjPred (Product exp1 exp2) where
  type Entity (Product exp1 exp2) = (Entity exp1, Entity exp2)
  type Truth (Product exp1 exp2) = (Truth exp1, Truth exp2)

  pred i (Product arg1 arg2) = Product (pred i arg1) (pred i arg2)
  subj i = Product (subj i) (subj i)

As the SubjPred typeclass is predicated of a type function, we define a data type Product which has the following intuitive behaviour:⁷

data Product f g (a,b) = Product (f a) (g b)

When we have two denotation domains for SubjPred, we want to combine them by taking their product, and defining the functions pointwise. The type of an Entity is a pair of the entities of the two denotation domains, and similarly for the type of Truth. The interpretation of the term pred i is as a function which takes a pair of entities (one from each domain), and returns a pair of truth values (one from each domain). The interpretation of the term subj i is again as a par of entities, one from each domain.

We need to make sure that the product of two SentConn instances can be interpreted in SentConn as well.

instance (SentConn (m a), SentConn (n b)) => SentConn (Product m n (a,b)) where
  not (Product ma nb) =
    Product (not ma) (not nb)
  Product ma1 nb1 `and` Product ma2 nb2 =
    Product (ma1 `and` ma2) (nb1 `and` nb2)
  Product ma1 nb1 `or` Product ma2 nb2 =
    Product (ma1 `or` ma2) (nb1 `or` nb2)

Now we can combine our truth-in-a-situation and our compute-all-situations interpretation of formulae.

isValidSubjPred phi = all (interpretSubjPred phiInterp) $ mkSituations phiInfo
  where
    Product phiInterp phiInfo = phi

A logician might at this point say “let there be a denumerably infinite set of unary predicates: \(P_{1},P_{2},\dots\) and a denumerably infinite set of individual constants: \(s_{1},s_{2},\ldots\)”. These lists are obtainable in haskell as fmap pred [1..] and fmap subj [1..]. The logician defines the syntax of his language to allow for subjects and predicates to be combined into sentences. The types of our functions in the PredArg typeclass allow for predicates to combine with subjects as functions; we write pred 7 (subj 4). ↩︎
The type class constraint a ~ b is intended to be satisfied when the two types a and b are equal. ↩︎
For example, if our domain of discourse is 1. Bundesliga soccer players our situations would be of type SubjPredSit Bundesligisten, whereas if we are talking about numbers, our situations would be of type SubjPredSit Int. ↩︎
What follows is not quite kosher in Haskell, but puts our previous knowledge in a uniform notation. ↩︎
The characteristic function of a set maps an object to True if and only if that object is a member of the set: charFunc s = \x -> x `member`s. Similarly, given a function from objects to truth values, we can construct from it the set of objects that it maps to True: mkSet f = [ x | x <- allElements; f x == True]. In the world of classical mathematics, where we are concerned with truth and falsity, sets and functions are completely equivalent. Once we move to a more constructive setting, like in the world of programming (or intuitionistic mathematics) we see that we can only move from a function to its classically equivalent set if we can enumerate the type of elements the set contains. ↩︎
This is visible in natural language too. To verify whether John smiled, you don’t need to know who slept, or who ate, or what Bill did. ↩︎
The code given for Fork is not valid in Haskell. The problem is due to decomposing one of the parameters to Fork as a pair. In order to construct a data type with this behaviour, we move to more theoretically interesting corners of Haskell. The actual code used to implement the Fork instance of PredArg is the following. haskell type family Fst (ab :: *) :: * type instance Fst (a,_) = a type family Snd (ab :: *) :: * type instance Snd (_,b) = b newtype ForkCurry f g a b = ForkCurry (f a, g b) newtype Fork f g ab = Fork (ForkCurry f g (Fst ab) (Snd ab)) pattern Product a b = Fork (ForkCurry (a,b)) From the top, the type family and type instance declarations are Haskell’s way of writing signatures and definitions for functions over types. The declarations for Fst are the type level versions of the following code at the more familiar value level: haskell fst :: (a,b) -> a fst (a,_) = a The line type family Fst (ab :: *) :: * is the declaration that Fst is the name of a type function which takes one argument of kind *, and retuns an object of the same kind. (A ‘kind’ is the type of a type.) The line type instance Fst (a,_) = a is the equivalent of fst (a,_) = a but at the level of types. It says that if Fst receives as its argument a pair of types, it will return the first one. For example, Fst (Int,Bool) will return Int. The newtype declarations mirror the functions below: haskell forkCurry f g a b = (f a, g b) fork f g ab = forkCurry f g (fst ab) (snd ab) Of course, as type declarations they define new types, and are not evaluated. Mixed in there however are the type functions Fst and Snd, which do get evaluated. Finally, the line pattern Product a b = Fork (ForkCurry (a,b)) defines a [[https://gitlab.haskell.org/ghc/ghc-wikis/pattern-synonyms][pattern synonym]], which allows us to pretend that one thing were another. The left hand side Product a b can be written as a pattern to match an input against, or as a type constructor, and the compiler expands it to the right hand side. This is useful because it becomes very ugly to write Fork (ForkCurry (Just 3,"Hello")) every time you want to present a value of type Fork Maybe [] (Int,Char). It is much nicer to write Product (Just 3, "Hello"). ↩︎