The logic of unanalysed sentences

We start our semantic journey at the beginning with the observation that certain sentences are true, and others are false.

For example, “Grass is green” is true, while “Snow is green” is false. At least, this is how we talk about sentences. Much has been written about the nature of truth, and of the relation between language and the world. These are thorny philosophical issues, and I do not wish to commit myself here one way or another. Instead, I will adopt what I take to be an ontologically neutral position, and speak, instead of Truth, of our behaviour. So, our models of meaning will not aim at explaining that sentences like “grass is green” are true, but rather that people judge sentences like “grass is green” to be true.

Reflecting a bit, we realize that we judge not whether a sentence is true tout court, but rather whether it is true in a particular situation: “it is raining” is judged false here today, but might well be judged true here tomorrow, or true now in Tokyo. In the previous sentence I indicated situations by reference to the ‘real world.’ This invites skeptical worries: what if we hallucinate (and think it is raining when it isn’t)? What if we lie (and pretend to judge the sentence true when we really judge it false)? These worries dissolve when we remember that we are not investigating the notion of Truth (how language relates to the real world), but only our judgements of truth in a situation.1 We are interested not merely in modeling (or predicting) our ability to assign truth values to sentences in situations, but in the linguistic knowledge that allows us to do so. We expect that this linguistic knowledge is leveraged in all sorts of semantic tasks, not just in making truth-value judgements. Another task which seems to draw on our semantic knowledge is judging entailment: whether one sentence follows from (or is contradicted by) some others. We will be primarily interested in modeling these two tasks.

TL;DR

The code for this post is downloadable here, if you would like to run it. Please do not just copy and paste from this post!

Here we define

  1. a language for talking about propositions
  2. a model of our ability to judge sentences true in a situation
    • implemented as a function isTrueProp
  3. a model of our ability to judge whether some sentences entail another
    • implemented as a function isValidPropSeq

Please:

  1. Try to understand the ideas.

  2. Play with the code.

    There’s not much you can do yet beyond typing in things like

    > isTrueProp (p 1) (charFunc [0,1,2])
    > isValidPropSeq ([p 1, p 2, p 3] :- p 4)
    

    Try to get a feel for what these do by seeing how their output varies with modifications to their inputs.

  3. Ask questions.

You can post questions or comments on the blog posts anonymously. I am writing this in English partly because I am embarrassed by my written German. You are, however, very welcome to write your comments in German! (If we ever find ourselves in the classroom, the language of instruction is, after all, German.)

Sentences

We observe that there are infinitely many possible (distinct) sentences. Each sentence appears to have a slightly different influence on our behaviour(al dispositions), and so we would like to have a different name for each sentence, so as to be able to keep track of the different meaning contribution of each. A logician might say at this point “we assume a denumerably infinite set of sentence variables, \(p_{1},p_{2},p_{3},\ldots\)”, and assign each sentence a unique ‘name’ qua sentence variable (and so the sentence “john sings” might be given the name \(p_{1}\), “john dances” the name \(p_{2}\), and so on). But where does this infinite set of sentence names come from? Anytime we want to talk about an infinite object, we must find some way to present it in a finite way. When a logician is being careful, he says how names are constructed out of some finite number of symbols (for example, the letter ‘p’ and the digit ‘0’). The sentence variable \(p_{2}\) would then be viewed as the concatenation of the letter ‘p’ with the digit ‘0’ repeated two times: \(p00\). This problem might look rather trivial (who cares how we come up with the names for sentences?!), but it is actually quite acute when we decide to accompany our theorizing with an implementation. As there are infinitely many sentences, we cannot introduce a new primitive for each of them. Instead, we follow the (careful) logician in treating sentence variables as composed of a symbol, p, together with an integer. One way of doing this involves constructing a new datatype.

newtype Proposition = P Int

Then P 3 is a proposition (something of type Proposition) which the logician might write \(p_{3}\).

We will adopt here a different strategy. Instead of defining a new type of things, we will define a special sublanguage for talking about things. (Instead of defining things, we define a language.) We can use the class system to do this.

class Prop b where
  p :: Int -> b

We will be interested in writing sentence terms (expressions in Haskell). The term p 7 is an expression in Haskell, which we will intuitively understand as representing ‘proposition number 7’, and which a logician might write \(p_{7}\). As an expression in Haskell, it has a type. The type that Haskell assigns to it is Prop b => b, which indicates that it can have any type b, provided b is an instance of the class Prop.

Each term represents a single sentence. We can define handy mnemonics for a finite number of sentence terms by using (Haskell) variables with suggestive names. Thus we can write john_sings = p 0. Haskell understands this as an assignment of the value p 0 (whatever it may be) to the variable name john_sings. Thereafter, whenever we write john_sings, Haskell replaces it with its meaning (as defined by us): p 0. This is not a practically useful thing to do - we are just defining another way to talk about the sentence term p 0. It is cute, however. We can do this for any (finite) number of sentence terms, as shown below.

john_sings, john_dances, john_praises_mary, mary_praises_john, mary_sings, mary_dances, john_doesn't_sing, john_sings_and_dances, john_sings_and_john_dances, john_sings_or_john_doesn't_sing, john_sings_or_doesn't_sing :: Prop b => b
john_sings = p 0
john_dances = p 1
john_praises_mary = p 2
mary_praises_john = p 3
mary_sings = p 4
mary_dances = p 5
john_doesn't_sing = p 6
john_sings_and_dances = p 7
john_sings_and_john_dances = p 8
john_sings_or_john_doesn't_sing = p 9
john_sings_or_doesn't_sing = p 10

This entire discussion is based around defining a syntax for the infinitely many sentences of our language. Our syntax defines a simple (but infinite) language of sentence terms of the form p i, for any integer \(i\). Our sentence terms have no meaningful parts, no sentence terms are built from other sentence terms, nothing. They are just independent, atomic symbols. What can we do, semantically, with such an impoverished language?

We would like to interpret sentences in situations, so as to help explain why people draw the inferences they do. We are interested fundamentally in the notion of entailment, in which our semantic competence manifests itself in an observable way. We define entailment informally in terms of truth in a situation; a sentence \(\phi\) entails another \(\psi\) just in case in any situation in which \(\phi\) is true, so is \(\psi\). We interpret sentences by assigning a meaning to the function symbol p. This symbol must map an integer i to a meaning, the meaning we want to assign to the \(i^{th}\) sentence. This by itself isn’t very interesting; nearly any type could be a meaning space for our propositional terms! It only becomes interesting once we want to do things with propositions. Here are some uninteresting instances of the class Prop.

instance Prop () where
  p _ = ()

In this first case, we have declared the unit type () to be an instance of the Prop class. In other words, we have shown how to interpret terms in the type (). As there is only one element of the unit type,2 also conveniently written (), the only possible value of type () that any function into the unit type could return on an argument is the unit value. So the only way to interpret the function symbol p in the unit type is as the function which maps every integer to the unit value.

instance Prop Char where
  p _ = 'a'

In the second case, we declare that the type Char is an instance of Prop. In other words, we show how to interpret terms in the type Char. In contrast to the case of (), we have many values of type Char (one for every character). There are therefore many possible ways to interpret a term as a character. However, I chose (rather arbitrarily) to map every integer to the same character, 'a'. A more exciting function might map every integer to the character that the English name of that integer begins with (so p 7 = 's', and p 8 = 'e' and so on).
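That more exciting function might be sketched as follows; the helper englishInitial and its lookup table are my own illustration (covering only the digits 0 through 9), not part of the code for this post.

```haskell
-- A sketch of the "more exciting" interpretation: map each integer to
-- the first letter of its English name. englishInitial is a hypothetical
-- helper, defined here only for the digits 0 through 9.
englishInitial :: Int -> Char
englishInitial n = head (names !! n)
  where
    names = [ "zero", "one", "two", "three", "four"
            , "five", "six", "seven", "eight", "nine" ]
```

Under this interpretation, englishInitial 7 is indeed 's', and englishInitial 8 is 'e'.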

instance Prop Int where
  p i = i

In this third example, we interpret our propositional terms in the type Int. Thus, we need to define the function symbol p to be a function from integers (indicating which proposition we are talking about - p 1, p 412, etc) to integers (indicating which integer we interpret that proposition as). There are many functions from integers to integers (for example, \(f(x) = x^{2}\), or \(g(x) = \sin^{7}(\tan(x)) \cdot e^{-x}\)). I chose my favorite one, the identity function.

instance Prop Proposition where
  p = P

We can also declare that our new data type Proposition is an instance of the Prop typeclass. Or in other words, that one way of interpreting a term like p 7 is as the Proposition P 7. In contrast to the previous examples, this at least seems meaningful - we are saying that this bit of language we have defined should be interpreted as talking about these things that we have defined.

More important is a way to combine meaning spaces of propositions. If a and b are two meaning spaces for propositions, then so is their product.

instance (Prop a,Prop b) => Prop (a,b) where
  p i = (p i, p i)

At first glance, it might look like we are defining the function symbol p in terms of itself, leading to an infinite regress. However, the p that we are defining (on the left hand side of the equation) is different from the symbols p that we are using to define it (on the right hand side), each of which is different from the other! Here, the interpretation of the symbol p in the product type (a,b) is as the pair of elements (p i, p i), where the first element of the pair, p i uses the interpretation of p in the type a, and the second element of the pair (also written p i) uses the interpretation of p in the type b. As an example, if we want to interpret p 7 in the product type (Char,Int), we would obtain the value ('a',7). We have simply taken the interpretation of this term in both types independently, and packaged the results together.
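Given the Char and Int instances above, we can check this directly; the name pairExample is mine, just for illustration.

```haskell
-- Interpreting p 7 in the product type (Char, Int): the Char instance
-- maps every index to 'a', and the Int instance is the identity.
pairExample :: (Char, Int)
pairExample = p 7   -- ('a', 7)
```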

Truth in a Situation

In our intended model of propositions, a proposition is either true or false in a situation. The meaning of a proposition must therefore be something which, given a situation, specifies whether it is true or false. We implement this as a function from situations to True or False, wrapped up as a newtype. (PropInt abbreviates ‘propositional interpretation’.)

newtype PropInt s = PropInt (s -> Bool)

extractPropInt :: PropInt s -> s -> Bool
extractPropInt (PropInt phi) = phi

Here, the type variable s is the type of situations, whatever they may be. So PropInt Int is the type of propositional interpretation where situations are integers, PropInt [Int] the type of propositional interpretation where situations are lists of integers, and so on.

Situations

What should a propositional situation be? It had better be something on the basis of which we can determine whether a sentence is true or false! As our propositions currently are not assumed to have any internal structure, the only way to determine whether a sentence is true or false is to simply say whether it is true or false. Thus a situation must simply tell us whether or not each proposition is true.

Accordingly, we define a situation as a function from numbers to truth values, with the understanding that if a situation maps the number i to truth value b, then sentence p i is b in that situation. (Here PropSit abbreviates ‘propositional situation.’)

type PropSit = Int -> Bool

A function into the booleans can be viewed as partitioning its domain into two subsets: the one whose elements get mapped to True, and the other whose elements get mapped to False. Conversely, given any subset of a domain, we can view it as the block of the partition which such a function maps to True, and we can construct that function by simply checking whether an element is a member of the subset.

charFunc :: Eq a => [a] -> a -> Bool
charFunc set = \o -> o `elem` set

Thus given a subset of propositions (i.e. a subset of Int) we can construct a situation in which exactly those propositions are true.
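For example, charFunc [0,1,2] is a situation in which exactly propositions 0, 1 and 2 are true (the name sit012 is mine, for illustration):

```haskell
-- A situation making exactly propositions 0, 1 and 2 true,
-- and every other proposition false.
sit012 :: PropSit
sit012 = charFunc [0,1,2]

-- sit012 1 evaluates to True; sit012 5 evaluates to False.
```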

Interpreting Propositions

We need to define how exactly our intended model of propositions supports propositional structure. In other words, how the type PropInt PropSit can be made an instance of the Prop typeclass. If a proposition denotes a function from situations to truth values, then how are we to compute the truth value of the \(i^{th}\) proposition in a given situation? The answer is to simply ask the situation what the truth value of the \(i^{th}\) proposition is.

instance Prop (PropInt PropSit) where  -- requires {-# LANGUAGE FlexibleInstances #-}
  p i = PropInt $ \g -> g i

Defining PropInt PropSit as an instance of Prop means that we can interpret terms in the type PropInt PropSit.

We can define a function to take a proposition and a situation and to compute the truth of that proposition in that situation.

isTrueProp :: PropInt s -> s -> Bool
isTrueProp phi sit = extractPropInt phi sit
-- isTrueProp = extractPropInt

This function is our model of language users’ ability to judge a sentence true or false in a situation. There are many deficiencies of this model, but it is a first start. We will improve it throughout the course.
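For instance, using charFunc to build a situation, we can reproduce the first example from the TL;DR above (the names ex1 and ex2 are mine, for illustration):

```haskell
-- Proposition 1 is true in a situation where 0, 1 and 2 hold...
ex1 :: Bool
ex1 = isTrueProp (p 1) (charFunc [0,1,2])  -- True

-- ...while proposition 4 is false in that same situation.
ex2 :: Bool
ex2 = isTrueProp (p 4) (charFunc [0,1,2])  -- False
```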

Validity

We have shown how propositions can be interpreted in the type of functions from situations to truth values. This gives us a way of determining, for a given situation, whether a sentence is true or false in that situation. Our characterization of validity, however, refers to truth in every situation. How are we to determine whether a sentence is valid? A naive approach is to interpret the sentence as usual, and then to enumerate all possible situations, verifying that the sentence is true in each of them. This is doomed to failure, as there are infinitely many possible situations.3 The failure in question is algorithmic, and so it is perhaps useful to see where it goes wrong. Recall that a situation can be given as a set of integers. Therefore, if we could construct all possible sets of integers, we could turn them into all possible situations (by mapping charFunc over this set).

enumerateAllSetsOfInt :: [[Int]]
enumerateAllSetsOfInt = List.nub $ do  -- List is Data.List, imported qualified
  i <- [0..]                -- for each integer,
  subset <- powerSet [0..i] -- for each subset containing numbers
                            -- no greater than that integer
  return subset             -- return that subset

The above function constructs a list of all finite subsets of the (non-negative) integers. We can turn them into situations by applying the function charFunc to each of them.

allSituations :: [PropSit]
allSituations = charFunc `fmap` enumerateAllSetsOfInt

Then we could check whether a sentence term p 7 were valid (true in all situations) by seeing whether it was, in fact, true in all situations.

isValid :: PropInt PropSit -> Bool
isValid prop = isTrueProp prop `all` allSituations

This function will construct all of the situations one by one. Each time it constructs a situation, it will see whether the input proposition is true in that situation. If it is not, it will stop, and tell us that, no, the input proposition was not valid. If however the input proposition is true in that situation, it constructs another and does the same. As there are infinitely many possible situations, the function will never finish checking them, and will end up running forever on truly valid inputs. It will, however, eventually find a counter-example to any invalid proposition, a situation in which that proposition is false. This is what is called a semi-decision procedure (for invalid propositions). It is guaranteed to tell you that an invalid proposition is in fact not valid, but it will never return an answer for a valid proposition. Thus, if you don’t get an answer after some amount of time, you cannot know if it is because you have inputted a valid proposition, or just that the situation in which your input proposition is false just hasn’t been constructed yet.
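One way to make this semi-decision procedure usable in practice is to bound the search by hand; the function refutedWithin below is my own sketch, not part of the post's code.

```haskell
-- A bounded (hypothetical) variant of the naive check: look for a
-- counter-example among the first n situations only. True means
-- "refuted within n situations"; False means "no counter-example found
-- yet" -- it does NOT mean the proposition is valid.
refutedWithin :: Int -> PropInt PropSit -> Bool
refutedWithin n prop = not (all (isTrueProp prop) (take n allSituations))
```

Unlike isValid, this function always terminates, at the price of sometimes answering False for propositions that a longer search would refute.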

In some cases, the problem of there being infinitely many situations is tamed by the fact that the situations fall into finitely many groups; a situation will either make a given sentence true or false, and what it does on other sentences is completely irrelevant to this fact. Accordingly, we would like to be able to compute which propositional atoms appear in a sentence. We could do this if propositions denote the set of atoms which appear in them.

We define a new type, which is ultimately just another name for a set of integers (represented as a list).

newtype WhichAtoms = WhichAtoms [Int]

extractAtoms :: WhichAtoms -> [Int]
extractAtoms (WhichAtoms atms) = atms

We can interpret our propositional terms in this type, by interpreting a proposition just as (the singleton set of) its associated integer.

instance Prop WhichAtoms where
  p n = WhichAtoms [n]

As our propositions are atomic, without internal structure, the seventh proposition (for example) has just one part; namely, itself.
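We can confirm this by interpreting a term in WhichAtoms and extracting its atoms (the name atomsOfP7 is mine, for illustration):

```haskell
-- The term p 7, read in the type WhichAtoms, denotes the singleton {7}.
atomsOfP7 :: [Int]
atomsOfP7 = extractAtoms (p 7)   -- [7]
```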

Given a proposition of the form p i, if we interpret it in the domain of pairs of WhichAtoms and PropInt, we can use the first component to collect the identifiers occurring in the proposition, use that to create the relevant situations, and then evaluate the second component on each relevant situation. The relevant situations are exactly those that assign different truth values to the identifier i in the sentence; namely, the characteristic function of the empty set (which makes the sentence false) and the characteristic function of the set containing just i. These two sets are exactly the elements of the powerset of the set [i].

isValidProp :: (WhichAtoms, PropInt PropSit) -> Bool
isValidProp phi = isTrueProp pI `all` relevantSituations
  where
    relevantSituations = [charFunc sit | sit <- powerSet atms]
    (WhichAtoms atms, pI) = phi

As our sentences are unstructured atoms, whose truth is simply a matter of stipulation, there are no validities, in the sense of sentences true in all situations.

Entailment is a relation holding between a set of sentences (the premises) and a sentence (the conclusion). We define a data type (called a Sequent) which contains premises and conclusions separated by a ‘turnstile’, which in logical texts is written \(\vdash\), but in Haskell must be written so that it begins with a colon (:-).

data Sequent a = [a] :- a deriving (Eq,Show,Ord)

The list to the left of the turnstile is the list of premises, and the single object to the right of the turnstile is its conclusion. Intuitively, a sequent of the form \(\Gamma \vdash \phi\) should be read as saying ‘if all premises in \(\Gamma\) are true, then \(\phi\) is true.’

A sequent is valid, that is, the premises of a sequent entail its conclusion, just in case the conclusion is true in all relevant situations, where a situation is relevant just in case it makes all of the premises sentences true.

isValidPropSeq :: Sequent (WhichAtoms, PropInt PropSit) -> Bool
isValidPropSeq (ant :- cons) = all (isTrueProp consProp) relevantSituations
  where
    relevantSituations = filter makesPremisesTrue allSits
    allSits = [charFunc sit | sit <- powerSet allAtoms]
    makesPremisesTrue sit = all (flip isTrueProp sit) premiseProps
    allAtoms = foldr List.union consAtm $ fmap extractAtoms premiseAtms
    (WhichAtoms consAtm,consProp) = cons
    (premiseAtms,premiseProps) = unzip ant
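For example, a conclusion that is one of the premises is entailed by them, while a fresh atom is not (the names seqGood and seqBad are mine, for illustration):

```haskell
-- The conclusion p 1 occurs among the premises, so the sequent is valid.
seqGood :: Bool
seqGood = isValidPropSeq ([p 1, p 2] :- p 1)  -- True

-- The atom p 3 is independent of the premises, so this sequent is not valid.
seqBad :: Bool
seqBad = isValidPropSeq ([p 1, p 2] :- p 3)  -- False
```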

We can verify that isValidProp is the special case of isValidPropSeq when the antecedent list is empty: as ant == [], we know that

  1. premiseAtms == [] and premiseProps == []
  2. allAtoms == consAtm
  3. makesPremisesTrue sit == True
  4. relevantSituations == allSits
  5. and allSits is just the set of characteristic functions of all subsets of the consAtm atoms

If we simplify these definitions in the definition of isValidPropSeq accordingly, we obtain:

isValidPropSeq ([] :- cons) = all (isTrueProp consProp) relevantSituations
  where
    relevantSituations = [charFunc sit | sit <- powerSet allAtoms]
    (WhichAtoms allAtoms,consProp) = cons

which is virtually identical with isValidProp. We can thus simply define isValidProp in terms of isValidPropSeq:

isValidProp phi = isValidPropSeq ([] :- phi)

In our current setup, where we are not analysing the internal structure of propositions, there are no valid propositions - for every propositional term, it will be true in some situations and false in others - and the only valid sequents are those where the conclusion is one of the premises. This is deeply unsatisfying (because it does not provide a good model of our behaviour), and we will begin to address it later on.

Helper Functions

powerSet takes a set (represented as a list) and computes all possible subsets thereof. Here are two (extensionally equivalent) implementations.

powerSet :: [a] -> [[a]]
powerSet [] = [[]]
powerSet (x:xs) = pxs ++ [x:p | p <- pxs]
  where
    pxs = powerSet xs

An alternative definition of powerSet avoids explicit recursion, making use instead of the pre-built recursion scheme over lists given by foldr. The inner concatMap function replaces the append (++) in the previous definition.

powerSet :: [a] -> [[a]]
powerSet = foldr (\x -> concatMap (\s -> [s, x:s])) [[]]
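Both definitions return the same four subsets for a two-element input, though possibly in different orders, which is why they are only extensionally equivalent (the name subsetsOf12 is mine, for illustration):

```haskell
-- The recursive definition yields [[],[2],[1],[1,2]]; the foldr
-- definition yields [[],[1],[2],[1,2]]. As sets, they agree.
subsetsOf12 :: [[Int]]
subsetsOf12 = powerSet [1,2]
```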

  1. Worries about lying are not dissolved in this way. These worries are well-placed, but not significantly different from worries about poor data collection, bad experimental design, or experimental error common to all empirical inquiry. In other words, if some of the data we gather are lies, i.e. they do not reflect someone’s actual judgment, that is ‘the same thing’ as if someone in a psychology experiment intentionally presses the wrong button. ↩︎

  2. This is not true. In Haskell, the value undefined is an inhabitant of every type. This is useful, but awkward, and we will just ignore the undefined values in this discussion. ↩︎

  3. In fact, there are more situations than there are integers, and so the infinitude of situations is larger than the infinitude of numbers! A situation, as a function from Ints to truth values, is isomorphic to a set of Ints. It is a famous result (of Georg Cantor) that there are strictly more sets of integers than there are integers! Assume (for a contradiction) that there were not more sets of integers than integers. Then we could enumerate the sets of integers (line them up in an infinite list), such that each set of integers is at some finite position in the list. For each position in the list, we can ask: is the number of that position an element of the set at that position? We construct a new set of numbers, consisting of those numbers where the set at that position does not contain that number. Now this is a perfectly legitimate set of numbers, call it X, and so it must be (by hypothesis) in the list at some position, call it x. Now, is \(x \in X\)? If so, then x is an element of the set at its position, and so by the definition of X, \(x \notin X\). If not, then x is not an element of the set at its position, and so by the definition of X, \(x \in X\). Both of these are contradictory, and so we reject our assumption that there are no more sets of numbers than numbers. ↩︎