A unified semantics for the logical operators

2020-05-26

We have seen that the logical operators and, or, and not can be given a uniform proof-theoretic treatment in the context of tableaux. However, we have had to give them independent interpretations in denotational semantics; they are interpreted truth functionally in a sentential context, but set theoretically in a predicate context. Anticipating somewhat, these same connectives occur in other contexts as well:

DPs: Some student and most teachers
Ds: Some but not all
PPs: On the shelf or near the bookcase
Ps: On or near
Advs: quickly and accurately
TVs: praise and criticize
DTVs: show but not give

In each context in which logical operators occur, they license the same inferences (i.e. the same tableau rules apply), but of necessity will have different denotations (the semantic type of these categories are all different).

This situation is puzzling: the meaning of and clearly should be the same across categories, as it permits the same inferences, as formalized in the tableau system. On the other hand, we are forced to specify its denotation independently for each category. How can it be that the denotations of and are wildly different, yet give rise to the same inference patterns?

The answer we will advance here is that the different domains in which and (and or and not) may denote are all structured in a similar way, and that these words pick out the same operation on this structure in the different domains. One way of thinking about this is that we are taking too fine-grained a perspective by looking at the domains as consisting of objects of various kinds, and that we need to step back and consider how these objects relate to each other; the objects are different, but relations are the same. The situation is familiar in the sciences, where viewing things at the (presumably fundamental) level of quarks is too fine grained to see the real regularities in most disciplines (such as biology, economics, or linguistics).

This post is based on the chapter 7 of the book Mathematical structures in language.

TL/DR

The logical connectives and, or, and not are defined as the meet, the join, and the complement operations of boolean lattices, respectively. This provides a uniform denotational semantics for these operations, despite the very different semantic domains in which they occur.

Lattices

The claim is that truth values and sets of individuals share a similar structure. In particular, that conjunction of truth values and that set intersection (the interpretations of the coordinator and) should play the same role in this structure. In order to make sense of this, we need to find a way to talk about the elements of these domains (the truth values True and False on the one hand and sets of individuals on the other) that abstracts away from the details of these elements. In the domain of sets, perhaps the most salient relation is the subset relation. Set intersection and set union interact with the subset relation in the following ways: for any sets A and B,

$A \cap B \subseteq A$
$A \subseteq A \cup B$

Moreover, any other set $C$ that is a subset of both $A$ and $B$ will also be a subset of $A\cap B$, in other words, $A\cap B$ is the biggest of the sets smaller than both $A$ and $B$. Likewise, $A \cup B$ is the smallest of the sets bigger than both $A$ and $B$.

We can ask: is there any relation $\le$ is the world of truth values that behaves as does $\subseteq$ in this respect? This relation would have to satisfy the following, for any truth values $\phi$ and $\psi$:

$\phi \wedge \psi \le \phi$
$\phi \le \phi \vee \psi$

As there are only two truth values, we can determine what this relation must be by instantiating these equations for every possible choice of truth value for $\phi$ and $\psi$:

$\phi = \mathbf{False}; \psi = \mathbf{False}$: $\mathbf{False} \wedge \mathbf{False} = \mathbf{False} \le \mathbf{False}$ and $\mathbf{False} \le \mathbf{False} \vee \mathbf{False} = \mathbf{False}$
$\phi = \mathbf{False}; \psi = \mathbf{True}$: $\mathbf{False} \wedge \mathbf{True} = \mathbf{False} \le \mathbf{False}$ and $\mathbf{False} \le \mathbf{False} \vee \mathbf{True} = \mathbf{True}$
$\phi = \mathbf{True}; \psi = \mathbf{False}$: $\mathbf{True} \wedge \mathbf{False} = \mathbf{False} \le \mathbf{True}$ and $\mathbf{True} \le \mathbf{True} \vee \mathbf{False} = \mathbf{True}$
$\phi = \mathbf{True}; \psi = \mathbf{True}$: $\mathbf{True} \wedge \mathbf{True} = \mathbf{True} \le \mathbf{True}$ and $\mathbf{True} \le \mathbf{True} \vee \mathbf{True} = \mathbf{True}$

More readably, we have that the relation must satisfy: \[\mathbf{False} \le \mathbf{False}\qquad \mathbf{False} \le \mathbf{True}\qquad \mathbf{True} \le \mathbf{True}\] and not: \[\mathbf{True}\le\mathbf{False}\]

Considering the truth table for implication (figure 1), we see that implication relates truth values in a way parallel to the subset relation.

$A$	$B$	$A \rightarrow B$
$\mathbf{True}$	$\mathbf{True}$	$\mathbf{True}$
$\mathbf{True}$	$\mathbf{False}$	$\mathbf{False}$
$\mathbf{False}$	$\mathbf{True}$	$\mathbf{True}$
$\mathbf{False}$	$\mathbf{False}$	$\mathbf{True}$

Because our goal is to talk about the structure common to truth values and to sets of individuals, we will use the generic $\le$ symbol for the relation of subset in the domain of sets, and for implication in the domain of truth values. Furthermore, we will write $\wedge$ (usually called ‘meet’) for set intersection as well as for boolean conjunction, and $\vee$ (usually called ‘join’) for set union as well as for boolean disjunction.

The algebraic structure which we have here observed is called a lattice (Ger: Verband). A natural way of defining a structure in Haskell, which other things may instantiate, is by using the type class system.

class Lattice l where
-- meet and join should be:
-- 1. commutative
--    a `meet` b == b `meet` a
--    a `join` b == b `join` a
-- 2. associative
--    a `meet` (b `meet` c) == (a `meet` b) `meet` c
--    a `join` (b `join` c) == (a `join` b) `join` c
-- 3. absorbing
--    a `join` (a `meet` b) == a
--    a `meet` (a `join` b) == a
  meet :: l -> l -> l
  join :: l -> l -> l

To be a true lattice, the operations need to satisfy certain restrictions, namely, that they are commutative, associative, and that they are absorbing. It follows from the absorbtion laws that both meet and join operations are idempotent: a `meet` a == a and a `join` a == a.¹

The Haskell type class makes no mention of the order $\le$ between elements. As with many other structures in mathematics, a lattice can either be defined in terms of the order, or in terms of the meet and join operations. Both definitions end up being equivalent, both in the sense of defining the same objects, and in the sense of being able to recover each from the other. For example, given the lattice class above, one element $x$ can be defined to be less than or equal to another $y$ just in case x `meet` y == x.² Similarly, the meet of two objects $x$ and $y$ can be defined, using the order, as the unique object $x \wedge y$ which is less than both, but greater than any other object less than both (the greatest of the lower bounds for $x$ and $y$).

Lattice Instances

There are numerous types that we can impose lattice structure on. For instance, the type (), with just a single inhabitant is a lattice in this sense. As there is but one element inhabiting this type, both meet and join operations must output this element.

instance Lattice () where
  meet _ _ = ()
  join _ _ = ()

More interestingly, the type Bool is a lattice in this sense, with greatest lower bound operation truth value conjunction, and least upper bound operation truth value disjunction.

instance Lattice Bool where
  meet = (&&)
  join = (||)

We cannot give the lattice instance for sets, as they require the constraint that their objects be ordered, which does not fit into our lattice typeclass. Ignoring this for the moment, the lattice instance for sets would look like the following:

instance Lattice (Set a) where
  meet = intersection
  join = union

Recalling that sets can be exchanged with their characteristic functions, we can give a lattice instance for sets qua characteristic functions.

instance Lattice (a -> Bool) where
  f `meet` g = \x -> f x && g x
  f `join` g = \x -> f x || g x

The definition for meet says that the meet of two characteristic functions is the characteristic function that maps an element to true just in case both functions do. We note that we have used boolean conjunction and disjunction to define these functions, which are just the meet and join operations of the Bool instance of a lattice. We can rewrite our characteristic function instance above, replacing (&&) by meet and (||) by join.

instance Lattice (a -> Bool) where
  f `meet` g = \x -> f x `meet` g x
  f `join` g = \x -> f x `join` g x

Now we can see that we are not actually making reference to any property of the type Bool in the above, other than that it has lattice structure. We can generalize our lattice instance again, this time by replacing the explicit mention of Bool by any lattice type.

instance Lattice b => Lattice (a -> b) where
  f `meet` g = \x -> f x `meet` g x
  f `join` g = \x -> f x `join` g x

We can give instance definitions for types one at a time, showing how they can be treated as lattices, as we have done for Bool and (). This way of proceeding does not allow for generalization; each type is given an instance declaration independently of any other. A more systematic approach exploits the recursive structure of types, saying how to derive an instance declaration for a new type based on instance declarations for other types. This approach only works in case the instancehood of a complex type is in fact derivable from its parts. Fortunately, as we have seen, in the case of lattices, this is indeed so. If a type has lattice structure, than so does the type of functions into that type, with the lattice opperations being defined pointwise.

Similarly, given two types with lattice structure, the type of pairs of those types has it as well.

instance (Lattice b,Lattice c) => Lattice (b,c) where
  (b1,c1) `meet` (b2,c2) = (b1 `meet` b2, c1 `meet` c2)
  (b1,c1) `join` (b2,c2) = (b1 `join` b2, c1 `join` c2)

Boolean Lattices

The types of truth values and of sets have more structure than is made explicit by their being lattices. In particular, we can negate truth values, and take the complement of a set. We would like again to unify these two concepts, and we proceed once more by trying to characterize them without making explicit reference to the objects of our domains. Starting again with sets, we see that the complement of a given set has a special property with respect to the set it is the complement of: the union of a set with its complement is the universe, and the intersection of a set with its complement is the empty set.

$A \cup \overline{A} = E$
$A \cap \overline{A} = \emptyset$

Indeed, the combination of these two properties uniquely characterizes the complement of a set. The universal set and the empty set are also special, in that they can be defined in terms of the order relation:

every set is a subset of the universal set
the empty set is a subset of every set

Translating this into the boolean domain, we must find truth values with these properties. As there are only two truth values, we can simply try all combinations. Doing this, we find that the truth value $\mathbf{True}$ is parallel to the universal set, and $\mathbf{False}$ is parallel to the empty set. And indeed, we find that the negation operation satisfies the correlates of the set equations above.

$\mathbf{True} \vee \mathbf{False} = \mathbf{True}$
$\mathbf{True} \wedge \mathbf{False} = \mathbf{False}$

The elements $\mathbf{True}$ and the universal set are ‘bigger’ than every other object of their kind, and the elements $\mathbf{False}$ and the empty set are ‘smaller’ than every other object of their kind. It is common to call elements like $\mathbf{True}$ and the universal set the top elements, and those like $\mathbf{False}$ and the empty set the bottom elements.

The types of truth values and of sets have additional structure as well: their meet and join operations are distributive. A boolean lattice (Ger: boolescher Verband)is a distributive lattice with a complement operation, and with designated top and bottom elements.

class Lattice b => Boolean b where
-- meet and join must in addition be
-- 1. distributive
--    a `join` (b `meet` c) == (a `join` b) `meet` (a `join` c)
--    a `meet` (b `join` c) == (a `meet` b) `join` (a `meet` c)
-- 2. complemented
--    a `meet` cmp a == bot
--    a `join` cmp a == top
  top :: b
  bot :: b
  cmp :: b -> b

Although the lattice operations (meet and join) were defined for pairs of elements, we can define them over (potentially infinite) sets of elements as well. This is possible because these operations are commutative (meaning they do not care about the order of their arguments), idempotent (meaning that they do not care about how many times they see a particular argument), and associative (meaning that they do not care about when they see a particular argument).

infMEET, infJOIN :: Boolean b => [b] -> b
infMEET = foldr meet top
infJOIN = foldr join bot

All of the lattice types we have discussed are also boolean lattices. The type () has boolean lattice structure where top and bot are both (), and cmp is the identity function.

instance Boolean () where
  top = ()
  bot = ()
  cmp _ = ()

The type Bool of truth values is boolean where top is True, bot is False, and cmp is logical negation.

instance Boolean Bool where
  top = True
  bot = False
  cmp = not

As was the case with lattices, booleanness is inheritable. With function types, the top function maps everything to the top element of its codomain, the bot function maps everything to the bot element of the codomain, and cmp maps a function to one which takes the cmp of its output.

instance Boolean b => Boolean (a -> b) where
  top = \x -> top
  bot = \x -> bot
  cmp f = \x -> cmp (f x)

The product types inherit booleanness componentwise.

instance (Boolean b,Boolean c) => Boolean (b,c) where
  top = (top,top)
  bot = (bot,bot)
  cmp (b,c) = (cmp b, cmp c)

This allows us to write terms like top `meet` (top `join` bot), with type Boolean b => b.

It’s all just sets after all

A fundamental theorem about boolean algebras has as a consequence that

every finite boolean algebra is isomorphic to the powerset of its atoms

An element $x$ of a boolean algebra is atomic iff

it is not the bottom element
the only element it is greater than is the bottom element

This fundamental theorem tells us as well that

every element in a finite boolean algebra is uniquely characterized by the atoms it is greater than

We see this immediately for the top and bottom elements; the top element is greater than every atom, the bottom element is greater than no atom.

To get an intuitive understanding of why this might be the case, we can consider how to represent some of the boolean lattices we have seen as powersets.

Consider the one element boolean algebra (as given by the Haskell type ()). It has no atoms (because the only element it contains is the bottom element), and therefore the set of its atoms it the empty set. The powerset of the empty set is the set containing a single element, the emptyset ($\wp(\emptyset) = \{\emptyset\}$). This one element set is isomorphic to the original one element set.

Consider now the two element boolean algebra of truth values (as given by the Haskell type Bool). It has just one atom, $\mathbf{True}$. The powerset of this one element set contains the one element set and the emptyset $\wp(\{\mathbf{True}\}) = \{\emptyset,\{\mathbf{True}\}\}$. Setting $\mathbf{False} \cong \emptyset$ and $\mathbf{True} \cong \{\mathbf{True}\}$, we see that $\emptyset \subseteq \{\mathbf{True}\}$ just as $\mathbf{False} \le \mathbf{True}$, and that moreover the set theoretic operations of intersection, union, and complement perfectly mirror the boolean conjunction, disjunction and negation of truth values.

More generally, we can turn any finite boolean algebra into a powerset algebra by associating each element of the original boolean algebra with the set of atoms it is greater than (or equal to). We can see immediately that this preserves the complement operation: if you are greater than one set $S \subseteq At$ of atoms, your complement must be greater than the complement of $S$. If your complement and you shared some atom, say $a$, then your intersection with your complement would also be greater than (or equal to) this atom, but this is impossible, because the intersection of any element with its complement is the bottom element of the algebra. If your complement does not contain some atom in the complement of $S$, then your join with your complement would not be greater than this atom, which is again impossible because the join of any element with its complement is the top element.

We can prove this by substituting equalities for equalities: \[ \begin{align}\tag{absorbtion, with $b = a \mathrel{\tt join} a$} a & = a \mathrel{\tt join} (a \mathrel{\tt meet} (a \mathrel{\tt join} a)) \\\ \tag{absorbtion with $b = a$} & = a \mathrel{\tt join} a \end{align} \] ↩︎
The use of the function (==) from the Eq typeclass requires that the type objects we are manipulating be an instance of this class. ↩︎

\(A\)	\(B\)	\(A \rightarrow B\)
\(\mathbf{True}\)	\(\mathbf{True}\)	\(\mathbf{True}\)
\(\mathbf{True}\)	\(\mathbf{False}\)	\(\mathbf{False}\)
\(\mathbf{False}\)	\(\mathbf{True}\)	\(\mathbf{True}\)
\(\mathbf{False}\)	\(\mathbf{False}\)	\(\mathbf{True}\)