More agreement in English

Agreement in English is fairly uninteresting; there is just not that much of it. Subject-verb agreement is the most frequent kind, and our model of this agreement as being mediated by the k feature of the AgrS head is pretty adequate.

There are two other loci of agreement in English which are not as obviously amenable to a treatment in terms of movement:

  1. Determiner-Noun number agreement
    • This big bad wolf
    • These big bad wolves
    • Most stylish pants
    • \(\mbox{}^{\ast}\)Most stylish pant
    • Every large group
    • \(\mbox{}^{\ast}\)Every large groups
  2. Expletive there
    • There seems to have been believed to be a problem
    • There seem to have been believed to be problems

We focus here on the first case, and will deal with the second in an upcoming post.

Determiner-Noun agreement

Our current analysis of nominal arguments treats them as headed by determiners:

\(\textsf{the}\mathrel{::}\bullet n. d. k\) \(\textsf{SG}\mathrel{::}\underline{\bullet N}.n\) \(\textsf{student}\mathrel{::}N\)
\(\textsf{no}\mathrel{::}\bullet n. d. k\) \(\textsf{PL}\mathrel{::}\underline{\bullet N}.n\) \(\textsf{teacher}\mathrel{::}N\)

The lexical items in figure 1 treat the number head and the noun head as forming a span (of category nP). We arrived at this analysis because we wanted to decompose our whole-word lexical entries for the plural noun students and the singular noun student. In the short term this increases the size of our lexicon (from two lexical entries with a total of two features, to three lexical entries with a total of five features). As many nouns participate in a singular/plural alternation, however, this pays off in the long term: with a hundred nouns, representing singular and plural versions as separate whole-word lexical entries results in two hundred lexical items with a total of two hundred features, whereas decomposing results in one hundred and two lexical items with a total of one hundred and four features. In essence, there is a fixed overhead cost for the two number lexical entries, and then each additional noun is simply given one number-unspecified lexical entry.
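The bookkeeping behind this comparison can be checked with a few lines of Python. This is only a counting sketch: the function names are mine, and it assumes one feature per whole-word entry (\(n\)), two features for each number head (\(\bullet N\) and \(n\)), and one feature per noun stem (\(N\)).

```python
from typing import Tuple


def whole_word_size(num_nouns: int) -> Tuple[int, int]:
    """(items, features) when singular and plural are separate whole-word entries."""
    items = 2 * num_nouns      # one singular and one plural entry per noun
    features = 2 * num_nouns   # each whole-word entry carries the single feature n
    return items, features


def decomposed_size(num_nouns: int) -> Tuple[int, int]:
    """(items, features) when number is factored out into SG and PL heads."""
    items = 2 + num_nouns          # SG, PL, plus one stem per noun
    features = 2 * 2 + num_nouns   # two features per number head, one per stem
    return items, features


for n in (1, 50, 100):
    print(n, whole_word_size(n), decomposed_size(n))
# 1 (2, 2) (3, 5)
# 50 (100, 100) (52, 54)
# 100 (200, 200) (102, 104)
```

The overhead of the two number heads is a constant four features, so the decomposed lexicon is smaller (in both items and features) as soon as there are more than a handful of alternating nouns.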

We can ask what to do with pluralia tanta, or nouns which occur only in the plural, like English pants or scissors.1 Given our decompositional methodology, we would not have occasion to decompose scissors or pants at all, as they do not co-occur with \(\mbox{}^{\ast}\textit{scissor}\) or \(\mbox{}^{\ast}\textit{pant}\). We would then continue to have ‘whole-word’ entries for these lexical items.

A word like most does not appear with singular arguments.2, 3 This can be described in terms of a filter that bans the \(\bullet n\) argument of most from being headed by \(\textsf{SG}\). Thomas Graf and I showed that any regular (i.e. finite state) filter can be encoded via the feature system. In this case, we would need to split our current monolithic \(n\) feature into two: \(n_{ PL}\) and \(n_{ SG}\). We could then add two empty lexical items which express that \(n_{ \alpha}\) isa \(n\) for any \(\alpha\): \(\epsilon\mathrel{::}\bullet n_{ SG}. n\) and \(\epsilon \mathrel{::}\bullet n_{ PL}. n\). Most would have the feature bundle \(\bullet n_{ PL}. d . k\), while every would have the bundle \(\bullet n_{ SG}. d. k\). Looking at these categories, we realize that what is happening is that they are transmitting information about one head to another, along a feature checking dependency. This is exactly what agreement does. If we recast this explicitly in terms of agreement, we would describe this in the morphological equation for most: \[\textit{most} = \textsf{most} \textrm{ if }\bullet n\textrm{ is not connected to }\textsf{SG}\] or if we use morphological features for abbreviations: \[\textit{most} = \textsf{most}_{[-sg]}\] This is almost transparently a filter, but it appears to be located near the morphology-syntax interface, instead of in the syntax proper. It is not obvious to me that this is an important distinction, as opposed to a merely notational one.
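To make the category-splitting encoding concrete, here is a small Python sketch of it. It is a toy, not a full minimalist grammar: `Item`, `merge`, `LEX` and `derivable` are names I made up, only the first selector feature of a head is checked against the category of its argument, the span (underlining) information is ignored, and movement (the \(k\) feature) plays no role.

```python
from typing import List, NamedTuple


class Item(NamedTuple):
    phon: str
    features: List[str]  # selectors are written with a leading "•"


def merge(head: Item, arg: Item) -> Item:
    """Check the head's first selector against the argument's category feature."""
    sel, cat = head.features[0], arg.features[0]
    if not sel.startswith("•") or sel[1:] != cat:
        raise ValueError(f"{head.phon!r} cannot select {arg.phon!r}: {sel} vs {cat}")
    # Simplification: the result keeps only the head's remaining features.
    return Item(f"{head.phon} {arg.phon}".strip(), head.features[1:])


LEX = {
    "WOLF": Item("WOLF", ["N"]),
    "SG": Item("SG", ["•N", "n_SG"]),
    "PL": Item("PL", ["•N", "n_PL"]),
    "eps_SG": Item("", ["•n_SG", "n"]),  # n_SG is an n
    "eps_PL": Item("", ["•n_PL", "n"]),  # n_PL is an n
    "the": Item("the", ["•n", "d", "k"]),
    "most": Item("most", ["•n_PL", "d", "k"]),
    "every": Item("every", ["•n_SG", "d", "k"]),
}


def derivable(*names: str) -> bool:
    """Merge right to left; the last name is the most deeply embedded head."""
    try:
        tree = LEX[names[-1]]
        for name in reversed(names[:-1]):
            tree = merge(LEX[name], tree)
        return True
    except ValueError:
        return False


print(derivable("most", "PL", "WOLF"))   # True
print(derivable("most", "SG", "WOLF"))   # False
print(derivable("the", "eps_SG", "SG", "WOLF"))  # True
```

The filter never appears as a separate statement here. It is entirely a side effect of which category names happen to match, which is what makes it look like information being handed from the number head to the determiner.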

While most (and every) is invariant in form, and only appears with a particular number, this and these are different expressions of one and the same stem, THIS. In contrast to strudel and strudels, which we decomposed into two-head spans (number and STRUDEL), we treat this and these as just a single-head span. I do not know off the top of my head a good data-driven way to motivate this different treatment.4 Of course, the reason is that number is an inherent property of a noun, but an inherited property of a determiner, but this is just a restatement of our analytical decision. \[\textit{these} = \textsf{THIS}\text{ if }\bullet n\text{ is connected to }\textsf{PL}\] \[\textit{this} = \textsf{THIS}\text{ if }\bullet n\text{ is not connected to }\textsf{PL}\]
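Read as a procedure, the morphological equations amount to a lookup conditioned on what the determiner's \(\bullet n\) feature is connected to. The following Python sketch spells this out under the assumption that we are simply handed the set of heads reachable through \(\bullet n\); the function name `spell_out` and the use of `None` for "no equation applies" are my own conventions.

```python
from typing import Optional, Set


def spell_out(stem: str, connected: Set[str]) -> Optional[str]:
    """Surface form of a determiner stem, or None if no equation applies.

    `connected` is the set of heads reachable via the determiner's •n feature.
    """
    if stem == "THIS":
        # these = THIS if •n is connected to PL, this = THIS otherwise
        return "these" if "PL" in connected else "this"
    if stem == "MOST":
        # most = most if •n is not connected to SG
        return "most" if "SG" not in connected else None
    raise KeyError(f"no morphological equation for {stem}")


print(spell_out("THIS", {"PL", "WOLF"}))   # these
print(spell_out("THIS", {"SG", "WOLF"}))   # this
print(spell_out("MOST", {"pants"}))        # most
print(spell_out("MOST", {"SG", "GROUP"}))  # None
```

Note that the whole-word entry pants is compatible with most without any further stipulation, because the equation only asks that \(\textsf{SG}\) be absent.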

Of course, agreement is interesting precisely because it operates over long distances. And this determiner-noun agreement functions identically irrespective of the number of words separating the determiner and the noun (or rather number head):

  • This big bad wolf
  • These big bad wolves

We have not given a syntactic treatment of adjectives. Based on the constituency test of one-replacement, it seems that the noun and closest adjective form a constituent to the exclusion of the more distal adjective:

  • There are many bad wolves. This big one[= bad wolf] is particularly hairy.

As (for the sake of argument) adjectives are optional, we want to treat them as not altering the category of expressions they come into contact with, thus an attributive adjective will have the feature bundle \(\bullet n.n\). We can, observing that adjectives may also appear predicatively, decide to decompose attributive and predicative uses of adjectives, obtaining the attributive frame as the span consisting of the lexical entries: \(\textsf{ATTR}\mathrel{::}\underline{\bullet a}.\bullet n.n\) and \(\textsf{bad}\mathrel{::}a\). Adjectival modifiers, such as very, can then modify the adjective itself: \(\textsf{very}\mathrel{::}\bullet a. a\).

Now the structure of our DP is such that the number head and the determiner can be arbitrarily distant from one another in terms of the length of the path connecting them. We can still encode our filters in the category system: \(\textsf{ATTR}_{ PL}\mathrel{::}\underline{\bullet a}.\bullet n_{ PL}.n_{ PL}\) and \(\textsf{ATTR}_{ SG}\mathrel{::}\underline{\bullet a}.\bullet n_{ SG}.n_{ SG}\). This encoding makes it clear that the information originating in the number head is simply being passed through the \(\textsf{ATTR}\) head, and should really be treated as agreement.5 Note that our current morphological equations continue to work over longer distances, as (for example) \(\textsf{THIS}\) continues to be connected (via a series of merge edges) to the lexical item \(\textsf{PL}\).
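The sense in which \(\textsf{THIS}\) remains connected to the number head can be sketched in Python. The representation is a deliberate simplification of my own: each `Node` records only its head and the dependent merged via its \(\bullet n\) (or \(\bullet N\)) feature, so the adjective hanging off each \(\textsf{ATTR}\) head's \(\bullet a\) edge is left out, and `build` is a hypothetical helper for stacking adjectives.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    head: str                        # e.g. "THIS", "ATTR", "PL", "WOLF"
    n_arg: Optional["Node"] = None   # dependent merged via •n (or •N)


def connected_number(node: Node) -> Optional[str]:
    """Follow the chain of merge edges downward until a number head is found."""
    current = node.n_arg
    while current is not None:
        if current.head in ("SG", "PL"):
            return current.head
        current = current.n_arg
    return None


def spell_this(dp: Node) -> str:
    """these = THIS if •n is connected to PL, this = THIS otherwise."""
    return "these" if connected_number(dp) == "PL" else "this"


def build(number: str, adjectives: int) -> Node:
    """THIS over `adjectives` many ATTR heads over a number head over WOLF."""
    tree = Node(number, Node("WOLF"))
    for _ in range(adjectives):
        tree = Node("ATTR", tree)
    return Node("THIS", tree)


print(spell_this(build("PL", 0)))  # these
print(spell_this(build("PL", 5)))  # these
print(spell_this(build("SG", 3)))  # this
```

The number of intervening \(\textsf{ATTR}\) heads makes no difference to the outcome, and nothing about number needs to be written on the \(\textsf{ATTR}\) heads themselves.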


  1. The numberless stems occur elsewhere - pant legs, or scissor statements. And the English pre-nominal measure construction ‘a three foot wide hole’ requires a non-plural form. If we imagine a group of poacher-tailors griping about how difficult it is to cut through elephant hide with scissors, we might imagine them saying of a very tough pachyderm dermis that it was a ‘four scissor hide’, meaning that it required four pairs of scissors to cut through. They could not say that it was a ‘four scissors hide.’ ↩︎

  2. We can debate where this requirement is enforced, whether in the syntax, or in the semantics. I note that bare bones generalized quantifier theory ignores number distinctions and gives the truth conditions of most in terms of sets of singular entities. ↩︎

  3. Both mass and plural count nouns may be the argument of the determiner most. ↩︎

  4. We faced the same problem previously when we treated nouns and verbal inflection differently. ↩︎

  5. Note that it is trivial to exploit the category system to do weird things: having the equally complex \(\textsf{ATTR}_{ PL \rightarrow SG}\mathrel{::}\underline{\bullet a}.\bullet n_{ PL}.n_{ SG}\) and \(\textsf{ATTR}_{ SG \rightarrow PL}\mathrel{::}\underline{\bullet a}.\bullet n_{ SG}.n_{ PL}\) instead of the ones in the main text would ‘flip’ the number polarity in (I think) unattested ways based on the number of adjectives modifying a noun. We cannot do this if we use the agreement system as it has been presented. ↩︎