Colloquium
Course Information
Time | Mo: 0915-1045 (H1 5.16) |
Module | N/A |
Instructor | Greg Kobele (GWZ H1 5.11) |
Course Log
We discussed how to recognize the language aⁿbⁿ with a state machine. We saw that the infinite-state machine we would write naively (following the general schema for finite state machines) can be factored into a simple two-state FSA with access to a counter (or box) of indistinguishable pebbles.
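A minimal sketch of this idea (my own illustrative code, not the exact in-class construction): two control states, plus an integer standing in for the box of pebbles.

```python
# Two-state machine with a pebble counter, recognizing a^n b^n.
# State "A": still reading a's; state "B": reading b's.
def accepts(word):
    state = "A"
    pebbles = 0                     # the box of indistinguishable pebbles
    for ch in word:
        if state == "A" and ch == "a":
            pebbles += 1            # drop a pebble for each a
        elif ch == "b":
            state = "B"
            if pebbles == 0:
                return False        # more b's than a's
            pebbles -= 1            # remove a pebble for each b
        else:
            return False            # an a after a b, or a bad symbol
    return pebbles == 0             # accept iff the box is empty again
```

For example, `accepts("aabb")` holds while `accepts("aab")` and `accepts("abab")` do not.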
No class: Dies Academicus
In our construction last time, we appealed to the ability to determinize machines, and to construct a machine recognizing the intersection of the languages of two given ones. We presented the product construction on finite state automata, and showed how this could be used to prove closure under union and intersection. We also saw how determinization (i.e. a power-set construction) was based on a similar idea.
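The product construction can be sketched as follows (my own representation of a DFA as a tuple of states, start state, final states, and a transition dict; the two machines are assumed total over the same alphabet).

```python
# Product construction: run two DFAs in parallel by pairing their states.
def product(d1, d2, mode):
    """DFA for the intersection ("and") or union ("or") of two DFA languages."""
    (Q1, s1, F1, t1), (Q2, s2, F2, t2) = d1, d2
    alphabet = {a for (_, a) in t1}
    Q = {(p, q) for p in Q1 for q in Q2}
    delta = {((p, q), a): (t1[p, a], t2[q, a])
             for (p, q) in Q for a in alphabet}
    if mode == "and":   # accept iff both components accept
        F = {(p, q) for (p, q) in Q if p in F1 and q in F2}
    else:               # accept iff at least one component accepts
        F = {(p, q) for (p, q) in Q if p in F1 or q in F2}
    return (Q, (s1, s2), F, delta)

def run(dfa, word):
    Q, s, F, delta = dfa
    for a in word:
        s = delta[s, a]
    return s in F
```

With `d1` accepting words with an even number of a's and `d2` accepting words ending in b, `product(d1, d2, "and")` accepts "aab" but not "ab". The power-set construction mentioned above uses the same idea, except that a DFA state is a *set* of NFA states rather than a pair.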
We discussed the potential exponential size increase, and saw that this is unavoidable in the worst case (based on the family of languages Lᵢ, consisting of all words whose i-th letter from the end is 1). We discussed the role of succinctness arguments in linguistics (as a rough proxy for learnability), and talked about the broader landscape of succinctness results in formal language theory.
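The blowup can be watched directly (my own demonstration code, taking Lₙ in its usual form: words over {0,1} whose n-th letter from the end is 1). The NFA has n+1 states, but the subset construction reaches 2ⁿ DFA states.

```python
# Subset construction on the (n+1)-state NFA for L_n, counting
# how many DFA states (i.e. sets of NFA states) become reachable.
def dfa_size(n):
    # NFA: state 0 loops on 0/1 and guesses the marked position on a 1;
    # states k < n advance on either letter; state n is accepting.
    def step(states, a):
        nxt = set()
        for q in states:
            if q == 0:
                nxt.add(0)
                if a == 1:
                    nxt.add(1)
            elif q < n:
                nxt.add(q + 1)
        return frozenset(nxt)

    start = frozenset({0})
    seen, frontier = {start}, [start]
    while frontier:                      # explore all reachable subsets
        s = frontier.pop()
        for a in (0, 1):
            t = step(s, a)
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return len(seen)
```

Here `dfa_size(n)` comes out to 2ⁿ: the DFA must remember, for each of the last n positions, whether it carried a 1.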
We discussed how to construct a finite state automaton recognizing the language defined by an open MSO formula. Crucially, free variables are incorporated into the alphabet of the machine: if the variables x, y, and z are free, then the alphabet consists of quadruples of an alphabet symbol together with three booleans, each indicating whether the corresponding variable refers to that position.
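The extended-alphabet encoding can be made concrete (my own sketch; `annotate` and its valuation format are illustrative, not standard names). A word plus a valuation for the free variables x, y, z becomes a word of quadruples:

```python
# Encode a word together with a valuation of the first-order variables
# x, y, z as a word over the extended alphabet Sigma x {0,1}^3.
def annotate(word, valuation):
    """valuation maps each of 'x', 'y', 'z' to a position in word."""
    return [(a, *(valuation[v] == i for v in "xyz"))
            for i, a in enumerate(word)]
```

For instance, `annotate("ab", {'x': 0, 'y': 1, 'z': 1})` yields `[('a', True, False, False), ('b', False, True, True)]`: x points at the first position, y and z both at the second.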
We showed how to move from a finite state automaton to an MSO formula describing the same language. In effect, the formula assigns a state to each letter, and checks that this assignment is in accordance with the transitions of the machine.
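One common shape for this translation (conventions vary; here X_q collects the positions after reading which the machine is in state q, for a DFA with states q₁,…,q_k, initial state q₀, transition function δ, and final states F, over nonempty words):

```latex
\exists X_{q_1}\cdots\exists X_{q_k}\Big(
  \forall x\,\bigvee_{q\in Q}\big(x\in X_q \wedge \bigwedge_{p\neq q} x\notin X_p\big)
  \;\wedge\; \forall x\,\big(\mathrm{first}(x)\rightarrow
      \bigvee_{a\in\Sigma}(P_a(x)\wedge x\in X_{\delta(q_0,a)})\big)
  \;\wedge\; \forall x\,\forall y\,\big(\mathrm{consecutive}(x,y)\rightarrow
      \bigvee_{q\in Q}\bigvee_{a\in\Sigma}(x\in X_q\wedge P_a(y)\wedge y\in X_{\delta(q,a)})\big)
  \;\wedge\; \forall x\,\big(\mathrm{last}(x)\rightarrow \bigvee_{q\in F} x\in X_q\big)\Big)
```

The four conjuncts say, in order: every position is assigned exactly one state; the first position's state respects the initial transition; consecutive positions respect δ; and the run ends in a final state.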
We continued discussing MSO, and went over the formal definition of satisfaction (a model M satisfies a formula ϕ with respect to a valuation V, written M,V ⊧ ϕ). We saw how to define various useful predicates (like first, last, consecutive, etc.).
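These predicates can be spelled out as follows (a sketch assuming the order relation < on positions as primitive; exact conventions vary):

```latex
\mathrm{first}(x) \;:=\; \neg\exists y\,(y < x) \\
\mathrm{last}(x) \;:=\; \neg\exists y\,(x < y) \\
\mathrm{consecutive}(x,y) \;:=\; x < y \;\wedge\; \neg\exists z\,(x < z \wedge z < y)
```

So first(x) holds of the position with nothing before it, last(x) of the position with nothing after it, and consecutive(x,y) when y is the immediate successor of x.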
We introduced monadic second order logic, and saw how we could use MSO formulae to describe (sets of) strings. The German wikipedia page gives a precise semantic interpretation to formulae, for those who speak it. (The English page gives an overview of important issues.) This paper by Wolfgang Thomas provides much more detail.
We enumerated a number of ways of defining classes of languages:
- grammars
- machines
- logic
- algebra
- language theoretically
We exemplified three of these today.
- machines
- we defined finite state machines as consisting of a finite set of states Q, a finite alphabet Σ, a designated initial state q₀, a set of final states F, and a transition function δ : Q × Σ → Q
- algebra
- we defined regular expressions as terms built over the signature of nullary constants 0, 1, a (for each alphabet symbol a), the unary symbol ·^∗, and the binary symbols · and +.
- language theoretically
- we defined the notion of the set of tails of a string with respect to a language L, and the Nerode equivalence relation: two strings are equivalent iff they have the same set of tails. We defined the property of this relation having only finitely many distinct equivalence classes.
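The tails construction can be explored computationally (my own illustrative code; since the true set of tails is infinite, this approximates it by only testing tails up to a bounded length, which suffices for small examples):

```python
from itertools import product

# Approximate the Nerode classes of a language: group strings (up to
# length max_len) by their set of tails among strings up to length k.
def nerode_classes(in_L, alphabet, max_len=4, k=4):
    words = [''.join(p) for n in range(max_len + 1)
             for p in product(alphabet, repeat=n)]
    suffixes = [''.join(p) for n in range(k + 1)
                for p in product(alphabet, repeat=n)]
    classes = {}
    for u in words:
        # the (bounded) set of tails of u with respect to L
        tails = frozenset(v for v in suffixes if in_L(u + v))
        classes.setdefault(tails, []).append(u)
    return list(classes.values())
```

A language like "even number of a's" yields two classes, matching its two-state minimal DFA, while aⁿbⁿ keeps producing new classes as longer strings are considered, witnessing its non-regularity.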
We decided to use this colloquium to acquaint ourselves with the literature on formal language theory. I began by introducing grammars as sets of rewrite rules of the general form α → β. I noted that we could put syntactic restrictions on the form of these rules, and mentioned a few:
Name | Restriction |
---|---|
type 0 | none |
type 1 | α must not be longer than β |
type 2 | α must be a single nonterminal |
type 3 | type 2 + β contains at most one nonterminal, and this must be right peripheral |
It is easy to see that every type 3 grammar is also a type 2 grammar, and that grammars of any type are also of type 0. We showed that every type 2 language is also a type 1 language by observing that the type 2 rules which are not of type 1 are exactly the rules of the form A → ε, and that these rules can be systematically eliminated by adding new rules that shadow old ones but omit certain instances of A on the right-hand side.
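One step of this elimination can be sketched as follows (my own code; a full algorithm first computes all nullable nonterminals and repeats the step, and must treat the empty word in L separately):

```python
# Eliminate a single rule A -> epsilon: every rule mentioning A on the
# right is shadowed by copies that omit some occurrences of A.
def eliminate_epsilon(rules, A):
    """rules: a set of (lhs, rhs) pairs, with rhs a tuple of symbols."""
    new_rules = set()
    for lhs, rhs in rules:
        if lhs == A and rhs == ():
            continue                       # drop A -> epsilon itself
        positions = [i for i, s in enumerate(rhs) if s == A]
        # one copy of the rule for every subset of A-occurrences dropped
        for mask in range(2 ** len(positions)):
            drop = {positions[j] for j in range(len(positions))
                    if mask >> j & 1}
            new_rhs = tuple(s for i, s in enumerate(rhs) if i not in drop)
            if new_rhs:                    # don't introduce new eps-rules
                new_rules.add((lhs, new_rhs))
    return new_rules
```

For instance, from S → aAbA and A → ε the step produces the shadowing rules S → aAb, S → abA, and S → ab alongside the original.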