That finite-state machines and regular expressions define the same class of languages is generally know as Kleene's Theorem. Its importance lies in the fact both descriptions provide important insights into the nature of these languages. Regular expressions have practical applications in computer science and finite-state machines allow us to answer general questions about the structure of these languages.
Given a finite symbol set we let be the set of strings in .
Given a string , we write where s is the concatination of
We write for .
In this Notation, for , where for and for .
Finally,
Let be a DFA . We can extend to by setting if
and there are states such that for all .
In this notation, the definition recognizes can be written " if and only if and is an accepting state
Lemma: The set of Regular Languages in are closed under the laws of Boolean Algebra.
Proof:
Let be recognized by Let It is immediate that recognizes
since if and only if and and if and only if and
Thus if and are Regular so are and . As is from an earlier discussion, , as is .
The Pumping Lemma:
states that if is a regular language then there is a number the number of states in ,such that if is any string in with then , the concatination of three strings, such that
is in for all
The proof is a simple counting argument. Let be a DFA that recognizes .
Let be recognized by and let Either
for some
or
there must be an and a with such that
Let
in case 1. or in case 2.
To complete the argument, setting by assumption . . But, also by assumption, is an accepting state so
is an accepting state.
The important generalization is that transitions are by any regular expression, not only members of or . The definition in the text is however tecnically not a generalization of a NDFA since it is limited to a single accepting state, not equal to the start state. NDFAs may have more than one accept state. The definition in the book has been called a "Reduced Generalized Nonderministic Finite Automaton."
It is also the case that the transition function does not take values in the set of states . Rather it takes values in the set of Regular Expressions over the alphabet Hence the definion of recognition must be appropriately modified. In particular, is recognized if we can write where the are strings, and find such that An understanding of this definition is at the hearth of an understanding of the "State Ripping" proceedure that is used to reduce any GNFA to one with two states.
Note that requires that there is a unique Regular expression transition between any pair of states in
including self-transitions
This diagram which was "ripped" from Van Dam's notes is at the heart of the reduction process.
Note that at each stage, the reduction step is applied simultaneously all
pairs of states that are part of the reduced automaton.
That the reduced automaton recognizes the same Language follows easily from
definition 2. above
This diagram is from the text. Homework: Due Sept 20 Problem 1.21 (b)