For any context-free language
, there is a number
such that if
and
then
can be written as a concatination of strings
such that:
for
all
.
That is either
or
or
both are
An application followed by the proof itself:
Application:
is
not a CFL.
Proof:
If it were, applying the Theorem, neither
nor
can
be a mix of
s,
s,
and or
s.
Since, for example if
then
would
contain
which
is not in the CFL.
Thus , say
and
.
By 2, we know
or
. Again
is not in the CFL.
since the number of
s
or
s,
or both differ from the number of
s
Exercise (Due April 25): Show
is
not a CFL.
Answer:
Applying the Theorem, neither
nor
can
be a mix of
s
or,
s,but
at least one must be of length
.
Again, going through the various cases,
if for example
and
in
the first
and
then
is not in the CFL
The Proof of the Theorem:
Let
be a grammer in Chomsky Normal Form generating
Since the two possible forms that a non-null production can take is
where
but
neither
nor
can be
.
where
and
is in
Any parse tree generated from
whose longest path length
is
can recognize a string of length at most
Assuming
has
nonterminals pick a string
such
that
.
Any parse tree for
must contain a path of length greater than
with at least one nonterminal
occuring more than once. The derivation would look
like.
.
The proof follows by a simple induction on
.
For example:
The proof of
amounts
to a judicious choce of
,
and of the "repeat" path .
Choosing the shortest, it cannot have a sub- repeat so its length is
and given that Chomsky Normal Grammer strings can only add one with each
additional path step