A Proof of the CFL Pumping Lemma using Chomsky Normal Form

Theorem (The CFL Pumping Lemma):

For any context-free language $\QTR{Large}{A}$ , there is a number $\QTR{Large}{p}$ such that if MATHand MATH

then $\QTR{Large}{s}$ can be written as a concatination of strings $\QTR{Large}{xuyvz}$ such that:

  1. MATHfor all MATH

  2. MATH. That is either MATH or MATHor both are

  3. MATH

An application followed by the proof itself:

Application: MATH MATHis not a CFL.

Proof:

If it were, applying the Theorem, neither $\QTR{Large}{u}$ nor $\QTR{Large}{v}$ can be a mix of $\QTR{Large}{a}$ s, $\QTR{Large}{b}$ s, and or $\QTR{Large}{c}$ s. Since, for example if

MATH MATHthen MATHwould contain MATH MATH MATHwhich is not in the CFL.

Thus , say MATH and MATH. By 2, we know $\QTR{Large}{j}$ or $\QTR{Large}{k>0\ }$ . Again MATH is not in the CFL.

since the number of $\QTR{Large}{a}$ s or $\QTR{Large}{b}$ s, or both differ from the number of $\QTR{Large}{c}$ s

Exercise (Due April 25): Show MATH MATHis not a CFL.

Answer:

Applying the Theorem, neither $\QTR{Large}{u}$ nor $\QTR{Large}{v}$ can be a mix of $\QTR{Large}{a}$ s or, $\QTR{Large}{b}$ s,but at least one must be of length $\QTR{Large}{>0}$.

Again, going through the various cases,

if for example MATH and MATHin the first MATH and MATHthenMATH is not in the CFL




The Proof of the Theorem:

Let $\QTR{Large}{G}$ be a grammer in Chomsky Normal Form generating $\QTR{Large}{A}$ Since the two possible forms that a non-null production can take is

  1. MATH where MATHbut neither $\QTR{Large}{u\ }$nor $\QTR{Large}{w\ }$ can be $\QTR{Large}{S}$

  2. .MATH where MATHand $\QTR{Large}{u\ }$is inMATH

Any parse tree generated from $\QTR{Large}{G}$ whose longest path length is$\ \QTR{Large}{j}$ can recognize a string of length at most MATH

Assuming $\QTR{Large}{G\ }$has $\QTR{Large}{n}$ nonterminals pick a string MATHsuch that MATH. Any parse tree for $\QTR{Large}{g}$ must contain a path of length greater than $\QTR{Large}{n}$ with at least one nonterminal $\QTR{Large}{q}$ occuring more than once. The derivation would look like.$\medskip $

MATH.$\medskip $

The proof follows by a simple induction on $\QTR{Large}{i\ }$. For example:

MATH

The proof of $3.$amounts to a judicious choce of $\QTR{Large}{q}$, and of the "repeat" path .

MATH

Choosing the shortest, it cannot have a sub- repeat so its length is MATH and given that Chomsky Normal Grammer strings can only add one with each additional path step MATH