We have seen context-free grammar. We now look at context free languages, important properties, and a machine akin to a DFA but that will also accept these CFL’s.

The Pumping Lemma for CFLs

A language accepted by a DFA was infinite if there was a loop present.

We look at the parse tree of a CFG instead. Consider the following.

Screenshot 2023-02-23 at 22.10.50.png

We essentially ‘cut out’ the $R$ subtree, and repasted it below another $R$. We can repeat this process any number of times, such that $uv^ixy^iz$ is in the language $\forall i \in \mathbb{N}_0$, which gives us an infinite number of strings.


This leads us naturally to a pumping lemma for CFL’s.

<aside> 🤔 If a derived string is ‘too long’, it must repeat a non-terminal somewhere inside the parse tree.

</aside>

Note we can also ‘pump down’ in the above example.

Note we can also ‘pump down’ in the above example.

The Lemma

If $L$ is a CFL, the $\exists m > 0$ such that

Pumping length

What should the pumping length $m$ be? Well, for every string in $L$ of length $\geq m$, we want the parse tree for the string to have a root-leaf path where a variable is repeated, in an ancestor-decendant relationship.

  1. Note that the length of the longest ‘string’ on the RHS of a production rule equates to the most number of children any node in the parse tree can have.

    <aside> 💕 CNF therefore gives a binary parse tree!

    </aside>

  2. If we have the most number of children as $b$, and the height of our parse tree as $h$, then we can say the most number of leaves our tree has is $b^h$