Simplifying Context Free Grammars
The definition of context free grammars (CFGs) allows us to develop a wide variety of grammars. Most of the time, some of the productions of CFGs are not useful and are redundant. This happens because the definition of CFGs does not restrict us from making these redundant productions.
By simplifying CFGs we remove all these redundant productions from a grammar , while keeping the transformed grammar equivalent to the original grammar. Two grammars are called equivalent if they produce the same language. Simplifying CFGs is necessary to later convert them into Normal forms.
Types of redundant productions and the procedure of removing them are mentioned below.
1. Useless productions β The productions that can never take part in derivation of any string , are called useless productions. Similarly , a variable that can never take part in derivation of any string is called a useless variable. For eg.
S -> abS | abA | abB A -> cd B -> aB C -> dc
In the example above , production βC -> dcβ is useless because the variable βCβ will never occur in derivation of any string. The other productions are written in such a way that variable βCβ can never reached from the starting variable βSβ.
Production βB ->aBβ is also useless because there is no way it will ever terminate . If it never terminates , then it can never produce a string. Hence the production can never take part in any derivation.
To remove useless productions , we first find all the variables which will never lead to a terminal string such as variable βBβ. We then remove all the productions in which variable βBβ occurs.
So the modified grammar becomes β
S -> abS | abA A -> cd C -> dc
We then try to identify all the variables that can never be reached from the starting variable such as variable βCβ. We then remove all the productions in which variable βCβ occurs.
The grammar below is now free of useless productions β
S -> abS | abA A -> cd
2. ? productions β The productions of type βA -> ?β are called ? productions ( also called lambda productions and null productions) . These productions can only be removed from those grammars that do not generate ? (an empty string). It is possible for a grammar to contain null productions and yet not produce an empty string.
To remove null productions , we first have to find all the nullable variables. A variable βAβ is called nullable if ? can be derived from βAβ. For all the productions of type βA -> ?β , βAβ is a nullable variable. For all the productions of type βB -> A1A2β¦An β , where all βAiβs are nullable variables , βBβ is also a nullable variable.
After finding all the nullable variables, we can now start to construct the null production free grammar. For all the productions in the original grammar , we add the original production as well as all the combinations of the production that can be formed by replacing the nullable variables in the production by ?. If all the variables on the RHS of the production are nullable , then we do not add βA -> ?β to the new grammar. An example will make the point clear. Consider the grammar β
S -> ABCd (1) A -> BC (2) B -> bB | ? (3) C -> cC | ? (4)
Lets first find all the nullable variables. Variables βBβ and βCβ are clearly nullable because they contain β?β on the RHS of their production. Variable βAβ is also nullable because in (2) , both variables on the RHS are also nullable. So variables βAβ , βBβ and βCβ are nullable variables.
Lets create the new grammar. We start with the first production. Add the first production as it is. Then we create all the possible combinations that can be formed by replacing the nullable variables with ?. Therefore line (1) now becomes βS -> ABCd | ABd | ACd | BCd | Ad | Bd |Cd | d β.We apply the same rule to line (2) but we do not add βA -> ?β even though it is a possible combination. We remove all the productions of type βV -> ?β. The new grammar now becomes β
S -> ABCd | ABd | ACd | BCd | Ad | Bd |Cd | d A -> BC | B | C B -> bB | b C -> cC | c
3. Unit productions β The productions of type βA -> Bβ are called unit productions.
To create a unit production free grammar βGufβ from the original grammar βGβ , we follow the procedure mentioned below.
First add all the non-unit productions of βGβ in βGufβ. Then for each variable βAβ in grammar βGβ , find all the variables βBβ such that βA *=> Bβ. Now , for all variables like βA β and βBβ, add βA -> x1 | x2 | β¦xnβ to βGufβ where βB -> x1 | x2 | β¦xn β is in βGufβ . None of the x1 , x2 β¦ xn are single variables because we only added non-unit productions in βGufβ. Hence the resultant grammar is unit production free. For eg.
S -> Aa | B A -> b | B B -> A | a
Lets add all the non-unit productions of βGβ in βGufβ. βGufβ now becomes β
S -> Aa A -> b B -> a
Now we find all the variables that satisfy βX *=> Zβ. These are βS*=>Bβ, βA *=> Bβ and βB *=> Aβ. For βA *=> Bβ , we add βA -> aβ because βB ->aβ exists in βGufβ. βGufβ now becomes
S -> Aa A -> b | a B -> a
For βB *=> Aβ , we add βB -> bβ because βA -> bβ exists in βGufβ. The new grammar now becomes
S -> Aa A -> b | a B -> a | b
We follow the same step for βS*=>Bβ and finally get the following grammar β
S -> Aa | b | a A -> b | a B -> a | b
Now remove B -> a|b , since it doesnt occur in the production βSβ, then the following grammar becomes,
S->Aa|b|a A->b|a
Note: To remove all kinds of productions mentioned above, first remove the null productions, then the unit productions and finally , remove the useless productions. Following this order is very important to get the correct result.