Chomsky normal form
Context-free grammars can differ in their rule sets yet generate the same language. From the set of such equivalent grammars, some are easier to reason about.
One type of context-free grammar that is convenient to work with is the Chomsky normal form.
Chomsky normal form
A context-free grammar is in Chomsky normal form iff the RHS of every rule is either:
- two symbols that both appear as LHS symbols in another rule excluding the start rule, or
- one symbol that does not appear as a LHS symbol.
The above definition is non-standard. The more common definition is on the back side.
There is an important theorem regarding Chomsky normal form below. Can you remember it?
Any context-free language is generated by a context-free grammar in Chomsky normal form.
Chomsky normal form (standard definition)
A context-free grammar is in Chomsky normal form iff every rule is of the form:
where \( a \) is a terminal symbol, \( S \) is the start symbol and \( A, B \) and \( C \) are non-terminal symbols.
It is assumed that \( \varepsilon \) can appear on the RHS of a rule.