Grammar

A grammar defining formal language L is a quadruple (N,T,R,S), where N is a finite set of nonterminals, T is a finite set of terminal symbols, R is a finite set of productions, and S is an element of N.

The set T of terminal symbols is L’s alphabet. Nonterminals are symbols representing language constructs. The sets N and T should not intersect. S is called the start symbol. Productions are rules of the form: alpha->beta, where both alpha and beta are strings of terminals and nonterminals, alpha contains at least one nonterminal.

Sentential forms for grammar G=(N,T,R,S) are defined by the following rules: S is a sentential form and if alphabetagamma is a sentential form and production beta->delta belongs to R, then alphadeltagamma is a sentential form as well.

L is the set of all strings which are sentential forms consisting entirely of terminal symbols. For a language defined by a grammar, recognition whether a given string (expression) belongs to that language is, in general, a non-trivial task. All languages defined by grammars are recursively enumerable sets.

1. A grammar G is called right linear if all its productions have the form A->alphaB or A->alpha, where A,B in N and alpha is a string of terminal symbols.

2. A grammar G is called context-free if all its productions have the form A->alpha, where A in N and alpha is a string of terminal and nonterminal symbols.

3. A grammar G is called context-sensitive if all its productions have the form alpha->beta, where both alpha and beta are strings of terminal and nonterminal symbols and the length of alpha is not more than the length of beta.

4. A grammar G is called unrestricted if it does not belong to categories 1 through 3.

This hierarchy of grammars was introduced by N. Chomsky. The set of languages defined by grammars of every category is a proper superset of that for the previous category. The languages defined by grammars of categories 1 through 3 are recursive sets. A language can be defined by a grammar of category 1 iff it is defined by a regular expression.

A grammar defining formal language L is a quadruple (N,T,R,S), where N is a finite set of nonterminals, T is a finite set of terminal symbols, R is a finite set of productions, and S is an element of N.

The set T of terminal symbols is L’s alphabet. Nonterminals are symbols representing language constructs. The sets N and T should not intersect. S is called the start symbol. Productions are rules of the form: alpha->beta, where both alpha and beta are strings of terminals and nonterminals, alpha contains at least one nonterminal.

Sentential forms for grammar G=(N,T,R,S) are defined by the following rules: S is a sentential form and if alphabetagamma is a sentential form and production beta->delta belongs to R, then alphadeltagamma is a sentential form as well.

L is the set of all strings which are sentential forms consisting entirely of terminal symbols. For a language defined by a grammar, recognition whether a given string (expression) belongs to that language is, in general, a non-trivial task. All languages defined by grammars are recursively enumerable sets.

1. A grammar G is called right linear if all its productions have the form A->alphaB or A->alpha, where A,B in N and alpha is a string of terminal symbols.

2. A grammar G is called context-free if all its productions have the form A->alpha, where A in N and alpha is a string of terminal and nonterminal symbols.

3. A grammar G is called context-sensitive if all its productions have the form alpha->beta, where both alpha and beta are strings of terminal and nonterminal symbols and the length of alpha is not more than the length of beta.

4. A grammar G is called unrestricted if it does not belong to categories 1 through 3.

This hierarchy of grammars was introduced by N. Chomsky. The set of languages defined by grammars of every category is a proper superset of that for the previous category. The languages defined by grammars of categories 1 through 3 are recursive sets. A language can be defined by a grammar of category 1 iff it is defined by a regular expression.