Lexical Analysis - (Syntax analysis|Parsing|Parser)

> Code - (Programming|Computer) Language > Computer Language - (Compiler|Interpreter) - Language translator

1 - About

A parser takes a token stream (emitted by a lexical analyzer) as input and based on the rules declared in the grammar (which define the syntactic structure of the source) produces a parse tree data structure.

A parser is generally generated from the grammar. See Language - Compiler compilers or (lexer|parser) generators

A parser is the component of a compiler that deals with the recursively nested features such as expressions arithmetic, conditional, and so on). Example of recursive grammatical rule a = a + a

Parsing is done generally at the token level but can be done at the character level when lexer and parser are done in one step: See Scannerless parsing

Syntax analysis is also known as:

  • Sentence recognition

In the process, the syntax will be checked.

Additional step can be added to the parse phase in order to construct an Abstract Syntax Tree (AST) from the parse tree.

The term parsing comes from Latin pars (orationis), meaning part (of speech)


3 - Type

3.1 - LL

An LL(k) parser :

  • is a top-down parser
  • that parses from left to right,
  • constructs a leftmost derivation of the input
  • and looks ahead k tokens when selecting between the grammar rule alternatives)

The * means any number of lookahead tokens.

LL parsers can't handle left-recursive rules.

The LL grammars is easier to understand than LR grammars.

3.2 - LR

LR(k) means: Left to right, Rightmost derivation parser.

LR(k), is:

  • a bottom-up parser
  • that parses from left to right
  • and constructs a rightmost derivation of the input.

3.2.1 - LALR

4 - Example of parsing algorithm

5 - Documentation / Reference