Antlr - Parser Rule

> Antlr (ANother Tool for Language Recognition)

1 - About

Compiler - Parser Rule in Antlr.

A parser rule is the second of the two kinds of rule in an Antlr grammar; the first is the lexer rule.

Lexer rules specify the tokens, whereas parser rules specify the structure of the parse tree.


3 - Syntax

Parser rule names always start with a lowercase letter, whereas lexer rule names (also known as token names) must begin with an uppercase letter.
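The naming convention can be checked mechanically. The following is a minimal sketch; the class RuleNames and its two methods are hypothetical helpers written for this page, not part of the ANTLR API.

```java
// Hypothetical helper (not part of ANTLR): classifies a rule name
// by the case of its first character.
final class RuleNames {
    // Parser rule names start with a lowercase letter.
    static boolean isParserRuleName(String name) {
        return !name.isEmpty() && Character.isLowerCase(name.charAt(0));
    }

    // Lexer rule (token) names start with an uppercase letter.
    static boolean isLexerRuleName(String name) {
        return !name.isEmpty() && Character.isUpperCase(name.charAt(0));
    }

    public static void main(String[] args) {
        System.out.println("retstat is a parser rule name: " + isParserRuleName("retstat"));
        System.out.println("INT is a lexer rule name: " + isLexerRuleName("INT"));
    }
}
```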

3.1 - Basic

The simplest rule is a rule name followed by a colon, a single alternative, and a terminating semicolon.

retstat : 'return' expr ';' ;

3.2 - Alternative

Rules can also have alternatives separated by the | operator:

stat: retstat
    | 'break' ';'
    | 'continue' ';'
    ;
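To give an intuition for what alternatives mean, here is a hand-written recursive-descent sketch of the stat and retstat rules. It is not the code ANTLR generates: the class StatRecognizer is hypothetical, the input is assumed to be already split into tokens, and expr is simplified to a single token for brevity.

```java
// Hand-written sketch, not ANTLR output. It recognizes
//   stat    : retstat | 'break' ';' | 'continue' ';' ;
//   retstat : 'return' expr ';' ;
// with expr simplified to a single token (an assumption made for brevity).
final class StatRecognizer {
    private final String[] tokens;
    private int pos = 0;

    StatRecognizer(String... tokens) {
        this.tokens = tokens;
    }

    // Returns the current token, or "<EOF>" once the input is exhausted.
    private String peek() {
        return pos < tokens.length ? tokens[pos] : "<EOF>";
    }

    // Consumes the current token if it equals the expected text.
    private void match(String expected) {
        if (!peek().equals(expected)) {
            throw new IllegalStateException("expected " + expected + " but found " + peek());
        }
        pos++;
    }

    // stat: one branch per alternative, chosen by the lookahead token.
    // Returns the name of the alternative that matched.
    String stat() {
        switch (peek()) {
            case "return":
                retstat();
                return "retstat";
            case "break":
                match("break");
                match(";");
                return "break";
            case "continue":
                match("continue");
                match(";");
                return "continue";
            default:
                throw new IllegalStateException("no viable alternative at " + peek());
        }
    }

    // retstat : 'return' expr ';' ;
    private void retstat() {
        match("return");
        expr();
        match(";");
    }

    // Placeholder for expr: accepts any single token except ';' and end of input.
    private void expr() {
        if (peek().equals(";") || peek().equals("<EOF>")) {
            throw new IllegalStateException("expected an expression but found " + peek());
        }
        pos++;
    }

    public static void main(String[] args) {
        System.out.println(new StatRecognizer("return", "x", ";").stat());
        System.out.println(new StatRecognizer("break", ";").stat());
    }
}
```

Each alternative of the rule becomes one branch, and a nested rule reference such as retstat becomes a method call.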

3.2.1 - Label and tree event

Labeling the outermost alternatives of a rule with the # operator makes ANTLR generate a separate parse-tree listener event for each label.

Either all alternatives within a rule must be labeled, or none of them.

Example: Two rules with alternatives

grammar A;
stat: 'return' e ';' # Return
    | 'break' ';' # Break
    ;
e   : e '*' e # Mult
    | e '+' e # Add
    | INT # Int
    ;

Alternative labels do not have to be at the end of the line and there does not have to be a space after the # symbol.

ANTLR generates a rule context class definition for each label. For example, here is the listener that ANTLR generates:

public interface AListener extends ParseTreeListener {
    void enterReturn(AParser.ReturnContext ctx);
    void exitReturn(AParser.ReturnContext ctx);
    void enterBreak(AParser.BreakContext ctx);
    void exitBreak(AParser.BreakContext ctx);
    void enterMult(AParser.MultContext ctx);
    void exitMult(AParser.MultContext ctx);
    void enterAdd(AParser.AddContext ctx);
    void exitAdd(AParser.AddContext ctx);
    void enterInt(AParser.IntContext ctx);
    void exitInt(AParser.IntContext ctx);
}

There are enter and exit methods associated with each labeled alternative. The parameters to those methods are specific to alternatives.

You can reuse the same label on multiple alternatives to indicate that the parse tree walker should trigger the same event for those alternatives. For example, here’s a variation on rule e from grammar A above:

e   : e '*' e # BinaryOp
    | e '+' e # BinaryOp
    | INT # Int
    ;

ANTLR would generate the following listener methods for e:

void enterBinaryOp(AParser.BinaryOpContext ctx);
void exitBinaryOp(AParser.BinaryOpContext ctx);
void enterInt(AParser.IntContext ctx);
void exitInt(AParser.IntContext ctx);
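
Because both operator alternatives now trigger the same event, a listener has to find out for itself which operator was matched. The sketch below shows the idea with a small stack-based evaluator. It is self-contained and does not use the ANTLR runtime: BinaryOpContext and IntContext are simplified stand-ins for the generated context classes, and the fields op and value are assumptions made for illustration. How the operator is actually read from a real context depends on the grammar.

```java
// Self-contained sketch: the nested context classes are stand-ins for the
// classes ANTLR would generate, not the real generated code.
final class BinaryOpListenerSketch {
    // Stand-in for the generated BinaryOpContext; 'op' is an assumed field.
    static final class BinaryOpContext {
        final String op;
        BinaryOpContext(String op) { this.op = op; }
    }

    // Stand-in for the generated IntContext; 'value' is an assumed field.
    static final class IntContext {
        final int value;
        IntContext(int value) { this.value = value; }
    }

    // Operand stack; fully qualified so the sketch needs no imports.
    private final java.util.ArrayDeque<Integer> stack = new java.util.ArrayDeque<>();

    // Leaving an Int node pushes its value.
    void exitInt(IntContext ctx) {
        stack.push(ctx.value);
    }

    // One event serves both '*' and '+', so the method branches on the operator.
    void exitBinaryOp(BinaryOpContext ctx) {
        int right = stack.pop();
        int left = stack.pop();
        switch (ctx.op) {
            case "*":
                stack.push(left * right);
                break;
            case "+":
                stack.push(left + right);
                break;
            default:
                throw new IllegalArgumentException("unknown operator " + ctx.op);
        }
    }

    // Value left on top of the stack after all exit events have fired.
    int result() {
        return stack.peek();
    }

    public static void main(String[] args) {
        // Exit events simulated by hand for 2 + 3 * 4, assuming the tree
        // groups 3 * 4 first: Int(2), Int(3), Int(4), BinaryOp(*), BinaryOp(+).
        BinaryOpListenerSketch listener = new BinaryOpListenerSketch();
        listener.exitInt(new IntContext(2));
        listener.exitInt(new IntContext(3));
        listener.exitInt(new IntContext(4));
        listener.exitBinaryOp(new BinaryOpContext("*"));
        listener.exitBinaryOp(new BinaryOpContext("+"));
        System.out.println(listener.result());
    }
}
```

With distinct labels such as Mult and Add, the branch on the operator would be unnecessary, since each alternative gets its own method.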

3.3 - Optional

Alternatives are either a list of rule elements or empty. For example, here’s a rule with an empty alternative that makes the entire rule optional:

superClass
    : 'extends' ID
    | // empty means other alternative(s) are optional
    ;
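
The effect of the empty alternative can be pictured with a hand-written sketch. It is not ANTLR-generated code: the class SuperClassSketch and its method are hypothetical, and the input is assumed to be already split into tokens.

```java
// Hand-written sketch (not ANTLR-generated code) of
//   superClass : 'extends' ID | /* empty */ ;
final class SuperClassSketch {
    // Returns the ID following 'extends' when the first alternative matches at pos,
    // or null when the empty alternative is taken and no token is consumed.
    static String superClass(String[] tokens, int pos) {
        if (pos < tokens.length && tokens[pos].equals("extends")) {
            if (pos + 1 >= tokens.length) {
                throw new IllegalStateException("expected ID after 'extends'");
            }
            return tokens[pos + 1];
        }
        return null; // empty alternative: the rule matches without consuming input
    }

    public static void main(String[] args) {
        System.out.println(superClass(new String[] {"extends", "Base", "{"}, 0));
        System.out.println(superClass(new String[] {"{"}, 0));
    }
}
```

The rule succeeds in both cases; only the first one consumes tokens.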

4 - Example

url : authority '://' login? host (':' port)? ('/' path)? ('?' search)? ;
authority : STRING;
hostname : STRING ('.' STRING)* ;
host_ip : DIGITS '.' DIGITS '.' DIGITS '.' DIGITS ;
host : hostname | host_ip ;
path: STRING ('/' STRING)* ;
user: STRING;
port: DIGITS;

5 - Documentation / Reference