Multilingual Regular Expression Syntax (Pattern)
1 - About
This section talks about the Regular Expression Syntax.
Stephen Kleene was the fellow who invented regular expressions and showed that they describe the same languages that finite automata describe. Most regexp-packages transform a regular expression into an automaton based on nondeterministic automata (NFA). The automaton does then the parse work.
2 - Syntax
A regular expression is composed of:
See meta to get an overview of the most important regexp (symbols|token) that have a meaning in the context of regular expression.
3 - Grammar
The syntax can differ from one implementation to an other.
4 - Example
Java regular expression used to parse log message at Twitter circa 2010
^(\\w+\\s+\\d+\\s+\\d+:\\d+:\\d+)\\s+ ([^@]+?)@(\\s+)\\s+(\\S+):\\s+(\\S+)\\s+(\\S+) \\s+((?:\\S+?,\\s+)*(?:\\S+?))\\s+(\\S+)\\s+(\\S+) \\s+\\[([^\\]]+)\\]\\s+\"(\\w+)\\s+([^\"\\\\]* (?:\\\\.[^\"\\\\]*)*)\\s+(\\S+)\"\\s+(\\S+)\\s+ (\\S+)\\s+\"([^\"\\\\]*(?:\\\\.[^\"\\\\]*)*) \"\\s+\"([^\"\\\\]*(?:\\\\.[^\"\\\\]*)*)\"\\s* (\\d*-[\\d-]*)?\\s*(\\d+)?\\s*(\\d*\\.[\\d\\.]*)? (\\s+[-\\w]+)?.*$