Computer Language - (Compiler|Interpreter) - Language translator

1 - About

Computer languages are written in plain text. However, a computer executes only particular sequences of machine instructions. The transformation from plain-text source to instructions is called compilation and is performed by a compiler. Once a program's code is compiled, it has been turned into machine language.

The first compiler was written by Grace Hopper, in 1952, for the A-0 System language. The term compiler was coined by Hopper.

The translation program is called a compiler, and the text to be translated is called source text (or sometimes source code).

One of the first widely used compilers was for the language Fortran (formula translator), around 1956. The intricacy and complexity of the translation process could be reduced only by choosing a clearly defined, well-structured source language. This occurred for the first time in 1960 with the advent of the language Algol 60, which established the technical foundations of compiler design that are still valid today. For the first time, a formal notation was also used for the definition of the language's structure (Naur, 1960).

The translation process is now guided by the structure of the analysed text. The text is:

  • decomposed,
  • parsed into its components according to the given syntax.

For the most elementary components, their semantics is recognized. The meaning of the source text must be preserved by the translation.

When code does not need a compiler, it is run by an interpreter and is said to be interpreted. The translation is then typically done from top to bottom, line by line, every time the program is run.
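The line-by-line behaviour can be illustrated with a minimal sketch. The three-statement language below (set, add, print) is invented for this example and is not taken from any real interpreter.

```python
def interpret(source):
    """Run a tiny hypothetical language, one line at a time, top to bottom.

    Statements (invented for this sketch):
      set NAME NUMBER   store a number in a variable
      add NAME NUMBER   add a number to a variable
      print NAME        record the variable's value as output
    """
    variables = {}
    output = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        op, args = parts[0], parts[1:]
        # Each line is analysed again on every run: nothing is translated ahead of time.
        if op == "set":
            variables[args[0]] = int(args[1])
        elif op == "add":
            variables[args[0]] += int(args[1])
        elif op == "print":
            output.append(variables[args[0]])
        else:
            raise SyntaxError(f"line {line_number}: unknown statement {op!r}")
    return output


program = """
set x 40
add x 2
print x
"""
result = interpret(program)  # the list of printed values
```

An error on a later line is only detected when execution reaches it, which is one practical difference from a compiler that analyses the whole text first.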

3 - JIT (Just in Time Compilation)

A system implementing a JIT compiler typically analyses the code continuously while it is being executed and identifies the parts where the speedup gained from compilation would outweigh the overhead of compiling that code. JIT compilation combines the speed of compiled code with the flexibility of interpretation.
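The idea of compiling only "hot" code can be sketched as follows. This is a toy model under stated assumptions: the threshold value, the tuple-based expression tree and the class name JitFunction are all invented here, and Python's built-in compile() produces Python bytecode rather than machine code, so the sketch only mimics the control flow of a JIT.

```python
HOT_THRESHOLD = 3  # assumed value; real systems tune this from profiling data


def walk_tree(node, x):
    """Slow path: interpret the expression tree on every call."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "var":
        return x
    left, right = walk_tree(node[1], x), walk_tree(node[2], x)
    return left + right if kind == "add" else left * right


def to_source(node):
    """Translate the tree into Python source text (the 'compilation' step)."""
    kind = node[0]
    if kind == "num":
        return repr(node[1])
    if kind == "var":
        return "x"
    op = "+" if kind == "add" else "*"
    return f"({to_source(node[1])} {op} {to_source(node[2])})"


class JitFunction:
    """Interpret a tree until it has been called often enough, then compile it."""

    def __init__(self, tree):
        self.tree = tree
        self.calls = 0
        self.compiled = None

    def __call__(self, x):
        if self.compiled is not None:
            return self.compiled(x)  # fast path
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            # The code is now considered hot: pay the compilation cost once.
            code = compile(f"lambda x: {to_source(self.tree)}", "<jit>", "eval")
            self.compiled = eval(code)
        return walk_tree(self.tree, x)


# 2 * x + 1, written as a tree
f = JitFunction(("add", ("mul", ("var",), ("num", 2)), ("num", 1)))
values = [f(i) for i in range(5)]
```

The first calls go through the tree walker; from the call after the threshold is reached, the compiled function is used instead.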

4 - Translation (steps|pass)

The translation process essentially consists of the following parts:

  1. Lexical analysis (lexer): the sequence of characters of the source text is translated into tokens (symbols of the vocabulary of the language).
  2. Syntax analysis (parser): the sequence of tokens is transformed into a representation that directly mirrors the syntactic structure of the source. Checking: in addition to syntactic rules, the compatibility rules that define the language (types of operators and operands) are verified. This phase first builds a concrete syntax tree (CST, parse tree) and then transforms it into an abstract syntax tree (AST, syntax tree).
  3. Semantic analysis: the phase in which the compiler adds semantic information to the parse tree.
  4. Code generation: a sequence of instructions taken from the instruction set of the target machine is generated. In general this is the most involved part and is therefore split into several parts or passes.

The lexer identifies tokens that adhere to the grammar, and the parser makes sense of these tokens.
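As an illustration of the lexer's role, here is a minimal sketch built on Python's standard re module. The token names (NUMBER, IDENT, OP) and the small set of operators are assumptions chosen for the example, not part of any particular language.

```python
import re

# Assumed token classes for a small expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),     # whitespace separates tokens but is not a token itself
    ("MISMATCH", r"."),   # any other character is a lexical error
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Translate a sequence of characters into a list of (kind, value) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind, value = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {value!r} at offset {match.start()}")
        tokens.append((kind, value))
    return tokens


tokens = tokenize("total = price * 3")
```

Each token class is described by a regular expression, which is why a scanner is enough for this step and no tree structure is needed yet.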

Process            Input element                     Algorithm   Syntax         Syntactic analysis
Lexical analysis   Character                         Scanner     Regular        Word syntax
Syntax analysis    Symbol (usually called token)     Parser      Context free   Phrase syntax
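The translation steps can be sketched end to end for a very small expression language. Everything here is an assumption made for illustration: the grammar (numbers, +, * and parentheses), the tuple-based AST and the stack-machine opcodes PUSH, ADD and MUL are invented, and the tokenizer silently ignores characters it does not know.

```python
import re


def tokenize(text):
    # Lexical analysis: characters -> tokens (numbers and single-character symbols).
    return re.findall(r"\d+|[+*()]", text)


def parse(tokens):
    """Syntax analysis: tokens -> AST, for the assumed grammar
         expr   := term ('+' term)*
         term   := factor ('*' factor)*
         factor := NUMBER | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("add", node, term())
        return node

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("mul", node, factor())
        return node

    def factor():
        token = advance()
        if token == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is not None and token.isdigit():
            return ("num", int(token))
        raise SyntaxError(f"unexpected token {token!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


def generate(node):
    """Code generation: AST -> instructions for a hypothetical stack machine."""
    if node[0] == "num":
        return [("PUSH", node[1])]
    opcode = "ADD" if node[0] == "add" else "MUL"
    return generate(node[1]) + generate(node[2]) + [(opcode, None)]


def run(instructions):
    """Execute the generated instructions to check that the meaning was preserved."""
    stack = []
    for opcode, arg in instructions:
        if opcode == "PUSH":
            stack.append(arg)
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if opcode == "ADD" else left * right)
    return stack.pop()


code = generate(parse(tokenize("2 + 3 * 4")))
```

The parser is a recursive-descent parser, one of several possible parsing algorithms for a context-free grammar; the nesting of expr, term and factor is what gives * a higher precedence than +.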

5 - Implementation

5.1 - Lexer and parser generation

In simple cases, the lexer and the parser are generated automatically from the grammar file of the language by a compiler-compiler. In more complex cases, the generated code must be modified manually, or the lexer and parser are written by hand.

The lexical grammar and phrase grammar are usually context-free grammars, which simplifies analysis significantly, with context-sensitivity handled at the semantic analysis phase. The semantic analysis phase is generally more complex and written by hand, but can be partially or fully automated using attribute grammars.
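A hand-written semantic check might look like the following minimal sketch. The tuple-based AST, the two type names and the function name check are assumptions made for the example. The returned type behaves like a synthesized attribute in an attribute grammar: it is computed bottom-up from the children of each node.

```python
def check(node, symbols):
    """Return the type of an expression node, or raise an error.

    `symbols` maps declared variable names to a type name ("int" or "str").
    Whether a variable is declared and whether operand types agree are
    context-sensitive questions that the context-free grammar cannot express.
    """
    kind = node[0]
    if kind == "num":
        return "int"
    if kind == "str":
        return "str"
    if kind == "var":
        name = node[1]
        if name not in symbols:
            raise NameError(f"undeclared variable {name!r}")
        return symbols[name]
    if kind == "add":
        left, right = check(node[1], symbols), check(node[2], symbols)
        if left != right:
            raise TypeError(f"cannot add {left} and {right}")
        return left
    raise ValueError(f"unknown node kind {kind!r}")


declared = {"n": "int", "label": "str"}
result_type = check(("add", ("var", "n"), ("num", 1)), declared)
```

An expression such as label + 1 is syntactically valid in this sketch but is rejected here, because the two operand types differ.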

5.2 - Lexical Analysis and Parsing in one step

5.2.1 - Serially

Lexical analysis can be combined with the parsing step in scannerless parsing: parsing is then done at the character level, not the token level.
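A minimal sketch of scannerless parsing, assuming a made-up grammar of the form number ('+' number)*: there is no separate token list, and the parser itself consumes digits and whitespace character by character.

```python
DIGITS = "0123456789"


def parse_sum(text):
    """Parse and evaluate  number ('+' number)*  directly from characters."""
    pos = 0

    def skip_spaces():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def number():
        # What a lexer would do as a separate pass happens inline here.
        nonlocal pos
        skip_spaces()
        start = pos
        while pos < len(text) and text[pos] in DIGITS:
            pos += 1
        if start == pos:
            raise SyntaxError(f"expected a digit at offset {pos}")
        return int(text[start:pos])

    total = number()
    skip_spaces()
    while pos < len(text) and text[pos] == "+":
        pos += 1
        total += number()
        skip_spaces()
    if pos != len(text):
        raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
    return total


total = parse_sum("12 + 30+ 8")
```

The cost of this design is visible in the code: whitespace handling, which a lexer would normally hide, has to be repeated at every place where it may occur.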

5.2.2 - Concurrently

In processing computer languages, semantic processing generally comes after syntactic processing (parser), but in some cases semantic processing is necessary for complete syntactic analysis, and these are done together or concurrently.
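A well-known instance is a C statement such as `T * x;`, which declares a pointer if T names a type and is a multiplication otherwise. The sketch below only illustrates that dependency: the function classify and the set type_names are hypothetical, and the sketch handles nothing beyond statements of the form `A * B;`.

```python
def classify(statement, type_names):
    """Classify a statement of the form `A * B;` in a C-like language.

    `type_names` stands for the identifiers that semantic processing has
    recorded as type names so far (for example through typedef).
    """
    parts = statement.rstrip(";").split()
    if len(parts) != 3 or parts[1] != "*":
        raise SyntaxError("this sketch only handles statements of the form `A * B;`")
    left, _, right = parts
    if left in type_names:
        # Semantic knowledge decides the syntactic structure: a declaration.
        return ("declaration", f"pointer to {left}", right)
    return ("expression", "mul", left, right)


as_declaration = classify("T * x;", type_names={"T"})
as_expression = classify("T * x;", type_names=set())
```

The same characters yield two different syntax trees depending on the symbol table, so the parser cannot finish its work without information produced by semantic processing.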

5.3 - One pass

6 - Type

6.1 - Cross

A compiler which generates code for a computer different from the one executing the compiler is called a cross compiler. The generated code is then transferred to the device.

7 - Documentation / Reference

code/compiler/compiler.txt · Last modified: 2017/09/17 18:26 by gerardnico