# Lexical Analysis - (Token|Lexical unit|Lexeme|Symbol|Word)

A token is a symbol of the vocabulary of the language.

Each token is a single atomic unit of the language.

The token syntax is typically a regular language, so a finite state automaton constructed from a regular expression can be used to recognize it.
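As a minimal sketch of this idea, Python's `re` module compiles a regular expression into a matcher that plays the role of the finite state automaton; the integer-literal pattern and helper name below are illustrative, not part of any particular lexer:

```python
import re

# An integer-literal token is a regular language: one or more digits.
# re.compile turns the pattern into an automaton-like matcher.
INTEGER = re.compile(r"\d+")

def recognize_integer(text: str) -> bool:
    """Return True if the whole input is a single integer token."""
    return INTEGER.fullmatch(text) is not None
```

`fullmatch` requires the automaton to consume the entire input, which is exactly the "accept" condition for recognizing one token.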

The process of finding and categorizing tokens from an input stream is called “tokenizing” and is performed by a Lexer (Lexical analyzer).

A token is the result of breaking a document down into the atomic elements of the language.

## 3 - Lexeme Type

A token might be one of the following lexeme types:

  * an identifier
  * a keyword
  * an operator
  * a literal (integer, string, ...)
  * a separator (punctuation such as `;`)

Example: consider the following programming expression:

```
sum = 3 + 2;
```

It is tokenized as shown in the following table:

| Lexeme | Lexeme type         |
|--------|---------------------|
| `sum`  | Identifier          |
| `=`    | Assignment operator |
| `3`    | Integer literal     |
| `+`    | Addition operator   |
| `2`    | Integer literal     |
| `;`    | End of statement    |
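The table above can be produced mechanically. Below is a minimal sketch (not any particular library's API) of a regex-based lexer in Python; the lexeme-type names mirror the table, and the `TOKEN_SPEC` and `tokenize` names are illustrative assumptions:

```python
import re

# Token specification: (lexeme type, pattern). Order matters: at each
# position the first matching alternative wins.
TOKEN_SPEC = [
    ("Integer literal",     r"\d+"),
    ("Identifier",          r"[A-Za-z_]\w*"),
    ("Assignment operator", r"="),
    ("Addition operator",   r"\+"),
    ("End of statement",    r";"),
    ("Whitespace",          r"\s+"),
]

# Combine all patterns into one master regex with numbered named groups.
MASTER = re.compile(
    "|".join(f"(?P<T{i}>{p})" for i, (_, p) in enumerate(TOKEN_SPEC))
)

def tokenize(code):
    """Yield (lexeme, lexeme type) pairs, skipping whitespace."""
    for m in MASTER.finditer(code):
        name = TOKEN_SPEC[int(m.lastgroup[1:])][0]
        if name != "Whitespace":
            yield (m.group(), name)
```

Running `list(tokenize("sum = 3 + 2;"))` reproduces the rows of the table, in order.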