# Cryptography - Hash

A hash function is a mathematical function which converts/encrypt a large, possibly variable-sized amount of data into a small datum, usually a single integer that may serve as an index to an array.

The hash value is shorter and of fixed-length.

The input data is often called the message, and the output (the hash value or hash) is often called the message digest or simply the digest.

The values returned by a hash function are called :

• hash values,
• hash codes,
• hash sums,
• hash key,
• or simply hashes.

It's a one way function (you can't retrieve the input) but the output wil change for a slight change in the input.

Hash functions are used to:

• retrieve data

In database application, Hash functions are mostly used to speed up:

such as :

## 5 - Property

### 5.1 - Character Position

Position Independent: CRC16 and CRC32 (cyclical redundancy checksum, 16 bit and 32 bit). The hash of Nico or ociN will be the same. The position of the letter doesn't matter.

## 6 - Collision

An hash collision happens when two different inputs produce the same result. Hash collision are very similar to the Birthday_problem.

### 6.1 - Rate

If you look at two arbitrary values, the collision probability is only

$$2^{\frac{n}{2}} = 2^{-128} \text{(for 2 arbitrary value)}$$

Odds of a hash collision when you know the number of value to hash (Source)

## 7 - Distributed System

In distributed system, you hash because you don't want to request always the same server. If you hash by time for instance, you can have only one server that serves the most request (ie for the current time)

### 7.1 - Regular

Regular Hashing = Round Robin = Modular function

Assign M data keys to N servers assign each key to server = k mod N where:

• k is the key
• and N is the number of server

For instance:

• key 0 → server 0
• key 1 → server 1
• key 2 → server 2
• key 3 → server 0
• key 4 → server 1
• ….

If the number of servers increase from N to 2N, every existing key needs to be remapped, relocated.