Number - Compression

1 - About

Columns of numerical values can often be efficiently compressed using two approaches:

  • bit packing
  • run-length encoding (RLE)

3 - Method

3.1 - Bit packing

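Bit packing exploits the fact that small integers do not need a full 32 or 64 bits. If every value in a block fits in b bits, each value is stored in exactly b bits, so several values share the space normally occupied by a single one. There are multiple ways to do this; they differ mainly in how the bit width is chosen and how values that exceed it are handled.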

Example:

  • Daniel Lemire’s JavaFastPFOR library
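To make the idea concrete, here is a minimal sketch of fixed-width bit packing in Python. It is not the scheme used by JavaFastPFOR (which is considerably more elaborate); the function names `bit_width`, `pack` and `unpack` are hypothetical, and the sketch assumes non-negative integers and a single bit width for the whole column.

```python
def bit_width(values):
    """Smallest number of bits that can hold every value (at least 1)."""
    return max(1, max(values, default=0).bit_length())


def pack(values, width):
    """Pack non-negative integers into bytes, using `width` bits per value."""
    acc = 0
    for i, v in enumerate(values):
        if v < 0 or v >> width:
            raise ValueError(f"{v} does not fit in {width} bits")
        # Place value i at bit offset i * width.
        acc |= v << (i * width)
    nbytes = (len(values) * width + 7) // 8  # round up to whole bytes
    return acc.to_bytes(nbytes, "little")


def unpack(data, width, count):
    """Inverse of pack(): read `count` values of `width` bits each."""
    acc = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(acc >> (i * width)) & mask for i in range(count)]


values = [3, 7, 1, 0, 5, 2, 6, 4]
width = bit_width(values)      # 3 bits are enough for values up to 7
packed = pack(values, width)   # 8 values * 3 bits = 24 bits = 3 bytes
restored = unpack(packed, width, len(values))
```

Stored as 32-bit integers, the same eight values would occupy 32 bytes. The decoder needs the bit width and the value count, so a real format would store both alongside the packed bytes.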

3.2 - Run-length encoding (RLE)

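Run-length encoding replaces each “run” (consecutive occurrences of the same value) with a pair of numbers: the value and the number of times it is repeated. It pays off when the column contains long runs, for example sorted or low-cardinality data; on data with few repeats, the pairs can take more space than the original values.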
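The encoding described above can be sketched in a few lines of Python. This is an illustrative sketch only: the names `rle_encode` and `rle_decode` are hypothetical, and the runs are kept as a plain list of (value, count) tuples rather than in any particular on-disk format.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal consecutive values into (value, run_length)."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


column = [7, 7, 7, 7, 2, 2, 9]
encoded = rle_encode(column)   # [(7, 4), (2, 2), (9, 1)]
decoded = rle_decode(encoded)
```

Note that the single 9 still costs a full pair, which is why run-length encoding helps only when runs are long on average.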

4 - Documentation / Reference

data/type/number/compression.txt · Last modified: 2018/09/22 11:16 by gerardnico