Number - Compression

About

Columns of numerical values can often be compressed efficiently with two techniques:

  • bit packing
  • run-length encoding (RLE)

Method

Bit packing

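Bit packing exploits the fact that small integers do not need a full 32 or 64 bits: a value in the range 0 to 7, for example, fits in 3 bits. Several such values are packed into the space that a single 32- or 64-bit value would normally occupy. Implementations differ mainly in how they choose the bit width (for example, one width per block of values) and in how they handle occasional large values.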

Example:

  • Daniel Lemire’s JavaFastPFOR library
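
To make the idea concrete, here is a minimal sketch in Python, assuming non-negative integers and a single fixed bit width for the whole column. The function names `pack` and `unpack` are hypothetical and chosen for illustration only; this is not the API or the algorithm of JavaFastPFOR, which uses more elaborate block-based schemes.

```python
def pack(values, bit_width):
    """Pack non-negative integers into bytes, using bit_width bits per value."""
    acc = 0
    for i, v in enumerate(values):
        # Reject values that would overflow into a neighbour's bits.
        if v < 0 or v >> bit_width:
            raise ValueError(f"{v} does not fit in {bit_width} bits")
        # Place value i at bit offset i * bit_width.
        acc |= v << (i * bit_width)
    n_bytes = (len(values) * bit_width + 7) // 8  # round up to whole bytes
    return acc.to_bytes(n_bytes, "little")


def unpack(data, bit_width, count):
    """Recover count integers of bit_width bits each from packed bytes."""
    acc = int.from_bytes(data, "little")
    mask = (1 << bit_width) - 1
    return [(acc >> (i * bit_width)) & mask for i in range(count)]


values = [3, 0, 7, 5, 1, 6, 2, 4]
bit_width = max(values).bit_length()   # 3 bits are enough for values up to 7
packed = pack(values, bit_width)
print(len(packed))                     # 3 bytes instead of 32 bytes as 32-bit integers
print(unpack(packed, bit_width, len(values)) == values)
```

The decoder needs the bit width and the value count to read the data back, so a real format would store both alongside the packed bytes.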

Run-length encoding (RLE)

Run-length encoding replaces each “run” (several consecutive occurrences of the same value) with a pair of numbers: the value and the number of times it repeats. It pays off when the column contains long runs, and it can enlarge the data when most runs have length one.
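
The following Python sketch illustrates the (value, count) representation described above. The names `rle_encode` and `rle_decode` are hypothetical, and the list-of-tuples output is an assumption made for readability rather than a real storage layout.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal consecutive values into a (value, count) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


column = [7, 7, 7, 7, 2, 2, 9, 9, 9]
encoded = rle_encode(column)
print(encoded)                        # [(7, 4), (2, 2), (9, 3)]
print(rle_decode(encoded) == column)  # True
```

Here nine values shrink to three pairs. A column with no repeated neighbours would instead produce one pair per value, which is why run-length encoding suits sorted or low-cardinality columns best.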
