Number - Compression

1 - About

Columns of numerical values can often be efficiently compressed using two approaches:

  • bit packing
  • run-length encoding (RLE)

3 - Method

3.1 - Bit packing

Bit packing relies on the fact that small integers do not need a full 32 or 64 bits to be represented, so several values can be packed into the space normally occupied by a single value. There are multiple ways to do this. One available implementation:

  • Daniel Lemire’s JavaFastPFOR library
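
To make the idea concrete, here is a minimal sketch of fixed-width bit packing in Python. It is not how JavaFastPFOR works internally. It only illustrates the principle, and it assumes non-negative integers and a single bit width shared by the whole column. The function names (bit_width, pack, unpack) are hypothetical and chosen for this illustration.

```python
def bit_width(values):
    """Smallest number of bits able to hold every value (at least 1)."""
    return max(1, max(values, default=0).bit_length())


def pack(values, width):
    """Pack non-negative integers into bytes, using `width` bits per value."""
    acc = 0
    for i, v in enumerate(values):
        if v < 0 or v >> width:
            raise ValueError(f"{v} does not fit in {width} unsigned bits")
        # Place each value in its own `width`-bit slot of one big integer.
        acc |= v << (i * width)
    nbytes = (len(values) * width + 7) // 8  # round up to whole bytes
    return acc.to_bytes(nbytes, "little")


def unpack(data, width, count):
    """Recover `count` values of `width` bits each from packed bytes."""
    acc = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(acc >> (i * width)) & mask for i in range(count)]


values = [3, 7, 1, 0, 5, 2, 6, 4]
width = bit_width(values)      # 3 bits are enough for values up to 7
packed = pack(values, width)   # 8 values * 3 bits = 24 bits = 3 bytes
restored = unpack(packed, width, len(values))
```

In this sketch the eight values occupy 3 bytes instead of the 32 bytes they would take as 32-bit integers. The bit width and the value count have to be stored alongside the packed bytes so that the data can be unpacked later.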

3.2 - Run-length encoding (RLE)

Run-length encoding turns “runs” of the same value, meaning multiple occurrences of the same value in a row, into just a pair of numbers: the value, and the number of times it is repeated.
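
The description above can be sketched in a few lines of Python. This is an illustrative sketch rather than a production encoder, and the names rle_encode and rle_decode are hypothetical. It relies on itertools.groupby from the standard library, which groups consecutive equal elements.

```python
from itertools import groupby


def rle_encode(values):
    """Turn each run of equal consecutive values into a (value, count) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


column = [5, 5, 5, 2, 2, 9]
encoded = rle_encode(column)   # [(5, 3), (2, 2), (9, 1)]
decoded = rle_decode(encoded)  # same as column
```

RLE pays off when runs are long, for example in a sorted or low-cardinality column. When values rarely repeat, every value becomes a pair and the encoded form can be larger than the input.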

4 - Documentation / Reference