Data Mining - Noise (Unwanted variation)

1 - About

Information from all past experience can be divided into two groups:

  • information that is relevant for the future (“signal”)
  • information that is irrelevant (“noise”)

In many cases the factors causing the unwanted variation are unknown and must be inferred from the data.

Noise can be seen as the result of:

Noise is usually estimated by calculating the prediction error: the part of an observation that the model does not explain.
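The idea of estimating noise through the prediction error can be sketched in Python as below. This is only an illustration under stated assumptions: the data is synthetic (a hypothetical signal y = 2 + 0.5x with Gaussian noise of standard deviation 1.0), the model is a simple least-squares line, and the helper name fit_line is made up for this example.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x (hypothetical helper)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Assumption: synthetic data, signal = 2 + 0.5 * x, noise ~ Normal(0, 1.0)
rng = random.Random(0)
xs = [float(i) for i in range(200)]
ys = [2.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]

# The fitted line is the estimate of the signal
intercept, slope = fit_line(xs, ys)

# Prediction error (residual) = observed value - predicted value
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# The spread of the residuals is an estimate of the noise level
noise_sd = statistics.stdev(residuals)

print(f"estimated slope: {slope:.3f}")
print(f"estimated noise standard deviation: {noise_sd:.3f}")
```

If the model captures the signal well, the residuals should look like the noise: centered on zero and with a spread close to the noise level used to generate the data.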

3 - Type

3.1 - Random

All information contains random noise that is related to the data collection process.

Example: GPS readings 'jump around', though they always remain within a few meters of the real position.
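The GPS example can be simulated with a minimal Python sketch. Everything here is an assumption made for illustration: positions are expressed in meters on a local flat grid, the true position is fixed at (0, 0), and the error on each axis is drawn uniformly from -3 m to +3 m. Real GPS error does not necessarily follow this distribution.

```python
import math
import random
import statistics

# Assumptions: local coordinates in meters, true position at the origin,
# per-axis error uniform in [-MAX_ERROR_M, +MAX_ERROR_M]
TRUE_X, TRUE_Y = 0.0, 0.0
MAX_ERROR_M = 3.0

rng = random.Random(42)

# Each reading is the true position plus a random error
readings = [
    (TRUE_X + rng.uniform(-MAX_ERROR_M, MAX_ERROR_M),
     TRUE_Y + rng.uniform(-MAX_ERROR_M, MAX_ERROR_M))
    for _ in range(1000)
]

# Distance of every single reading from the true position
errors = [math.hypot(x - TRUE_X, y - TRUE_Y) for x, y in readings]

# Averaging many readings cancels most of the random noise
avg_x = statistics.fmean(x for x, _ in readings)
avg_y = statistics.fmean(y for _, y in readings)
avg_error = math.hypot(avg_x - TRUE_X, avg_y - TRUE_Y)

print(f"largest single-reading error: {max(errors):.2f} m")
print(f"error of the averaged position: {avg_error:.2f} m")
```

Individual readings jump around inside the error bound, while the average of many readings lands much closer to the true position. This only works because the noise is random: it has no preferred direction, so the errors tend to cancel out.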

3.2 - Systematic

4 - Documentation / Reference

data_mining/noise.txt · Last modified: 2017/11/16 15:46 by gerardnico