Data Mining - Noise (Unwanted variation)

> (Statistics|Probability|Machine Learning|Data Mining|Data and Knowledge Discovery|Pattern Recognition|Data Science|Data Analysis)

1 - About

Information from all past experience can be divided into two groups:

  • information that is relevant for the future (“signal”)
  • information that is irrelevant (“noise”).

In many cases the factors causing the unwanted variation are unknown and must be inferred from the data.

Noise can be seen as the result of:

Noise is usually estimated by calculating the prediction error: the part of the observed values that a model fitted to the data does not explain.
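As a minimal sketch of estimating noise through the prediction error, the example below fits a straight line to synthetic data and takes the standard deviation of the residuals as the noise estimate. The data, the linear model, the value `TRUE_NOISE_STD`, and the helper names `fit_line` and `residual_std` are all assumptions made for illustration; they do not come from the original text.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def residual_std(xs, ys, a, b):
    """Standard deviation of the prediction errors (n - 2 degrees of freedom)."""
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sse = sum(r ** 2 for r in residuals)
    return (sse / (len(xs) - 2)) ** 0.5


# Synthetic data (assumption): signal = 3.0 + 1.5 * x, plus Gaussian noise.
rng = random.Random(0)
TRUE_NOISE_STD = 2.0
xs = [i / 10 for i in range(500)]
ys = [3.0 + 1.5 * x + rng.gauss(0.0, TRUE_NOISE_STD) for x in xs]

a, b = fit_line(xs, ys)
noise_estimate = residual_std(xs, ys, a, b)

print(f"fitted signal: y = {a:.2f} + {b:.2f} * x")
print(f"estimated noise std: {noise_estimate:.2f} (true value: {TRUE_NOISE_STD})")
```

The estimate is only as good as the model: if the fitted model misses part of the signal, that part ends up counted in the prediction error as well.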


3 - Type

3.1 - Random

All information contains random noise that is related to the data collection process.

Example: GPS readings 'jump around', though they always remain within a few meters of the real position.
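The GPS example can be simulated with a short sketch. It assumes a fixed true position in local metric coordinates and independent Gaussian errors on each axis; the names `TRUE_POSITION_M`, `NOISE_STD_M`, and `read_gps` are hypothetical, and the Gaussian model is an illustration rather than a description of how a real receiver behaves.

```python
import math
import random
import statistics

rng = random.Random(42)

TRUE_POSITION_M = (0.0, 0.0)  # hypothetical true position (east, north), in meters
NOISE_STD_M = 1.5             # assumed standard deviation of the error per axis


def read_gps(rng):
    """Return one simulated reading: the true position plus random noise."""
    east = TRUE_POSITION_M[0] + rng.gauss(0.0, NOISE_STD_M)
    north = TRUE_POSITION_M[1] + rng.gauss(0.0, NOISE_STD_M)
    return east, north


def distance_to_truth(point):
    """Euclidean distance in meters between a point and the true position."""
    return math.hypot(point[0] - TRUE_POSITION_M[0], point[1] - TRUE_POSITION_M[1])


readings = [read_gps(rng) for _ in range(1000)]

# Each individual reading is off by a meter or two ...
typical_error = statistics.fmean(distance_to_truth(r) for r in readings)

# ... but the errors point in random directions, so they largely cancel out
# when the readings are averaged.
mean_point = (
    statistics.fmean(r[0] for r in readings),
    statistics.fmean(r[1] for r in readings),
)
mean_error = distance_to_truth(mean_point)

print(f"average error of a single reading: {typical_error:.2f} m")
print(f"error of the averaged position:    {mean_error:.2f} m")
```

Because the noise has no preferred direction, averaging many readings gives a position much closer to the true one than a typical single reading.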

3.2 - Systematic

4 - Documentation / Reference