Statistics - (Normalize|Standardize)

1 - About

(Normalize|Standardize) is a scale transformation applied to the distribution of a numeric variable in order to obtain one of the following:

  • zero as mean (centering)
  • the max and the min of the distribution mapped into a given numeric range (normally between 0 and 1, sometimes between -1 and 1)
  • zero mean and unit variance (See Z score)

By normalizing to a range, you bring your data (variable) between 0 and 1, so that the calculations aren't skewed by some attribute that happens to be on a much larger scale than the others.
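
The range normalization described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `min_max_scale`, its parameters, and the `incomes` values are made up for the example.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable has no spread, so there is nothing to rescale.
        raise ValueError("cannot rescale a constant variable")
    span = hi - lo
    return [new_min + (v - lo) * (new_max - new_min) / span for v in values]


# Hypothetical attribute on a large scale (values invented for illustration).
incomes = [20_000, 35_000, 50_000, 120_000]

scaled_01 = min_max_scale(incomes)              # target range [0, 1]
scaled_pm1 = min_max_scale(incomes, -1.0, 1.0)  # target range [-1, 1]
```

The smallest observation always lands on the lower bound and the largest on the upper bound, so a single extreme value compresses all the others toward one end of the range.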


3 - Variables

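To standardize a variable, you take each value of the variable, subtract the mean of that variable, and divide the result by the standard deviation of that variable over all the observations. Dividing by the standard deviation alone gives unit variance without centering the variable on zero. See Statistics - Z Score (Zero Mean) or Standard Score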
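
As a rough sketch of this computation, the snippet below standardizes a small list with Python's standard `statistics` module. The function name `standardize` and the sample data are assumptions made for the example. It uses the population standard deviation (`pstdev`); `statistics.stdev` would be the choice if the observations are treated as a sample.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return the z-scores of values: (value - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation over all observations
    if sigma == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - mu) / sigma for v in values]


# Hypothetical observations (values invented for illustration).
heights_cm = [150, 160, 170, 180, 190]
z_scores = standardize(heights_cm)

# The standardized variable has mean 0 and standard deviation 1
# (up to floating-point rounding).
z_mean = mean(z_scores)
z_sd = pstdev(z_scores)
```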

Standardizing makes variables and their coefficients comparable even when the original variables are measured in different units.
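
One way to see why the unit stops mattering: the same measurements expressed in two different units give the same standardized values. The sketch below checks this on made-up height data; the helper `standardize` and the variable names are illustrative assumptions.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return the z-scores of values: (value - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


# The same hypothetical heights in centimetres and in metres.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
heights_m = [h / 100 for h in heights_cm]

z_cm = standardize(heights_cm)
z_m = standardize(heights_m)

# Largest difference between the two sets of z-scores; it is zero
# apart from floating-point rounding.
max_gap = max(abs(a - b) for a, b in zip(z_cm, z_m))
```

Because the unit cancels out in the division by the standard deviation, a coefficient fitted on a standardized variable is expressed per standard deviation of that variable rather than per centimetre or per metre.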