Statistics - Standard Deviation (SD)

1 - About

The standard deviation measures the typical deviation from the mean in a distribution. More precisely, it is the square root of the average squared deviation from the mean.

If you have a mean of 95% and a standard deviation of 2% in a normal distribution, about 68% of the data lie between 93% and 97%, that is, within one standard deviation of the mean.
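The 68% figure can be checked numerically. The sketch below assumes Python 3.8 or later and uses statistics.NormalDist from the standard library; the variable names are illustrative only.

```python
from statistics import NormalDist

# Distribution from the example: mean 95, standard deviation 2
dist = NormalDist(mu=95, sigma=2)

# Share of the distribution between one SD below and one SD above the mean
within_one_sd = dist.cdf(97) - dist.cdf(93)
print(round(within_one_sd, 4))  # 0.6827
```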

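A larger sample should not systematically change the mean or the standard deviation; it only makes both estimates more stable. What shrinks with a larger sample is the standard error of the mean (the standard deviation divided by the square root of the sample size).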

In a normal distribution, only about 3 units in 1000 (roughly 0.27%) fall more than 3 standard deviations either side of the centre line.
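The 3-in-1000 figure follows from the two tails of the normal distribution. A minimal sketch, again assuming Python 3.8 or later and a standard normal distribution (mean 0, standard deviation 1):

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Two-sided tail: probability of landing more than 3 SDs from the centre
outside_3_sd = 2 * (1 - standard_normal.cdf(3))
per_thousand = outside_3_sd * 1000
print(round(per_thousand, 1))  # 2.7
```

The exact value is closer to 2.7 than to 3 units per 1000, and it only holds when the data are normally distributed.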

3 - Bias

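Unlike the sum of squares, the standard deviation does not grow with sample size, because the sum of squared deviations is divided by N. Note that dividing by N, as in the equation below, gives the population standard deviation; when estimating it from a sample, dividing by N - 1 is commonly used to reduce bias.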
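One way to read the statement on sample size is that the sum of squares grows as observations are added, while the standard deviation does not. The sketch below illustrates this with hypothetical scores repeated twice; the data and the helper sum_of_squares are assumptions made for the example, not part of the original page.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, N = 8
doubled = scores * 2               # same values repeated, N = 16


def sum_of_squares(values):
    """Sum of squared deviations from the mean."""
    mean = statistics.fmean(values)
    return sum((x - mean) ** 2 for x in values)


# The sum of squares doubles with the sample size ...
ss_small, ss_large = sum_of_squares(scores), sum_of_squares(doubled)
# ... while the population standard deviation (division by N) is unchanged
sd_small, sd_large = statistics.pstdev(scores), statistics.pstdev(doubled)

print(ss_small, ss_large)  # 32.0 64.0
print(sd_small, sd_large)  # 2.0 2.0
```

statistics.pstdev divides by N; statistics.stdev divides by N - 1 and would give a slightly larger value on the same data.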

4 - Equation

The standard deviation is the square root of the variance. Deviations from the mean always sum to zero, so they are squared before being averaged, which expresses the variance in squared units; taking the square root brings the result back to the units of the distribution.

<MATH> \begin{array}{rrl} \text{Standard Deviation (SD)} & = & \sqrt{\href{variance}{Variance}} \\ & = & \sqrt{\frac{\displaystyle \href{Deviation Score}{\text{Sum of Squared (SS)}}}{\displaystyle \href{sample_size}{N}}} \\ & = & \sqrt{\frac{\displaystyle \sum_{i=1}^{\href{sample_size}{N}}{(\href{Residual}{\text{Residual}})^2}}{\displaystyle \href{sample_size}{N}}} \\ & = & \sqrt{\frac{\displaystyle \sum_{i=1}^{\href{sample_size}{N}}{(\href{raw_score}{X}_i- \href{mean}{\bar{X}})^2}}{\displaystyle \href{sample_size}{N}}} \\ \end{array} </MATH>
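The last form of the equation can be sketched in Python as follows. The function name standard_deviation and the sample scores are hypothetical; the function divides by N as in the equation above and assumes a non-empty list of numbers.

```python
import math


def standard_deviation(scores):
    """Square root of the mean squared deviation from the mean (division by N)."""
    n = len(scores)
    mean = sum(scores) / n
    # Sum of squared residuals (X_i - mean)^2
    sum_of_squares = sum((x - mean) ** 2 for x in scores)
    return math.sqrt(sum_of_squares / n)


print(standard_deviation([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```

Here the mean is 5, the sum of squares is 32, the variance is 32 / 8 = 4, and the standard deviation is 2.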

5 - Computation


import math


def std_deviation(variance):
    # The standard deviation is the square root of the variance
    return math.sqrt(variance)
data_mining/standard_deviation.txt · Last modified: 2017/09/13 16:04 by gerardnico