Statistics - (Variance|Dispersion|Mean Square) (MS)

1 - About

The variance measures how widely the individual values are spread around the average.

For an estimate, the variance is how much the estimate varies around its average.

It is a measure of consistency. A large variance means that the data are widely scattered, while a small variance means that most of the data lie close to the average.
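
As a rough illustration of this idea, the sketch below compares two made-up data sets that share the same average but differ in spread. The names `tight` and `spread` and their values are assumptions chosen for the example; `statistics.pvariance` is the population variance from the Python standard library.

```python
import statistics

# Two hypothetical data sets with the same average (10)
tight = [9, 10, 10, 10, 11]    # values close to the average
spread = [0, 5, 10, 15, 20]    # values far from the average

tight_variance = statistics.pvariance(tight)
spread_variance = statistics.pvariance(spread)

# The widely scattered data set has the much larger variance
print(tight_variance, spread_variance)
```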

3 - Formula

<MATH> \begin{array}{rrl} Variance & = & \frac{\displaystyle \sum_{i=1}^{\href{sample_size}{N}}{(\href{raw_score}{X}_i- \href{mean}{\bar{X}})^2}}{\displaystyle \href{degree_of_freedom}{\text{Degree of Freedom}}} \\ & = & \frac{\displaystyle \sum_{i=1}^{\href{sample_size}{N}}{(\href{Deviation Score}{\text{Deviation Score}}_i)^2}}{\displaystyle \href{degree_of_freedom}{\text{Degree of Freedom}}} \\ & = & (\href{Standard_Deviation}{\text{Standard Deviation}})^2 \end{array} </MATH>

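where:

  * N is the sample size
  * X_i is the raw score of the i-th individual
  * \bar{X} is the mean
  * X_i - \bar{X} is the deviation score of the i-th individual
  * the degree of freedom is N - 1 for a sample and N for a whole population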
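
The formula can be sketched step by step in Python. This is a minimal sketch, not a reference implementation: the function name `variance`, the parameter `ddof` (the amount subtracted from N to get the degree of freedom, a convention borrowed from NumPy) and the `scores` values are all assumptions made for the example.

```python
import math

def variance(values, ddof=1):
    """Sum of squared deviation scores divided by the degree of freedom (N - ddof)."""
    n = len(values)
    mean = sum(values) / n
    deviation_scores = [x - mean for x in values]
    return sum(d ** 2 for d in deviation_scores) / (n - ddof)

scores = [2, 4, 4, 4, 5, 5, 7, 9]              # hypothetical raw scores

sample_var = variance(scores)                  # degree of freedom = N - 1
population_var = variance(scores, ddof=0)      # degree of freedom = N
standard_deviation = math.sqrt(sample_var)     # variance = (standard deviation)^2

print(sample_var, population_var, standard_deviation ** 2)
```

With a small N, the choice of the degree of freedom changes the result noticeably; the gap shrinks as N grows.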

4 - Addition

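<MATH> Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) </MATH> where Cov(X, Y) is the covariance of X and Y. When X and Y are uncorrelated, the covariance is zero and the variances simply add.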
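
The identity can be checked numerically. The sketch below uses population (divide by N) variance and covariance on two small made-up lists; the helper names `mean`, `pvariance` and `pcovariance` and the data are assumptions for illustration only.

```python
import math

def mean(values):
    return sum(values) / len(values)

def pvariance(values):
    # Population variance: mean of the squared deviations
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def pcovariance(xs, ys):
    # Population covariance: mean of the products of paired deviations
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

x = [1, 2, 3, 4, 5]                      # hypothetical paired observations
y = [2, 1, 4, 3, 5]
total = [a + b for a, b in zip(x, y)]    # X + Y, element by element

lhs = pvariance(total)
rhs = pvariance(x) + pvariance(y) + 2 * pcovariance(x, y)

# Both sides agree up to floating-point rounding
print(math.isclose(lhs, rhs))
```

The same check works with the sample (N - 1) versions, as long as the variance and the covariance use the same denominator.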

5 - Computation

5.1 - Python

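units = [7, 10, 9, 4, 5, 6, 5, 6, 8, 4, 1, 6, 6]

def units_average(units):
    # Arithmetic mean; in Python 3, / is true (floating-point) division
    average = sum(units) / len(units)
    return average

def units_variance(units, average):
    # Accumulate the squared deviations from the average
    diff = 0
    for unit in units:
        diff += (unit - average) ** 2
    # Dividing by len(units) gives the population variance;
    # divide by len(units) - 1 instead for the sample variance
    return diff / len(units)

print(round(units_variance(units, units_average(units)), 3))
4.994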
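
As a cross-check, the Python standard library offers both denominators in the `statistics` module. The sketch below assumes Python 3 and reuses the same `units` list; `pvariance` divides by N, like the function above, while `variance` divides by N - 1.

```python
import statistics

units = [7, 10, 9, 4, 5, 6, 5, 6, 8, 4, 1, 6, 6]

population_variance = statistics.pvariance(units)   # divides by N
sample_variance = statistics.variance(units)        # divides by N - 1

print(round(population_variance, 3))
print(round(sample_variance, 3))
```

The sample variance is the slightly larger of the two, because the same sum of squared deviations is divided by a smaller number.
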
data_mining/variance.txt · Last modified: 2017/11/16 23:17 by gerardnico