Statistics - Correlation Matrix


About

From a raw matrix to a correlation matrix.

A correlation matrix is a square, symmetric matrix used in statistics: entry (j, k) holds the correlation coefficient between variables j and k, so the diagonal is all ones.

Steps

Raw Matrix

The raw data matrix A has 3 columns (3 variables) and 8 rows (8 individuals):

<MATH> A_{ij} = \begin{bmatrix} 1 & 3 & 4 \\ 2 & 5 & 4 \\ 0 & 0 & 1 \\ 3 & 2 & 3 \\ 1 & 0 & 5 \\ 4 & 4 & 3 \\ 4 & 5 & 2 \\ 3 & 2 & 3 \\ \end{bmatrix} </MATH>
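
As a running sketch (assuming NumPy, which the page itself doesn't prescribe), the raw matrix can be entered as:

```python
import numpy as np

# Raw data matrix A: 8 rows (individuals) x 3 columns (variables)
A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

print(A.shape)  # (8, 3)
```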

Sum Matrix

<MATH> S_{1j} = 1_{1i} . A_{ij} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ \end{bmatrix} . \begin{bmatrix} 1 & 3 & 4 \\ 2 & 5 & 4 \\ 0 & 0 & 1 \\ 3 & 2 & 3 \\ 1 & 0 & 5 \\ 4 & 4 & 3 \\ 4 & 5 & 2 \\ 3 & 2 & 3 \\ \end{bmatrix} = \begin{bmatrix} 18 & 21 & 25 \\ \end{bmatrix} </MATH>
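
The same product can be checked in NumPy (a sketch; A is redefined so the snippet stands alone):

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

# A 1 x 8 row vector of ones times the 8 x 3 matrix yields the 1 x 3 column sums
ones_row = np.ones((1, A.shape[0]))
S = ones_row @ A
print(S.tolist())  # [[18.0, 21.0, 25.0]]
```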

Mean Vector

<MATH> Mv_{1j} = S_{1j} . N^{-1} = \begin{bmatrix} 18 & 21 & 25 \\ \end{bmatrix} . 8^{-1} = \begin{bmatrix} 2.25 & 2.62 & 3.12 \\ \end{bmatrix} </MATH>

where N is the number of rows. The exact means are 2.25, 2.625 and 3.125; they are rounded to two decimal places in what follows.
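
As a sketch, dividing the sum vector by N reproduces the mean vector:

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

N = A.shape[0]            # number of rows (individuals): 8
S = np.ones((1, N)) @ A   # column sums
Mv = S / N                # mean vector
print(Mv.tolist())  # [[2.25, 2.625, 3.125]]
```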

Mean Matrix

<MATH> Mm_{ij} = 1_{i1} . Mv_{1j} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ \end{bmatrix} . \begin{bmatrix} 2.25 & 2.62 & 3.12 \\ \end{bmatrix} = \begin{bmatrix} 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ \end{bmatrix} </MATH>
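
The mean matrix is the outer product of an 8 x 1 column of ones with the 1 x 3 mean vector, so every row is a copy of the means (NumPy sketch):

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

Mv = A.mean(axis=0, keepdims=True)   # 1 x 3 mean vector
ones_col = np.ones((A.shape[0], 1))  # 8 x 1 column of ones
Mm = ones_col @ Mv                   # 8 x 3 mean matrix
print(Mm[0].tolist())  # [2.25, 2.625, 3.125]
```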

Deviation Score Matrix

<MATH> D_{ij} = A_{ij}-Mm_{ij} = \begin{bmatrix} 1 & 3 & 4 \\ 2 & 5 & 4 \\ 0 & 0 & 1 \\ 3 & 2 & 3 \\ 1 & 0 & 5 \\ 4 & 4 & 3 \\ 4 & 5 & 2 \\ 3 & 2 & 3 \\ \end{bmatrix} - \begin{bmatrix} 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ \end{bmatrix} = \begin{bmatrix} \begin{array}{rrr} -1.25 & 0.38 & 0.88 \\ -0.25 & 2.38 & 0.88 \\ -2.25 & -2.62 & -2.12 \\ 0.75 & -0.62 & -0.12 \\ -1.25 & -2.62 & 1.88 \\ 1.75 & 1.38 & -0.12 \\ 1.75 & 2.38 & -1.12 \\ 0.75 & -0.62 & -0.12 \\ \end{array} \end{bmatrix} </MATH>
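
Subtracting the mean matrix centers each column, so the columns of D sum to zero (NumPy sketch; in practice broadcasting lets you write A - A.mean(axis=0) directly):

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

Mv = A.mean(axis=0, keepdims=True)
Mm = np.ones((A.shape[0], 1)) @ Mv
D = A - Mm                      # deviation score matrix
print(D[0].tolist())            # [-1.25, 0.375, 0.875]
print(D.sum(axis=0).tolist())   # [0.0, 0.0, 0.0]
```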

SS and SP Matrix

Sums of squares (SS) of the deviation scores and sums of cross products (SP).

<MATH> \begin{array}{rrccc} S_{jj} & = & {D_{ij}}^T & . & D_{ij} \\ S_{jj} & = & \begin{bmatrix} \begin{array}{rrrrrrrr} -1.25 & -0.25 & -2.25 & 0.75 & -1.25 & 1.75 & 1.75 & 0.75 \\ 0.38 & 2.38 & -2.62 & -0.62 & -2.62 & 1.38 & 2.38 & -0.62 \\ 0.88 & 0.88 & -2.12 & -0.12 & 1.88 & -0.12 & -1.12 & -0.12 \\ \end{array} \end{bmatrix} & . & \begin{bmatrix} \begin{array}{rrr} -1.25 & 0.38 & 0.88 \\ -0.25 & 2.38 & 0.88 \\ -2.25 & -2.62 & -2.12 \\ 0.75 & -0.62 & -0.12 \\ -1.25 & -2.62 & 1.88 \\ 1.75 & 1.38 & -0.12 \\ 1.75 & 2.38 & -1.12 \\ 0.75 & -0.62 & -0.12 \\ \end{array} \end{bmatrix} \\ S_{jj} & = & \begin{bmatrix} \begin{array}{rrr} 15.5 & 13.75 & -1.25 \\ 13.75 & 27.88 & 0.38 \\ -1.25 & 0.38 & 10.88 \\ \end{array} \end{bmatrix} & & \\ \end{array} </MATH>

Note that the transpose comes first: D is 8 x 3, so the product of the 3 x 8 transpose with D gives the 3 x 3 matrix above.

where:

  • the sums of squares of the deviation scores are on the diagonal
  • the sums of cross products are off the diagonal
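
The whole SS/SP computation collapses to one matrix product (NumPy sketch; the exact values are computed from the unrounded deviations):

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

D = A - A.mean(axis=0)  # deviation scores (via broadcasting)
SSSP = D.T @ D          # 3 x 3: sums of squares on the diagonal,
                        # sums of cross products off it
# exact values:
# [[ 15.5    13.75   -1.25 ]
#  [ 13.75   27.875   0.375]
#  [ -1.25    0.375  10.875]]
```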

Variance-Covariance Matrix

<MATH> VCoV_{jj} = S_{jj}. N^{-1} </MATH>
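
Dividing by N gives the population variance-covariance matrix. As a sketch, note that NumPy's np.cov defaults to the sample (N-1) normalization, so bias=True is needed to match:

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

N = A.shape[0]
D = A - A.mean(axis=0)
VCoV = (D.T @ D) / N
# matches NumPy's population (bias=True) covariance
assert np.allclose(VCoV, np.cov(A, rowvar=False, bias=True))
```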

Standard Deviation Matrix

<MATH> SD_{jj} = Diag(VCoV_{jj})^{\frac{1}{2}} </MATH>

where Diag keeps only the diagonal (the variances), so SD is a diagonal matrix whose non-zero entries are the standard deviations of the variables.
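
In code, take the diagonal of the variance-covariance matrix, square-root it, and place it back on a diagonal (sketch):

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

D = A - A.mean(axis=0)
VCoV = (D.T @ D) / A.shape[0]
SD = np.diag(np.sqrt(np.diag(VCoV)))  # diagonal matrix of standard deviations
```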

Correlation Matrix

Scaling the covariances by the standard deviations yields the correlation coefficients:

<MATH> R_{jj} = {SD_{jj}}^{-1}. VCoV_{jj}.{SD_{jj}}^{-1} </MATH>
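
Pre- and post-multiplying by the inverse of the standard deviation matrix rescales covariances into correlations. Because correlation is scale-free, the N vs. N-1 normalization drops out, and the result matches np.corrcoef (sketch):

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

D = A - A.mean(axis=0)
VCoV = (D.T @ D) / A.shape[0]
SD = np.diag(np.sqrt(np.diag(VCoV)))
SD_inv = np.linalg.inv(SD)
R = SD_inv @ VCoV @ SD_inv
# diagonal is all ones; off-diagonal entries are the pairwise correlations
assert np.allclose(np.diag(R), 1.0)
assert np.allclose(R, np.corrcoef(A, rowvar=False))
```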
