# Statistics - Mallows's Cp

## 3 - Formula

$$C_p = \frac{1}{n}\left(RSS + 2d\hat{\sigma}^2\right)$$ where:

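• $n$ is the number of observations
• $RSS$ is the residual sum of squares of the fitted model
• $d$ is the total number of parameters used
• $\hat{\sigma}^2$ is an estimate of the variance of the error associated with each response measurement (i.e. each error $\epsilon$ in the linear model)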

## 4 - Restrictions

$C_p$ is restricted to cases where n is larger than p. If p is larger than n, the full model (i.e. with all p predictors) is not well defined: the least squares fit is not unique and the training error is zero, so $\hat{\sigma}^2$ cannot be estimated from it. Even if p is merely close to n, there is a problem because the estimate of $\hat{\sigma}^2$ may be far too low.
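
The restriction can be checked before computing $C_p$, as in the sketch below. It assumes that $\hat{\sigma}^2$ is estimated from the full model as $RSS_{full} / (n - p - 1)$, i.e. a least squares fit with an intercept. The page does not state this choice, so treat it as an assumption. The function names and the numbers are hypothetical.

```python
def residual_df(n, p):
    """Residual degrees of freedom of the full model (p predictors + intercept)."""
    return n - p - 1


def estimate_sigma2(rss_full, n, p):
    """Estimate the error variance from the full model, if that is possible.

    Assumption: sigma2_hat = RSS_full / (n - p - 1).
    """
    df = residual_df(n, p)
    if df <= 0:
        # No residual degrees of freedom remain, so the estimate is undefined.
        raise ValueError(
            f"cannot estimate sigma^2: n={n}, p={p} leaves {df} residual "
            "degrees of freedom"
        )
    return rss_full / df


# Hypothetical case with n comfortably larger than p.
print(estimate_sigma2(178.0, n=100, p=10))

# Hypothetical case with p larger than n: the estimate is refused.
try:
    estimate_sigma2(0.0, n=100, p=150)
except ValueError as err:
    print("rejected:", err)
```

A guard like this only catches the case where the estimate is undefined. When p is close to n, the function still returns a value, which should be treated with caution for the reason given above.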

data_mining/cp.txt · Last modified: 2014/04/05 21:51 by gerardnico