Statistics - Adjusted R^2

1 - About

A large R squared indicates a model that fits the data well. Unfortunately, you can't compare models of different sizes by simply picking the one with the largest R squared. For instance, the R squared of a model with three variables is not comparable to the R squared of a model with eight variables, because on the training data a model with more variables always fits at least as well. The adjusted R squared tries to fix this issue.

With adjusted R squared, we pay a price for having a large model, unlike the classical R squared, where we pay no price for having a large model with a lot of features.
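To make the problem concrete, here is a minimal sketch (simulated data, not taken from any real study) that fits nested least squares models of growing size and prints the plain R squared. The helper `r_squared`, the seed and the data-generating model are all assumptions chosen for illustration; only the first two of the eight predictors actually influence the response.

```python
import numpy as np

def r_squared(X, y):
    """Plain R squared of a least squares fit of y on X plus an intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])           # prepend the intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least squares coefficients
    rss = np.sum((y - A @ coef) ** 2)              # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)              # total sum of squares
    return 1.0 - rss / tss

# Simulated data (assumption): 8 candidate predictors, only 2 are relevant.
rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 8))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n)

# Nested models: the first d columns of X, for d = 1..8.
r2_by_size = [r_squared(X[:, :d], y) for d in range(1, 9)]
for d, r2 in enumerate(r2_by_size, start=1):
    print(f"d={d}  R^2={r2:.4f}")
```

The printed R squared never goes down as d grows, even though predictors 3 to 8 are pure noise, which is why the largest R squared cannot be used to choose between models of different sizes.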

3 - Formula

A large value of adjusted R squared indicates a model with a small test error.

In order to compare two models of different sizes, adjusted R squared makes you pay a price for having a large model. It adjusts the R squared so that the values you get are comparable even if the numbers of predictors differ. It does this by adding a denominator to both RSS and TSS in the ratio below: RSS is divided by n - d - 1 and TSS by n - 1.

For a least squares model with d variables, the adjusted R squared statistic is calculated as

<MATH> \text{Adjusted }R^2 = 1 - \frac{ \displaystyle \frac{RSS}{n-d-1} } { \displaystyle \frac{TSS}{n-1} } </MATH>


  • RSS is the residual sum of squares
  • TSS is the total sum of squares
  • d is the number of variables used in the model
  • n is the sample size (the number of observations)
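
Since the plain R squared is 1 - RSS/TSS, the formula above can also be written in terms of R squared. This is a standard algebraic rearrangement rather than something stated in the original notes:

<MATH> \text{Adjusted }R^2 = 1 - (1 - R^2) \, \frac{n-1}{n-d-1} </MATH>

The factor (n-1)/(n-d-1) is greater than 1 for any d ≥ 1 and grows with d, so the unexplained share 1 - R^2 is inflated more and more as variables are added.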

When d is large, the denominator n - d - 1 is small. You're dividing the RSS by a small number, so RSS / (n - d - 1) is inflated, and you end up with a smaller adjusted R squared unless the extra variables reduce the RSS enough to compensate.
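
As a rough illustration of this penalty, the sketch below codes the formula directly and applies it to two made-up models. The function name `adjusted_r_squared` and all the numbers (n, TSS, the two RSS values) are hypothetical, picked only so that the arithmetic is easy to follow.

```python
def adjusted_r_squared(rss, tss, n, d):
    """Adjusted R squared for a least squares model with d variables.

    rss: residual sum of squares, tss: total sum of squares,
    n: sample size, d: number of variables (intercept not counted).
    """
    if n - d - 1 <= 0:
        raise ValueError("need n > d + 1 so that n - d - 1 is positive")
    return 1.0 - (rss / (n - d - 1)) / (tss / (n - 1))

# Hypothetical numbers: same data (n = 50, TSS = 100), two model sizes.
n, tss = 50, 100.0

# Model with 3 variables: RSS = 40, so plain R squared = 0.60.
small = adjusted_r_squared(rss=40.0, tss=tss, n=n, d=3)

# Model with 8 variables: RSS = 39, so plain R squared = 0.61.
large = adjusted_r_squared(rss=39.0, tss=tss, n=n, d=8)

print(f"3 variables: adjusted R^2 = {small:.4f}")  # about 0.5739
print(f"8 variables: adjusted R^2 = {large:.4f}")  # about 0.5339
```

The eight-variable model has the higher plain R squared (0.61 against 0.60), but the five extra variables only lower the RSS from 40 to 39, which does not pay for the smaller denominator, so its adjusted R squared is lower.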

4 - Advantage

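Compared to other criteria such as Cp, AIC and BIC, adjusted R squared doesn't require an estimate of sigma squared (the error variance). So you can also apply it when p is bigger than n, as long as each model being compared uses d < n - 1 variables so that n - d - 1 stays positive.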

That's a really nice advantage of adjusted R squared.

The drawback is that adjusted R squared doesn't really generalize to other types of models such as logistic regression.

data_mining/adjusted_r_squared.txt · Last modified: 2016/06/21 11:35 by gerardnico