Statistics - R-squared (<math>R^2</math>|Coefficient of determination) for Model Accuracy


1 - About

<math>R^2</math> is an accuracy statistic used to assess a regression model. It summarizes how well the model fits the data in a single number.

<math>R^2</math> is the proportion of the variance in Y that is explained by the model. The higher the value, the better the fit.

For a given data set, the model with the largest R squared is the one with the smallest residual sum of squares.

R squared is also known as:

  • the fraction of variance explained
  • the proportion of the sum of squares explained
  • the coefficient of determination

It's a way to compare competing models.

R squared has a single definition that can be phrased in two equivalent ways:

  • R squared tells us the proportion of variance in the outcome measure that is explained by the predictors,
  • or: the predictors explain (R squared) percent of the variance in the outcome measure.

If R squared increases, the model fits the data better.

Example: if R squared increases after adding more predictors, we say that the model is boosted.
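The effect of adding a predictor can be sketched in Python. This is a minimal illustration on synthetic data, not a reference implementation: the variables `x1`, `x2`, the helper `fit_r2` and the simulated coefficients are all assumptions made for the example. Note that for a least squares fit evaluated on the data it was fitted to, R squared cannot decrease when a predictor is added, so a small increase alone does not show that the new predictor is useful.

```python
import numpy as np

# Synthetic data (assumed for illustration): y depends on x1 and x2 plus noise.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=n)


def fit_r2(predictors, y):
    """Fit a least squares model with an intercept and return its R squared."""
    # Design matrix: a column of ones (intercept) followed by the predictors.
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    rss = np.sum(residuals ** 2)          # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)     # total sum of squares
    return 1.0 - rss / tss


r2_one = fit_r2([x1], y)        # model with one predictor
r2_two = fit_r2([x1, x2], y)    # same model with a second predictor added
print(r2_one, r2_two)
```

The second value is never smaller than the first on the fitting data, which is why competing models with different numbers of predictors are usually compared with additional care.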

R squared gives the proportion of variance explained by a linear regression model fitted by least squares.


3 - Formula

<MATH> \begin{array}{rrl} R^2 & = & 1 - \frac{RSS}{TSS} \\ RSS & = & \sum^N_{i=1} (y_i - \hat{y}_i)^2 \\ TSS & = & \sum^N_{i=1} (y_i - \bar{y})^2 \\ \end{array} </MATH>
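The formula can be checked numerically with a short Python sketch. Here RSS is taken to be the residual sum of squares, the sum of squared differences between each observed value and the model's prediction, and TSS the total sum of squares around the mean of y. The function name `r_squared` and the small data set are assumptions chosen for illustration.

```python
import numpy as np


def r_squared(y, y_pred):
    """Return R^2 = 1 - RSS / TSS for observed values y and predictions y_pred."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rss = np.sum((y - y_pred) ** 2)      # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)    # total sum of squares
    return 1.0 - rss / tss


# Small illustrative data set (assumed values).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Least squares line through the data.
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept

r2 = r_squared(y, y_hat)
print(round(r2, 3))  # about 0.6: here RSS = 2.4 and TSS = 6
```

Predicting the mean of y for every observation gives RSS equal to TSS and therefore an R squared of 0, while perfect predictions give RSS of 0 and an R squared of 1.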

4 - Documentation / Reference