Statistics - Akaike information criterion (AIC)

Thomas Bayes

Statistics - Akaike information criterion (AIC)

About

AIC stands for Akaike Information Criterion.

Akaike is the name of the guy who came up with this idea.

AIC is a quantity that we can calculate for many different model types, not just linear models, but also classification model such logistic regression and so on.

Definition

The AIC criterion is defi ned for a large class of models fi t by maximum likelihood:

<MATH> AIC = -2 log L + 2 . d </MATH>

where:

  • L is the maximized value of the likelihood function for the estimated model.
  • d is the total # of parameters used in the model (regression coefficients + intercept)

Linear model

It turns out that in the case of a linear model with Gaussian errors, negative 2 log L is just equal to RSS over <math>\hat{\sigma}</math> squared

<MATH> 2 log L = \frac{\href{RSS}{RSS}}{\href{variance}{\hat{\sigma}}^2} </MATH>

where:

Then by plugging it in the aboe formula, we can see that AIC and Mallow's Cp are actually proportional to each other. They are the same thing for linear models.





Discover More
Thomas Bayes
Data Mining - (Test|Expected|Generalization) Error

Test error is the prediction error that we incur on new data. The test error is actually how well we'll do on future data the model hasn't seen. The test error is the average error that results from using...
Bed Overfitting
Machine Learning - (Overfitting|Overtraining|Robust|Generalization) (Underfitting)

A learning algorithm is said to overfit if it is: more accurate in fitting known data (ie training data) (hindsight) but less accurate in predicting new data (ie test data) (foresight) Ie the model...
Thomas Bayes
Statistics - Adjusted R^2

A big R squared indicates a model that really fits the data well. But unfortunately, you can't compare models of different sizes by just taking the one with the biggest R squared because you can't compare...
Thomas Bayes
Statistics - Bayesian Information Criterion (BIC)

BIC is like AIC and Mallow's Cp, but it comes from a Bayesian argument. The formulas are very similar. The formula calculate the residual sum of squares and then add anadjustment term which is...
Subset Selection Model Path
Statistics - Model Selection

Model selection is the task of selecting a statistical model from a set of candidate models through the use of criteria's Dimension reduction procedures generates and returns a sequence of possible...
Time Serie - Seasonality (Cycle detection)

Seasonality is a cycle in time serie. The season is used a discreet regression variable and code it as dummy variables). For instance: 1 if it's the season 0 if it's not the season If you...



Share this page:
Follow us:
Task Runner