R - Simple Linear Regression


1 - About

This page shows how to fit a simple linear regression (one response, one predictor) in R with functions such as lm.

3 - Steps

3.1 - Linear Model

3.1.1 - Unstandardized

Unstandardized Simple Regression

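# The tilde (~) reads "is modeled as": response is modeled as a function of predictor.
# data.frame stands for the name of your own data frame.
myFit <- lm(response ~ predictor, data = data.frame)
myFit
Call:
lm(formula = response ~ predictor, data = data.frame)

Coefficients:
(Intercept)    predictor  
      34.55        -0.95  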
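As a runnable illustration, the sketch below fits the same kind of model on R's built-in mtcars data set. The choice of mtcars, with mpg as the response and wt as the predictor, is an assumption made for this example; it is not the data behind the output above.

```r
# Minimal sketch: simple linear regression on the built-in mtcars data
# (mpg and wt are stand-ins for response and predictor).
fit <- lm(mpg ~ wt, data = mtcars)

# coef() returns a named numeric vector: the intercept and the slope
b <- coef(fit)
print(b)

# The slope is the expected change in mpg for a one-unit increase in wt
slope <- b[["wt"]]
intercept <- b[["(Intercept)"]]
```

In a simple regression the slope can also be computed by hand as cov(x, y) / var(x), which is a quick sanity check on the fit.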

3.1.2 - Standardized

The regression can also be run on standardized variables (z-scores). In simple regression, the standardized regression coefficient equals the Pearson correlation between the outcome and the predictor.

modelz <- lm(scale(data$OutcomeVariable) ~ scale(data$PredictorVariable))
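The claim that the standardized slope equals the correlation can be checked numerically. The sketch below again assumes the built-in mtcars data (mpg as outcome, wt as predictor) purely for illustration; zy, zx, and fitz are names introduced here.

```r
# z-scores as plain numeric vectors (scale() returns a one-column matrix)
zy <- as.numeric(scale(mtcars$mpg))
zx <- as.numeric(scale(mtcars$wt))

# Regression on the standardized variables
fitz <- lm(zy ~ zx)
beta <- coef(fitz)[["zx"]]            # standardized slope
r <- cor(mtcars$mpg, mtcars$wt)       # Pearson correlation

# The two agree up to floating-point error
print(c(beta = beta, r = r))
```

Because both variables are centered, the intercept of the standardized model is zero up to rounding.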

3.2 - Model Attributes

attributes(myFit)

See R - Names

$names
 [1] "coefficients"  "residuals"     "effects"       "rank"         
 [5] "fitted.values" "assign"        "qr"            "df.residual"  
 [9] "xlevels"       "call"          "terms"         "model"        

$class
[1] "lm"

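where, for example, coefficients holds the estimated intercept and slope, residuals the observed minus fitted values, fitted.values the predicted responses, and df.residual the residual degrees of freedom.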
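The components of an lm object can be read with the $ operator or with accessor functions. A minimal sketch, assuming the built-in mtcars data as a stand-in for the page's data frame:

```r
fit <- lm(mpg ~ wt, data = mtcars)

names(fit)          # the component names, as listed under $names
fit$coefficients    # same values as coef(fit)
fit$df.residual     # n - 2 for a simple regression with an intercept

# Observed response = fitted value + residual, row by row
recomposed <- unname(fit$fitted.values + fit$residuals)
head(recomposed)
```

The accessor functions coef(), resid(), and fitted() are generally preferred over $ because they also work for other model classes.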

3.3 - Summary

Summary Statistics

summary(myFit)
Call:
lm(formula = response ~ predictor, data = Boston)

Residuals:
    Min      1Q  Median      3Q     Max 
-15.168  -3.990  -1.318   2.034  24.500 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 34.55384    0.56263   61.41   <2e-16 ***
predictor   -0.95005    0.03873  -24.53   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6.216 on 504 degrees of freedom
Multiple R-squared:  0.5441,	Adjusted R-squared:  0.5432 
F-statistic: 601.6 on 1 and 504 DF,  p-value: < 2.2e-16
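The numbers printed by summary() can also be extracted programmatically. The following sketch assumes the built-in mtcars data (not the Boston data of the output above) and shows how a few of the printed quantities relate to each other:

```r
fit <- lm(mpg ~ wt, data = mtcars)
s <- summary(fit)

coef_table <- coef(s)   # matrix: Estimate, Std. Error, t value, Pr(>|t|)
r2 <- s$r.squared       # "Multiple R-squared"
sigma_hat <- s$sigma    # "Residual standard error"

# The t value is the estimate divided by its standard error
t_wt <- coef_table["wt", "Estimate"] / coef_table["wt", "Std. Error"]

# In simple regression, R-squared is the squared correlation
r2_check <- cor(mtcars$mpg, mtcars$wt)^2
```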

3.4 - Confidence Interval

Confidence Interval

confint(myFit)
                2.5 %     97.5 %
(Intercept) 33.448457 35.6592247
predictor   -1.026148 -0.8739505
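The interval returned by confint() is the estimate plus or minus a t quantile times the standard error. A sketch that rebuilds it by hand, assuming the built-in mtcars data for illustration (manual and builtin are names introduced here):

```r
fit <- lm(mpg ~ wt, data = mtcars)

est <- coef(fit)
se <- coef(summary(fit))[, "Std. Error"]

# Two-sided 95% interval: 0.975 quantile of t with the residual df
tcrit <- qt(0.975, df = fit$df.residual)

manual <- cbind(lower = est - tcrit * se, upper = est + tcrit * se)
builtin <- confint(fit, level = 0.95)

print(manual)
print(builtin)
```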

3.5 - Prediction

predict(myFit, data.frame(predictor = c(5, 10, 15)), interval = "confidence")
       fit      lwr      upr
1 29.80359 29.00741 30.59978
2 25.05335 24.47413 25.63256
3 20.30310 19.73159 20.87461
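predict() also accepts interval = "prediction", which gives an interval for a single new observation rather than for the mean response. The sketch below compares the two on the built-in mtcars data (an assumption for this example); note that the column name in the new data frame must match the predictor name used in the formula.

```r
fit <- lm(mpg ~ wt, data = mtcars)

# New predictor values; the column must be named like the predictor
new <- data.frame(wt = c(2.5, 3, 3.5))

conf_int <- predict(fit, newdata = new, interval = "confidence")
pred_int <- predict(fit, newdata = new, interval = "prediction")

# Same point estimates, but the prediction interval is wider
conf_width <- conf_int[, "upr"] - conf_int[, "lwr"]
pred_width <- pred_int[, "upr"] - pred_int[, "lwr"]
```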

3.6 - Plot

3.6.1 - The data

# the data
plot(response ~ predictor, data.frame)
# or
plot(data$OutcomeVariable ~ data$PredictorVariable, main = "Scatterplot", ylab = "OutcomeVariable", xlab = "PredictorVariable")
# the regression (fit) line
abline(myFit, col = "red")

3.6.2 - The predicted scores

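# model is assumed to be the fitted regression, e.g. lm(OutcomeVariable ~ PredictorVariable, data = data)
data$PredictedScore <- fitted(model)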
plot(data$OutcomeVariable ~ data$PredictedScore, main = "Scatterplot", ylab = "OutcomeVariable", xlab = "Predicted Scores")
abline(lm(data$OutcomeVariable ~ data$PredictedScore), col="blue")
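The line obtained by regressing the outcome on its own predicted scores always has intercept 0 and slope 1, so points close to it are well predicted. A small numerical check, assuming the built-in mtcars data as a stand-in (predicted and line are names introduced here):

```r
fit <- lm(mpg ~ wt, data = mtcars)
predicted <- fitted(fit)

# Regress the observed outcome on the predicted scores
line <- coef(lm(mtcars$mpg ~ predicted))
print(round(line, 8))   # intercept ~ 0, slope ~ 1
```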

3.6.3 - The residuals

data$e <- resid(model)
hist(data$e)
plot(data$PredictedScore ~ data$e, main = "Scatterplot", ylab = "Predicted Scores", xlab = "Residuals")
abline(lm(data$PredictedScore ~ data$e), col = "blue")
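In a least-squares fit with an intercept, the residuals average to zero and are uncorrelated with the predicted scores, which is why the line in the plot of predicted scores against residuals is flat. A sketch of that check, assuming the built-in mtcars data for illustration:

```r
fit <- lm(mpg ~ wt, data = mtcars)
e <- resid(fit)
predicted <- fitted(fit)

mean(e)             # zero up to rounding
cor(predicted, e)   # zero up to rounding

# Regressing predicted scores on residuals gives a flat line
# at the mean of the outcome
flat <- coef(lm(predicted ~ e))
print(flat)
```

A visible trend or funnel shape in such a plot would suggest that the linear model is missing structure in the data.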
lang/r/simple_regression.txt · Last modified: 2017/11/23 20:34 by gerardnico