R - Simple Linear Regression

Card Puncher Data Processing

About

This page shows how to fit a simple linear regression (one response, one predictor) in R with functions such as lm.

Steps

Linear Model

Unstandardized

Unstandardized Simple Regression

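myFit = lm(response ~ predictor, data = data.frame)
# The tilde (~) reads as "is modeled as": response is modeled as a function of predictor.
# data.frame is a placeholder for the name of your data frame.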
myFit
Call:
lm(formula = response~predictor, data = data.frame)

Coefficients:
(Intercept)    predictor
      34.55        -0.95
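With a single predictor, the two coefficients have a closed form: the slope is the covariance of predictor and response divided by the variance of the predictor, and the intercept makes the line pass through the two means. The sketch below checks this against lm. It assumes the built-in mtcars data set as a stand-in for the page's data, with mpg as the response and wt as the predictor.

```r
# Assumption: mtcars (shipped with R) stands in for the page's data frame;
# mpg plays the role of the response and wt the role of the predictor.
fit <- lm(mpg ~ wt, data = mtcars)

# Closed-form least-squares estimates for one predictor
slope     <- cov(mtcars$wt, mtcars$mpg) / var(mtcars$wt)
intercept <- mean(mtcars$mpg) - slope * mean(mtcars$wt)

# The two rows should agree up to floating-point error
rbind(lm = coef(fit), by_hand = c(intercept, slope))
```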

Standardized

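A standardized regression is run on the z scale: the response and the predictor are both rescaled to mean 0 and standard deviation 1 before fitting. In simple regression, the standardized regression coefficient equals the correlation coefficient between the predictor and the response.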

modelz <- lm(scale(data$OutcomeVariable) ~ scale(data$PredictorVariable))
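The equality between the standardized slope and the correlation can be checked directly. This is a minimal sketch that assumes the built-in mtcars data set in place of the page's data. The helper z is hypothetical and only computes z-scores as a plain vector.

```r
# Assumption: mtcars stands in for the page's data frame.
# Hypothetical helper: z-scores returned as a plain numeric vector.
z <- function(v) (v - mean(v)) / sd(v)

d <- data.frame(zy = z(mtcars$mpg), zx = z(mtcars$wt))
fit_z <- lm(zy ~ zx, data = d)

std_slope <- unname(coef(fit_z)["zx"])   # standardized regression coefficient
r <- cor(mtcars$wt, mtcars$mpg)          # Pearson correlation

all.equal(std_slope, r)                  # the two values should match
```

The intercept of the standardized fit is zero up to rounding error, because both variables are centered.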

Model Attributes

attributes(myFit)

See R - Names

$names
 [1] "coefficients"  "residuals"     "effects"       "rank"         
 [5] "fitted.values" "assign"        "qr"            "df.residual"  
 [9] "xlevels"       "call"          "terms"         "model"        

$class
[1] "lm"

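where each name is a component of the fitted lm object, which is a list, and can be read with the $ operator, for example myFit$coefficients.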
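Several of the components listed above also have accessor functions, which are usually preferred over reaching into the list. A short sketch, assuming the built-in mtcars data set in place of the page's data:

```r
# Assumption: mtcars stands in for the page's data frame.
fit <- lm(mpg ~ wt, data = mtcars)

names(fit)            # the component names shown under $names
coef(fit)             # same values as fit$coefficients
head(resid(fit))      # same values as fit$residuals
head(fitted(fit))     # same values as fit$fitted.values
df.residual(fit)      # same value as fit$df.residual
```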

Summary

Summary Statistics

summary(myFit)
Call:
lm(formula = response ~ predictor, data = Boston)

Residuals:
    Min      1Q  Median      3Q     Max 
-15.168  -3.990  -1.318   2.034  24.500 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 34.55384    0.56263   61.41   <2e-16 ***
predictor   -0.95005    0.03873  -24.53   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6.216 on 504 degrees of freedom
Multiple R-squared:  0.5441,	Adjusted R-squared:  0.5432 
F-statistic: 601.6 on 1 and 504 DF,  p-value: < 2.2e-16
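The numbers in the printed summary are related to each other. Each t value is the estimate divided by its standard error (above, -0.95005 / 0.03873 is about -24.53). In simple regression, R-squared is the squared correlation and the F-statistic is the squared t value of the slope. The sketch below pulls these values out of the summary object and checks the relations. It assumes the built-in mtcars data set in place of the page's data.

```r
# Assumption: mtcars stands in for the page's data frame.
fit <- lm(mpg ~ wt, data = mtcars)
s <- summary(fit)

s$coefficients     # matrix with Estimate, Std. Error, t value, Pr(>|t|)
s$sigma            # residual standard error
s$r.squared        # Multiple R-squared
s$adj.r.squared    # Adjusted R-squared
s$fstatistic       # F value with numerator and denominator df

# t value = Estimate / Std. Error
t_by_hand <- s$coefficients[, "Estimate"] / s$coefficients[, "Std. Error"]

# In simple regression, R-squared is the squared correlation
r2_by_hand <- cor(mtcars$wt, mtcars$mpg)^2
```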

Confidence Interval

Confidence Interval

confint(myFit)
                2.5 %     97.5 %
(Intercept) 33.448457 35.6592247
predictor   -1.026148 -0.8739505
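For an lm fit, these intervals are the estimate plus or minus a t quantile times the standard error. With 504 residual degrees of freedom the 97.5 % quantile is about 1.965, so the intercept interval above is roughly 34.55384 ± 1.965 × 0.56263. The sketch below reproduces confint by hand, assuming the built-in mtcars data set in place of the page's data.

```r
# Assumption: mtcars stands in for the page's data frame.
fit <- lm(mpg ~ wt, data = mtcars)

est    <- coef(fit)
se     <- sqrt(diag(vcov(fit)))            # standard errors of the coefficients
t_crit <- qt(0.975, df.residual(fit))      # two-sided 95 % critical value

ci_by_hand <- cbind(lower = est - t_crit * se,
                    upper = est + t_crit * se)

ci_by_hand
confint(fit)    # should show the same numbers
```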

Prediction

predict(myFit, data.frame(predictor = c(5, 10, 15)), interval = "confidence")
       fit      lwr      upr
1 29.80359 29.00741 30.59978
2 25.05335 24.47413 25.63256
3 20.30310 19.73159 20.87461
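interval = "confidence" gives an interval for the mean response at each predictor value. predict also accepts interval = "prediction", which gives an interval for a single new observation and is therefore wider. A minimal comparison, assuming the built-in mtcars data set in place of the page's data:

```r
# Assumption: mtcars stands in for the page's data frame.
fit <- lm(mpg ~ wt, data = mtcars)
new_data <- data.frame(wt = c(2.5, 3.0, 3.5))

conf_int <- predict(fit, new_data, interval = "confidence")
pred_int <- predict(fit, new_data, interval = "prediction")

# Same fitted values, different interval widths
cbind(fit        = conf_int[, "fit"],
      conf_width = conf_int[, "upr"] - conf_int[, "lwr"],
      pred_width = pred_int[, "upr"] - pred_int[, "lwr"])
```

The column names of the new data frame must match the predictor names used in the formula, otherwise predict cannot find the variable.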

Plot

The data

# the data
plot(response ~ predictor, data.frame)
# or
plot(data$OutcomeVariable ~ data$PredictorVariable, main = "Scatterplot", ylab = "OutcomeVariable", xlab = "PredictorVariable")

The regression line

# the fit line
abline(myFit, col = "red")

The predicted scores

# model is the lm fit of OutcomeVariable on PredictorVariable
data$PredictedScore = fitted(model)
plot(data$OutcomeVariable ~ data$PredictedScore, main = "Scatterplot", ylab = "OutcomeVariable", xlab = "Predicted Scores")
abline(lm(data$OutcomeVariable ~ data$PredictedScore), col="blue")
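The blue line in this plot always has intercept 0 and slope 1: regressing the outcome on its own least-squares predictions returns the identity line. A quick numeric check, assuming the built-in mtcars data set in place of the page's data:

```r
# Assumption: mtcars stands in for the page's data frame.
fit <- lm(mpg ~ wt, data = mtcars)
d <- transform(mtcars, PredictedScore = fitted(fit))

# Regress the outcome on the predicted scores
check <- lm(mpg ~ PredictedScore, data = d)
coef(check)   # intercept close to 0, slope close to 1
```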

The residuals

data$e <- resid(model)
hist(data$e)
plot(data$PredictedScore ~ data$e, main = "Scatterplot", ylab = "Predicted Scores", xlab = "Residuals")
abline(lm(data$PredictedScore ~ data$e), col = "blue")
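In a least-squares fit with an intercept, the residuals average to zero and are uncorrelated with both the predictor and the predicted scores, so the blue line in the last plot is flat. The sketch below checks these properties numerically, assuming the built-in mtcars data set in place of the page's data.

```r
# Assumption: mtcars stands in for the page's data frame.
fit <- lm(mpg ~ wt, data = mtcars)
e <- resid(fit)

mean_e        <- mean(e)               # zero up to rounding error
cor_fitted    <- cor(fitted(fit), e)   # zero up to rounding error
cor_predictor <- cor(mtcars$wt, e)     # zero up to rounding error

c(mean_e, cor_fitted, cor_predictor)
```

A visible trend or funnel shape in the residual plot therefore points to a problem with the model (non-linearity or non-constant variance), not to a correlation that the fit left behind.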




