About
Simple linear regression in R with functions such as lm.
Steps
Linear Model
Unstandardized
Unstandardized Simple Regression
myFit=lm(response~predictor,data=data.frame)
# The tilde (~) reads as "is modeled as": response is modeled as a function of predictor.
myFit
Call:
lm(formula = response~predictor, data = data.frame)
Coefficients:
(Intercept)    predictor
      34.55        -0.95
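As a hedged illustration of where these two numbers come from, the sketch below fits a simple regression and compares the coefficients with the closed-form least-squares estimates: slope = cov(x, y) / var(x) and intercept = mean(y) - slope * mean(x). The built-in `cars` data set and the names `fit`, `b`, `slope` and `intercept` are assumptions made for the example; they are not the data or objects used above.

```r
# Assumption: the built-in `cars` data set stands in for the article's data.
fit <- lm(dist ~ speed, data = cars)

# Extract the estimated coefficients as a named numeric vector.
b <- coef(fit)

# Closed-form least-squares estimates for simple regression.
slope     <- cov(cars$speed, cars$dist) / var(cars$speed)
intercept <- mean(cars$dist) - slope * mean(cars$speed)

print(b)
print(c(intercept = intercept, slope = slope))
```

Both printed vectors should agree, which is a quick sanity check that lm is doing ordinary least squares.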
Standardized
Standardized regression analysis: both variables are converted to z-scores with scale() before fitting. In simple regression, the standardized regression coefficient equals the Pearson correlation between the predictor and the outcome.
modelz <- lm(scale(data$OutcomeVariable) ~ scale(data$PredictorVariable))
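The claim that the standardized slope equals the correlation can be checked numerically. This is a minimal sketch, assuming the built-in `cars` data set in place of the article's `data`; the names `beta` and `r` are introduced only for the example.

```r
# Assumption: the built-in `cars` data set stands in for the article's data.
modelz <- lm(scale(dist) ~ scale(speed), data = cars)

# Slope of the z-scored regression (second coefficient; the intercept is ~0).
beta <- unname(coef(modelz)[2])

# Pearson correlation between predictor and outcome.
r <- cor(cars$speed, cars$dist)

print(c(standardized_slope = beta, correlation = r))
```

The two printed values should be identical up to floating-point error.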
Model Attributes
attributes(myFit)
See R - Names
$names
[1] "coefficients" "residuals" "effects" "rank"
[5] "fitted.values" "assign" "qr" "df.residual"
[9] "xlevels" "call" "terms" "model"
$class
[1] "lm"
where:
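- coefficients are the estimated regression coefficients (intercept and slope)
- residuals are the residuals (observed response minus fitted values)
- effects are the orthogonal effects obtained from the QR decomposition used in the fit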
- …
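The components listed above can be read with `$`, but R also provides accessor functions for the most common ones. The sketch below is an illustration on the built-in `cars` data set (an assumption, not the article's data); the variable names are chosen for the example.

```r
# Assumption: the built-in `cars` data set stands in for the article's data.
fit <- lm(dist ~ speed, data = cars)

# Accessor functions for the main components of an lm object.
co  <- coef(fit)         # same values as fit$coefficients
res <- residuals(fit)    # same values as fit$residuals
fv  <- fitted(fit)       # same values as fit$fitted.values
dfr <- df.residual(fit)  # n minus the number of estimated coefficients

# Residuals are the observed response minus the fitted values.
manual_res <- cars$dist - unname(fv)

print(names(fit))
print(head(cbind(residual = unname(res), manual = manual_res)))
```

Accessor functions are generally preferable to `$` because they also work for other model classes.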
Summary
summary(myFit)
Call:
lm(formula = response ~ predictor, data = Boston)
Residuals:
    Min      1Q  Median      3Q     Max
-15.168  -3.990  -1.318   2.034  24.500
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 34.55384    0.56263   61.41   <2e-16 ***
predictor   -0.95005    0.03873  -24.53   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.216 on 504 degrees of freedom
Multiple R-squared: 0.5441, Adjusted R-squared: 0.5432
F-statistic: 601.6 on 1 and 504 DF, p-value: < 2.2e-16
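The numbers printed by summary() can also be pulled out of the summary object for further use. The following is a minimal sketch on the built-in `cars` data set (an assumption); it also checks two identities that hold with a single predictor: R-squared is the squared correlation, and the F-statistic is the square of the slope's t value (in the output above, (-24.53)^2 is about 601.7, which agrees with 601.6 up to rounding of the printed t value).

```r
# Assumption: the built-in `cars` data set stands in for the article's data.
fit <- lm(dist ~ speed, data = cars)
s   <- summary(fit)

coef_table <- s$coefficients   # matrix: Estimate, Std. Error, t value, Pr(>|t|)
r_squared  <- s$r.squared      # "Multiple R-squared"
sigma_hat  <- s$sigma          # "Residual standard error"
f_value    <- unname(s$fstatistic["value"])

# With a single predictor, R-squared is the squared correlation and
# the F-statistic is the squared t value of the slope.
t_slope <- coef_table["speed", "t value"]
print(c(r_squared = r_squared, cor_squared = cor(cars$speed, cars$dist)^2))
print(c(f_value = f_value, t_squared = t_slope^2))
```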
Confidence Interval
confint(myFit)
                2.5 %     97.5 %
(Intercept) 33.448457 35.6592247
predictor   -1.026148 -0.8739505
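For an lm fit, these intervals are the estimate plus or minus a t quantile times the standard error. In the output above, -0.95005 ± about 1.965 × 0.03873 reproduces the printed bounds for the slope up to rounding. The sketch below rebuilds the intervals by hand on the built-in `cars` data set (an assumption); `manual` is a name introduced for the example.

```r
# Assumption: the built-in `cars` data set stands in for the article's data.
fit <- lm(dist ~ speed, data = cars)

est   <- coef(fit)                        # point estimates
se    <- sqrt(diag(vcov(fit)))            # standard errors
tcrit <- qt(0.975, df = df.residual(fit)) # two-sided 95% critical value

# Estimate +/- critical value * standard error.
manual <- cbind(lower = est - tcrit * se, upper = est + tcrit * se)

print(manual)
print(confint(fit))
```

Other confidence levels are available through the `level` argument of confint, for example `confint(fit, level = 0.99)`.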
Prediction
predict(myFit,data.frame(predictor=c(5,10,15)),interval="confidence")
       fit      lwr      upr
1 29.80359 29.00741 30.59978
2 25.05335 24.47413 25.63256
3 20.30310 19.73159 20.87461
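The `interval` argument of predict also accepts "prediction". A confidence interval describes the uncertainty of the mean response at a given predictor value, while a prediction interval describes the uncertainty of a single new observation and is therefore wider. The sketch below compares the two on the built-in `cars` data set (an assumption); `new_data`, `conf_int` and `pred_int` are names chosen for the example.

```r
# Assumption: the built-in `cars` data set stands in for the article's data.
fit <- lm(dist ~ speed, data = cars)

# The column name in newdata must match the predictor name in the formula.
new_data <- data.frame(speed = c(5, 10, 15))

conf_int <- predict(fit, new_data, interval = "confidence")  # mean response
pred_int <- predict(fit, new_data, interval = "prediction")  # single new observation

print(conf_int)
print(pred_int)
```

Both matrices share the same `fit` column; only `lwr` and `upr` differ.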
Plot
The data
# the data
plot(response~predictor,data.frame)
# or
plot(data$OutcomeVariable ~ data$PredictorVariable, main = "Scatterplot", ylab = "OutcomeVariable", xlab = "PredictorVariable")
The regression line
# the fit line
abline(myFit,col="red")
The predicted scores
- Get the predicted scores as a new variable and plot them
# model is the fitted lm object, e.g. lm(OutcomeVariable ~ PredictorVariable, data = data)
data$PredictedScore <- fitted(model)
plot(data$OutcomeVariable ~ data$PredictedScore, main = "Scatterplot", ylab = "OutcomeVariable", xlab = "Predicted Scores")
abline(lm(data$OutcomeVariable ~ data$PredictedScore), col="blue")
The residuals
- Get the residuals and plot them
data$e <- resid(model)
hist(data$e)
plot(data$PredictedScore ~ data$e, main = "Scatterplot", ylab = "Predicted Scores", xlab = "Residuals")
abline(lm(data$PredictedScore ~ data$e), col="blue")
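In a least-squares fit with an intercept, the residuals sum to zero and are uncorrelated with the predicted scores, so the blue line in the last plot is expected to be flat. The sketch below checks both properties numerically on the built-in `cars` data set (an assumption); the column names `PredictedScore` and `e` follow the snippets above.

```r
# Assumption: the built-in `cars` data set stands in for the article's data.
data  <- cars
model <- lm(dist ~ speed, data = data)

data$PredictedScore <- fitted(model)
data$e              <- resid(model)

# Both quantities should be zero up to floating-point error.
resid_sum  <- sum(data$e)
resid_corr <- cor(data$PredictedScore, data$e)

# Slope of predicted scores on residuals, i.e. the slope of the blue line.
flat_slope <- unname(coef(lm(PredictedScore ~ e, data = data))[2])

print(c(resid_sum = resid_sum, resid_corr = resid_corr, flat_slope = flat_slope))
```

A visible trend or funnel shape in a residual plot would instead suggest that the linear model does not describe the data well.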