# R - Multiple Linear Regression

Multiple linear regression in R with functions such as `lm`.

## 3 - Steps

### 3.1 - Linear Model

#### 3.1.1 - Unstandardized

```r
# `data.frame` is a placeholder for the name of your own data frame
myFit = lm(response ~ predictor1 + predictor2, data = data.frame)
myFit
```
```
Call:
lm(formula = response ~ predictor1 + predictor2, data = data.frame)

Coefficients:
(Intercept)   predictor1   predictor2
   33.22276     -1.03207      0.03454
```
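As a runnable counterpart to the template above, here is a minimal sketch that fits the same kind of model on `mtcars`, a data set that ships with R. The data set and the variable names (`mpg`, `wt`, `hp`) are assumptions chosen for illustration; they are not the data behind the output shown above.

```r
# mtcars (datasets package) stands in for your own data frame.
# Model: miles per gallon explained by weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Printing the object shows the call and the estimated coefficients
print(fit)

# coef() returns the coefficients as a named numeric vector
coef(fit)
```

Each slope is the expected change in the response for a one-unit change in that predictor, holding the other predictor fixed.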

#### 3.1.2 - Standardized

To obtain standardized coefficients, fit the regression on variables converted to the z scale (mean 0, standard deviation 1) with `scale`.

`modelz <- lm(scale(data$OutcomeVariable) ~ scale(data$PredictorVariable1) + scale(data$PredictorVariable2))`
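The relationship between the two fits can be checked numerically. The sketch below, which assumes the built-in `mtcars` data and illustrative variable names, compares the coefficients of a z-scored fit with the unstandardized slopes rescaled by `sd(x) / sd(y)`.

```r
# Unstandardized and standardized fits on the same (illustrative) data
fit_raw <- lm(mpg ~ wt + hp, data = mtcars)
fit_z   <- lm(scale(mpg) ~ scale(wt) + scale(hp), data = mtcars)

# A standardized slope equals the raw slope times sd(predictor) / sd(response)
b           <- coef(fit_raw)[c("wt", "hp")]
beta_manual <- b * c(sd(mtcars$wt), sd(mtcars$hp)) / sd(mtcars$mpg)
beta_z      <- coef(fit_z)[-1]   # drop the intercept

# Both routes should agree up to floating-point error
all.equal(unname(beta_z), unname(beta_manual))

# With every variable centered, the intercept is numerically zero
coef(fit_z)[1]
```

Standardized coefficients are expressed in standard-deviation units, which makes predictors measured on different scales easier to compare.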

#### 3.1.3 - All variables are predictors

`myFit=lm(outcome~.,DataFrame)`

The dot (`.`) is a shortcut that selects every other variable in the data frame as a predictor.
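A small sketch of the dot shortcut, again assuming the built-in `mtcars` data (11 columns, so 10 candidate predictors for `mpg`). It also shows that a term can be excluded with a minus sign.

```r
# All remaining columns of mtcars are used as predictors of mpg
fit_all <- lm(mpg ~ ., data = mtcars)

# Same shortcut, but excluding one column (qsec, picked only for illustration)
fit_drop <- lm(mpg ~ . - qsec, data = mtcars)

# Intercept plus one coefficient per numeric predictor
length(coef(fit_all))
length(coef(fit_drop))
```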

#### 3.1.4 - Updating a model

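`update` refits an existing model with a modified formula, for example to remove the non-significant predictors. In the formula, `.` stands for "what was already there" and `-` drops a term.

`myModel1=update(myModel1,~.-NoSignificantPredictor1-NoSignificantPredictor2)`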
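The following sketch shows the same idea end to end on the built-in `mtcars` data. The choice of `qsec` as the term to drop is an assumption made only for illustration, not a claim about its significance.

```r
# Start from a model with three predictors
fit_full <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Refit without qsec; ". ~ ." keeps the response and the remaining terms
fit_reduced <- update(fit_full, . ~ . - qsec)
formula(fit_reduced)

# An F test comparing the nested models indicates whether the dropped term mattered
anova(fit_reduced, fit_full)
```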

### 3.2 - Model Attributes

`attributes(myFit)`

See R - Names

```
$names
 [1] "coefficients"  "residuals"     "effects"       "rank"
 [5] "fitted.values" "assign"        "qr"            "df.residual"
 [9] "xlevels"       "call"          "terms"         "model"

$class
[1] "lm"
```

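Each name is a component of the fitted object. A component can be read with `$` (for example `myFit$coefficients`) or with the matching extractor function (for example `coef(myFit)`).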
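As a sketch of how the components listed above can be read, assuming a model fitted on the built-in `mtcars` data (illustrative variable names):

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Component names stored in the lm object
names(fit)

# Direct access with $ and the equivalent extractor function
fit$coefficients
coef(fit)

# Residuals and fitted values, one per observation
head(residuals(fit))
head(fitted(fit))

# Residual degrees of freedom: observations minus estimated coefficients
fit$df.residual
```

Extractor functions such as `coef`, `residuals` and `fitted` are generally preferable to `$` because they also work on other model classes.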

### 3.3 - Summary

`summary(myFit)`
```
Call:
lm(formula = response ~ predictor1 + predictor2, data = data.frame)

Residuals:
    Min      1Q  Median      3Q     Max
-15.981  -3.978  -1.283   1.968  23.158

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 33.22276    0.73085  45.458  < 2e-16 ***
predictor1  -1.03207    0.04819 -21.416  < 2e-16 ***
predictor2   0.03454    0.01223   2.826  0.00491 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6.173 on 503 degrees of freedom
Multiple R-squared:  0.5513,	Adjusted R-squared:  0.5495
F-statistic:   309 on 2 and 503 DF,  p-value: < 2.2e-16
```

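where:

• Residuals: minimum, quartiles and maximum of the residuals.
• Coefficients: for each term, the estimate, its standard error, the t value (estimate divided by standard error) and the two-sided p-value.
• Residual standard error: the estimated standard deviation of the errors, with the residual degrees of freedom (observations minus estimated coefficients).
• Multiple R-squared: the proportion of the variance of the response explained by the model. The adjusted version penalizes the number of predictors.
• F-statistic: a test of the null hypothesis that all slope coefficients are zero.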
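The quantities printed by `summary` are also stored in the returned object, so they can be used programmatically. A minimal sketch, assuming a model fitted on the built-in `mtcars` data:

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
s   <- summary(fit)

# Coefficient table as a numeric matrix:
# columns Estimate, Std. Error, t value, Pr(>|t|)
s$coefficients

# Goodness-of-fit measures
s$r.squared
s$adj.r.squared

# Residual standard error and F statistic (value, numerator df, denominator df)
s$sigma
s$fstatistic

# The residual standard error is sqrt(RSS / residual degrees of freedom)
sqrt(sum(residuals(fit)^2) / df.residual(fit))
```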

### 3.4 - Confidence Interval

`confint(myFit)`
```
                  2.5 %      97.5 %
(Intercept) 31.78687150 34.65864956
predictor1  -1.12674848 -0.93738865
predictor2   0.01052507  0.05856361
```
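By default `confint` returns 95% intervals; the `level` argument changes the coverage. The sketch below, assuming the built-in `mtcars` data, also rebuilds the 95% interval by hand as estimate ± t quantile × standard error, to show where the numbers come from.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

ci95 <- confint(fit)                # default level = 0.95
ci99 <- confint(fit, level = 0.99)  # wider intervals

# Manual reconstruction of the 95% interval
est   <- coef(fit)
se    <- sqrt(diag(vcov(fit)))               # standard errors
tcrit <- qt(0.975, df = df.residual(fit))    # t quantile for a two-sided 95% interval
manual <- cbind(est - tcrit * se, est + tcrit * se)

all.equal(unname(ci95), unname(manual))
```

An interval that excludes zero corresponds to a coefficient that is significant at the matching level (5% for a 95% interval).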

### 3.5 - Prediction

`predict(myFit, data.frame(predictor1=c(5,10,15), predictor2=c(20,30,40)), interval="confidence")`
```
       fit      lwr      upr
1 28.75330 27.67694 29.82967
2 23.93841 22.97305 24.90376
3 19.12351 18.12607 20.12094
```
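`predict` supports two kinds of interval: `"confidence"` for the mean response at the given predictor values, and `"prediction"` for a single new observation. The sketch below contrasts them on the built-in `mtcars` data; the new predictor values are made up for illustration.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# New data must use the same column names as the predictors in the formula
new_cars <- data.frame(wt = c(2.5, 3.0, 3.5), hp = c(100, 150, 200))

# Interval for the mean response
conf_int <- predict(fit, newdata = new_cars, interval = "confidence")

# Interval for an individual new observation (adds the residual variance)
pred_int <- predict(fit, newdata = new_cars, interval = "prediction")

conf_int
pred_int
```

The point predictions (`fit` column) are identical in both cases; only the width of the interval differs, the prediction interval always being the wider one.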

### 3.6 - Plot

```r
# A 2 x 2 grid of panels to hold the four diagnostic plots
par(mfrow=c(2,2))
# Plot the fitted model
plot(myFit)
```

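`plot` applied to a linear model gives four diagnostic views:

• Residuals vs Fitted: the residuals against the fitted values.

The goal is to detect non-linearity. A curved pattern in the residuals means that the model is missing some non-linear effect.

• Normal Q-Q: the standardized residuals against normal quantiles, to check whether the residuals are approximately normally distributed.
• Scale-Location: the square root of the absolute standardized residuals against the fitted values, to check whether the residual variance is constant.
• Residuals vs Leverage: the standardized residuals against leverage, to spot influential observations.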
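The numbers behind these four views can also be inspected directly. The following sketch, assuming a model fitted on the built-in `mtcars` data, draws the plots and then collects the underlying diagnostics in a data frame (the name `diagnostics` is purely illustrative).

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Draw the four diagnostic plots in a 2 x 2 grid, then restore the settings
old_par <- par(mfrow = c(2, 2))
plot(fit)
par(old_par)

# Quantities used by the plots, one row per observation
diagnostics <- data.frame(
  fitted    = fitted(fit),          # x axis of Residuals vs Fitted
  resid     = residuals(fit),       # y axis of Residuals vs Fitted
  std_resid = rstandard(fit),       # used by Normal Q-Q and Scale-Location
  leverage  = hatvalues(fit),       # x axis of Residuals vs Leverage
  cooks     = cooks.distance(fit)   # influence measure shown as contours
)

# Observations with the largest influence on the fit
head(diagnostics[order(-diagnostics$cooks), ], 3)
```

A single view can be requested with the `which` argument, for example `plot(fit, which = 1)` for the residuals against the fitted values.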