About
Steps
Prerequisites
library(MASS) # lda() comes from the MASS package; library() stops with an error if it is missing
Model
- Fit the model
ldaModel=lda(Target~Variable1+Variable2,data=dataframe, subset=VariableN<10)
- Print it by typing its name
ldaModel
Call:
lda(Target~ Variable1+ Variable2, data = dataframe, subset=VariableN<10)
Prior probabilities of groups:
False True
0.491984 0.508016
Group means:
Variable1 Variable2
False 0.04279022 0.03389409
True -0.03954635 -0.03132544
Coefficients of linear discriminants:
LD1
Variable1 -0.6420190
Variable2 -0.5135293
where:
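- the prior probabilities are the proportions of False and True in the data used to fit the model. Here they are close to 50/50, much like a random walk: about half the time it goes up, half the time it goes down.
- the group means are the averages of each predictor within each group.
- the coefficients of linear discriminants are the weights of the linear function (LD1) that lda fits to separate the two groups. There are two predictors, so there are two coefficients.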
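The first two points can be checked directly on the fitted object. A minimal sketch, assuming simulated data (the data frame below is made up for illustration and only reuses the names of this page; it is not the data behind the output above):

```r
library(MASS)

# Simulated stand-in data (assumption, not the original data set)
set.seed(1)
n <- 200
dataframe <- data.frame(Variable1 = rnorm(n), Variable2 = rnorm(n))
signal <- dataframe$Variable1 + dataframe$Variable2 + rnorm(n)
dataframe$Target <- factor(ifelse(signal > 0, "True", "False"))

ldaModel <- lda(Target ~ Variable1 + Variable2, data = dataframe)

# Priors: by default, the class proportions in the data used for fitting
classProportions <- prop.table(table(dataframe$Target))
print(ldaModel$prior)
print(classProportions)

# Group means: the per-class average of each predictor
groupMeans <- aggregate(cbind(Variable1, Variable2) ~ Target,
                        data = dataframe, FUN = mean)
print(ldaModel$means)
print(groupMeans)
```

If a subset is passed to lda, the proportions and means are those of the subset, not of the whole data frame.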
Plot
plot(ldaModel)
It plots the values of the linear discriminant function separately for each of the two groups.
There's really not much difference between the two groups.
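The same comparison can be made numerically by summarising the LD1 values per group. A minimal sketch, assuming simulated data (made up for illustration, so the groups separate more clearly here than in the output above):

```r
library(MASS)

# Simulated stand-in data (assumption, not the original data set)
set.seed(1)
n <- 200
dataframe <- data.frame(Variable1 = rnorm(n), Variable2 = rnorm(n))
signal <- dataframe$Variable1 + dataframe$Variable2 + rnorm(n)
dataframe$Target <- factor(ifelse(signal > 0, "True", "False"))

ldaModel <- lda(Target ~ Variable1 + Variable2, data = dataframe)

# LD1 value of every observation
ld1 <- predict(ldaModel, dataframe)$x[, "LD1"]

# Distribution of LD1 within each group
print(tapply(ld1, dataframe$Target, summary))

# Mean LD1 per group: the further apart, the better the separation
groupLd1Means <- tapply(ld1, dataframe$Target, mean)
print(groupLd1Means)
```

Group means that sit close together, relative to the spread within each group, correspond to plots that look almost identical.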
Predictions and classification
predictions=predict(ldaModel,dataframe)
# predict() returns a list, as class() shows
class(predictions)
# When the elements of a list all have the same number of observations,
# a convenient way of looking at the list is to convert it to a data frame.
# Show the first 5 rows
data.frame(predictions)[1:5,]
class posterior.False posterior.True LD1
999 True 0.4901792 0.5098208 0.08293096
1000 True 0.4792185 0.5207815 0.59114102
1001 True 0.4668185 0.5331815 1.16723063
1002 True 0.4740011 0.5259989 0.83335022
1003 True 0.4927877 0.5072123 -0.03792892
where:
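- the first column is the row name
- the class column is the predicted class
- the posterior columns are the posterior probabilities of each class
- the LD1 column is the value of the linear discriminant for the observation (its score, not a coefficient)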
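How the three parts of the prediction relate can be checked by hand. A minimal sketch, assuming simulated data (made up for illustration) and the default priors: the class is the label with the highest posterior, and LD1 is the predictors, centred on the prior-weighted group means, multiplied by the coefficients.

```r
library(MASS)

# Simulated stand-in data (assumption, not the original data set)
set.seed(1)
n <- 200
dataframe <- data.frame(Variable1 = rnorm(n), Variable2 = rnorm(n))
signal <- dataframe$Variable1 + dataframe$Variable2 + rnorm(n)
dataframe$Target <- factor(ifelse(signal > 0, "True", "False"))

ldaModel <- lda(Target ~ Variable1 + Variable2, data = dataframe)
predictions <- predict(ldaModel, dataframe)

# class: the label whose posterior probability is the largest
bestClass <- colnames(predictions$posterior)[apply(predictions$posterior, 1, which.max)]
print(all(bestClass == as.character(predictions$class)))

# posterior: each row sums to 1
print(range(rowSums(predictions$posterior)))

# LD1: centre the predictors, then apply the coefficients (ldaModel$scaling)
X <- as.matrix(dataframe[, c("Variable1", "Variable2")])
center <- colSums(ldaModel$prior * ldaModel$means)
manualLd1 <- scale(X, center = center, scale = FALSE) %*% ldaModel$scaling
print(all.equal(as.vector(manualLd1), as.vector(predictions$x)))
```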
Accuracy
- Confusion Matrix
# Rows: predicted class, columns: actual class
table(predictions$class,dataframe$Target)
Down Up
Down 35 35
Up 76 106
- Correct classification rate
mean(predictions$class==dataframe$Target)
[1] 0.5595238
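The rate can also be read off the confusion matrix: the correct predictions are on the diagonal. A small worked sketch that re-enters the counts printed above by hand (the names cm, accuracy and perClass are only illustrative):

```r
# Confusion matrix from above: rows = predicted, columns = actual
cm <- matrix(c(35, 76, 35, 106), nrow = 2,
             dimnames = list(predicted = c("Down", "Up"),
                             actual = c("Down", "Up")))

# Correct classification rate = diagonal / total = (35 + 106) / 252
accuracy <- sum(diag(cm)) / sum(cm)
print(round(accuracy, 7)) # 0.5595238

# Share of each actual class that was predicted correctly
perClass <- diag(cm) / colSums(cm)
print(perClass)
```

The per-class rates show where the errors sit: 182 of the 252 observations are predicted Up, so most of the actual Down cases are misclassified.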