# R - K-Nearest Neighbors (KNN) Analysis

## 2 - Steps

### 2.1 - Prerequisites

`library(class)`

### 2.2 - Syntax

```
?knn
knn(train, test, cl, k = 1, l = 0, prob = FALSE, use.all = TRUE)
```

where:

• train is the training set (a matrix or data frame)
• test is the test set (a matrix or data frame)
• cl is the factor holding the true target class of each training observation
• k is the number of neighbours to be considered
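As a minimal sketch of how these arguments fit together, the call below classifies two new points against six training points. The toy data and class labels are hypothetical, chosen only for illustration.

```r
library(class)

# Hypothetical toy data: six training points with two numeric features
train <- matrix(c(1, 1,
                  1, 2,
                  2, 1,
                  8, 8,
                  8, 9,
                  9, 8), ncol = 2, byrow = TRUE)

# True class of each training point (one entry per row of train)
cl <- factor(c("a", "a", "a", "b", "b", "b"))

# Two new points to classify
test <- matrix(c(1.5, 1.5,
                 8.5, 8.5), ncol = 2, byrow = TRUE)

# Each test point gets the majority class of its 3 nearest training points
pred <- knn(train, test, cl, k = 3)
print(pred)
```

The result is a factor with one predicted class per row of `test`.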

### 2.3 - Training and Test Data set

• The knn function expects two matrices: a training set and a test set.
```
# To be able to call all data frame variables by name
attach(myDataFrame)

# Make a matrix of the chosen variables variable1 and variable2
variables=cbind(variable1,variable2)

# Make an indicator (a vector of TRUE or FALSE)
indicator=variableName<10

# The training set is then
variables[indicator,]
# And the test set is
variables[!indicator,]
```
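The split above can be sketched end to end without `attach`. The data frame below is a hypothetical stand-in for `myDataFrame`, with made-up values, and `trainSet` and `testSet` are names introduced here for illustration.

```r
# Hypothetical data frame standing in for myDataFrame
set.seed(1)
myDataFrame <- data.frame(
  variable1    = rnorm(20),
  variable2    = rnorm(20),
  variableName = 1:20
)

# Matrix of the chosen predictor columns
variables <- as.matrix(myDataFrame[, c("variable1", "variable2")])

# Logical indicator: TRUE marks the rows used for training
indicator <- myDataFrame$variableName < 10

# drop = FALSE keeps the result a matrix even if only one row is selected
trainSet <- variables[indicator, , drop = FALSE]
testSet  <- variables[!indicator, , drop = FALSE]

dim(trainSet)
dim(testSet)
```

Referring to columns with `$` avoids the name clashes that `attach` can cause when a variable of the same name already exists in the workspace.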

### 2.4 - Model

Call the knn function to fit the classifier. It returns the predicted classes for the test set:

`knnModel=knn(variables[indicator,],variables[!indicator,],target[indicator],k=1)`

To classify a new observation with k = 1, knn looks in the feature space (the x space) for the training observation closest to the test point in Euclidean distance, and assigns the test point to that observation's class.
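The nearest-neighbour rule described here can be written by hand for a single test point. This is only an illustrative sketch, not how the `class` package implements it: `nearestNeighbourClass` is a hypothetical helper and the data is made up.

```r
# Hypothetical helper: classify one point by its single nearest neighbour
nearestNeighbourClass <- function(train, cl, point) {
  # Subtract the point from every row, square, sum per row, take the root
  distances <- sqrt(rowSums(sweep(train, 2, point)^2))
  # Return the class of the closest training observation
  cl[which.min(distances)]
}

# Made-up training set with two features and two classes
train <- matrix(c(0, 0,
                  1, 0,
                  5, 5), ncol = 2, byrow = TRUE)
cl <- factor(c("False", "False", "True"))

# The point (4, 4) is closest to (5, 5), so it gets that point's class
nearestNeighbourClass(train, cl, c(4, 4))
```

Note that `which.min` picks the first observation when distances are tied, whereas `knn` breaks ties at random.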

### 2.5 - Accuracy

#### 2.5.1 - Confusion Matrix

`table(knnModel,target[!indicator])`
```
knnModel False True
   False    43   58
   True     68   83
```

#### 2.5.2 - Mean

`mean(knnModel==target[!indicator])`
`[1] 0.5`

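With k = 1 the classifier was useless: one nearest neighbour did no better than flipping a coin, since only (43 + 83) / 252 = 0.5 of the test observations were classified correctly.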
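The accuracy can also be read directly off the confusion matrix, as the share of counts on its diagonal. The sketch below rebuilds the matrix from the counts reported above; the object names `confusion` and `accuracy` are introduced here for illustration.

```r
# Counts copied from the confusion matrix above
# (rows: predicted class, columns: true class; filled column by column)
confusion <- matrix(c(43, 68, 58, 83), nrow = 2,
                    dimnames = list(predicted = c("False", "True"),
                                    actual    = c("False", "True")))

# Correct predictions sit on the diagonal
accuracy <- sum(diag(confusion)) / sum(confusion)
accuracy
```

This gives the same value as the `mean` call, because both count the share of test observations whose predicted class equals the true class.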