R - K-Nearest Neighbors (KNN) Analysis


1 - About

2 - Steps

2.1 - Prerequisites

2.2 - Syntax

knn(train, test, cl, k = 1, l = 0, prob = FALSE, use.all = TRUE)


  • train is the training set
  • test is the test set
  • cl is the factor of true classes for the training set
  • k is the number of neighbours considered
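As a rough illustration of the arguments above, here is a minimal sketch on made-up toy data. It assumes knn comes from the class package; the six training points, their labels "a" and "b", and the choice of k = 3 with prob = TRUE are all hypothetical and only there to show the shape of a call.

```r
library(class)  # assumption: knn() is the one provided by the 'class' package

# Hypothetical toy data: two features, two well-separated classes
train <- matrix(c(0, 0,
                  0, 1,
                  1, 0,
                  5, 5,
                  5, 6,
                  6, 5), ncol = 2, byrow = TRUE)
cl <- factor(c("a", "a", "a", "b", "b", "b"))  # true class of each training row

# Two new points to classify, one near each group
test <- matrix(c(0.2, 0.5,
                 5.1, 5.4), ncol = 2, byrow = TRUE)

# k = 3 neighbours; prob = TRUE also returns the share of votes for the winner
pred <- knn(train, test, cl, k = 3, prob = TRUE)
print(pred)
print(attr(pred, "prob"))
```

The result is a factor with one predicted class per row of test; the vote shares are attached as the "prob" attribute.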

2.3 - Training and Test Data set

  • The knn function expects two matrices (a training set and a test set)
# Attach the data frame so that its variables can be called by name

# Make a matrix of the chosen variables, variable1 and variable2

# Make an indicator (a logical vector of TRUE or FALSE)

# The training set is then the rows selected by the indicator
# And the test set is the remaining rows
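The steps in the comments above could be sketched as follows. The data frame dat, its target and period columns, and the 80/20 split are hypothetical stand-ins for the real data. Where the comments attach the data frame, this sketch uses with() instead, which gives the same access by name without changing the search path.

```r
# Hypothetical data frame standing in for the real data
set.seed(1)
dat <- data.frame(
  variable1 = rnorm(100),
  variable2 = rnorm(100),
  target    = factor(sample(c("False", "True"), 100, replace = TRUE)),
  period    = rep(c(1, 2), times = c(80, 20))  # hypothetical column used for the split
)

# Matrix of the chosen variables (with() instead of attach())
X <- with(dat, cbind(variable1, variable2))

# Indicator: a logical vector, TRUE for rows that belong to the training set
train <- dat$period == 1

# Training set and test set as matrices
X_train <- X[train, ]
X_test  <- X[!train, ]

dim(X_train)
dim(X_test)
```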

2.4 - Model

Call the knn function to build the model.
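A minimal sketch of such a call, assuming the class package and simulated data in place of the real variables. The object name knnModel is taken from the confusion matrix further down; X, target, and train are hypothetical names.

```r
library(class)  # assumption: knn() from the 'class' package

set.seed(1)
# Hypothetical predictors and target; the real data would go here
X <- cbind(variable1 = rnorm(100), variable2 = rnorm(100))
target <- factor(sample(c("False", "True"), 100, replace = TRUE))
train <- rep(c(TRUE, FALSE), times = c(80, 20))  # logical indicator

# Training matrix, test matrix, training classes, one neighbour
knnModel <- knn(X[train, ], X[!train, ], target[train], k = 1)

head(knnModel)
```

knn does not return a fitted model object: the result is directly the vector of predicted classes for the test rows.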


To classify a new observation, knn searches the training set in the feature space (the x space) for the training observation closest to the test point in Euclidean distance, and assigns the test point to that observation's class. With k > 1, it takes a majority vote among the k closest training observations.
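The nearest-neighbour rule described above can be reproduced by hand for k = 1. This is only an illustration of the idea, not how the knn function is implemented; the three training points and the new point are made up.

```r
# Hypothetical training points (rows) and their classes
train <- matrix(c(1, 1,
                  2, 2,
                  6, 5), ncol = 2, byrow = TRUE)
cl <- factor(c("False", "False", "True"))

new_point <- c(5, 5)  # hypothetical observation to classify

# Euclidean distance from new_point to every training row:
# subtract new_point from each row, square, sum per row, take the root
d <- sqrt(rowSums(sweep(train, 2, new_point)^2))

nearest   <- which.min(d)  # index of the closest training observation
predicted <- cl[nearest]   # its class becomes the prediction

print(d)
print(predicted)
```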

2.5 - Accuracy

2.5.1 - Confusion Matrix

knnModel  False True
    False    43   58
    True     68   83
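A table like this one is typically produced by cross-tabulating predictions against the true classes with table(). The sketch below uses six made-up labels rather than the real test set; predicted and actual are hypothetical names.

```r
# Hypothetical predicted and true classes for six test observations
lev <- c("False", "True")
predicted <- factor(c("False", "True", "True", "False", "True", "False"), levels = lev)
actual    <- factor(c("False", "True", "False", "False", "False", "True"), levels = lev)

# Rows are predictions, columns are true classes
cm <- table(predicted, actual)
print(cm)
```

Correct predictions sit on the diagonal of the table; the off-diagonal cells are the two kinds of error.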

2.5.2 - Mean

[1] 0.5

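With k = 1 the model was useless: one nearest neighbor did no better than flipping a coin. The accuracy is the share of correct predictions, the diagonal of the confusion matrix divided by the total: (43 + 83) / 252 = 0.5.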
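The 0.5 can be checked directly from the counts in the confusion matrix. The sketch below re-enters those four counts by hand; the column dimension name actual is an assumption, since the table above does not label it.

```r
# Counts copied from the confusion matrix above (rows: knnModel predictions)
cm <- matrix(c(43, 58,
               68, 83), nrow = 2, byrow = TRUE,
             dimnames = list(knnModel = c("False", "True"),
                             actual   = c("False", "True")))  # 'actual' is an assumed label

# Accuracy: correct predictions (diagonal) over all predictions
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)
```

When the prediction and true-class vectors are both available, taking the mean of their element-wise equality gives the same number.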


2.6 - Next

We could proceed further and try nearest neighbors with multiple values of k.
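One way to do that is a loop over candidate values of k, recording the test accuracy for each. The sketch below assumes the class package and uses simulated data, so the accuracies it prints say nothing about the real data set; the names X, target, train, and ks are hypothetical.

```r
library(class)  # assumption: knn() from the 'class' package

set.seed(1)
# Hypothetical data in which the target depends loosely on the two variables
X <- cbind(variable1 = rnorm(200), variable2 = rnorm(200))
target <- factor(ifelse(X[, 1] + X[, 2] + rnorm(200) > 0, "True", "False"))
train <- rep(c(TRUE, FALSE), times = c(150, 50))

ks <- c(1, 3, 5, 7)  # candidate numbers of neighbours

# Test accuracy for each k
accuracy <- sapply(ks, function(k) {
  pred <- knn(X[train, ], X[!train, ], target[train], k = k)
  mean(pred == target[!train])
})
names(accuracy) <- paste0("k=", ks)
print(accuracy)
```

Odd values of k avoid tied votes in a two-class problem. Picking k on the test set makes the reported accuracy optimistic, so a separate validation split or cross-validation is the safer choice.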