# R - K-means clustering

This article shows how to run K-means clustering in R with the `kmeans` function and how to plot the result.

## 3 - Steps

### 3.1 - Generate Data

K-means works in any number of dimensions, but with two-dimensional data we can plot the points.
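The code that generated the data is not shown here, so the following is only a minimal sketch of one way to build a comparable data set. The names `xclustered` and `which` are the ones used in the rest of this article. The seed, the group means and their spread are assumptions, so the numbers will not reproduce the output shown further down.

```r
set.seed(101)                                  # assumed seed, only for reproducibility
n <- 100                                       # number of points (the output below has 100)
k <- 3                                         # number of generated groups

x <- matrix(rnorm(n * 2), n, 2)                # standard normal noise in two dimensions
xmean <- matrix(rnorm(k * 2, sd = 4), k, 2)    # one random centre per group (assumed spread)
which <- sample(1:k, n, replace = TRUE)        # group membership of each point
xclustered <- x + xmean[which, ]               # shift each point by the centre of its group

# Plot the generated data, coloured by the true group
plot(xclustered, col = which, pch = 19)
```

Shifting standard normal noise by a per-group centre gives compact, roughly circular groups, which is the situation K-means handles best.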

### 3.2 - KMeans

The `kmeans` function is in the stats package, which is loaded by default.

```r
km.out=kmeans(xclustered,3,nstart=15)
km.out
```

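where:

• `3` is the number of clusters to search for
• `nstart=15` runs the algorithm from 15 random starting configurations and keeps the best result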

Output:

```
K-means clustering with 3 clusters of sizes 33, 28, 39

Cluster means:
       [,1]       [,2]
1 -1.107234 -6.7087012
2  1.803277 -0.0341333
3 -3.900942 -2.1654215

Clustering vector:
  [1] 1 2 3 3 2 1 3 1 2 2 2 3 3 1 3 3 1 1 2 1 1 2 3 3 3 3 3 3 3 3 1 2 3 3 1 2 1 1 3 2 3 1 3 1 3 3 2 2 2 1 1 3 1 1
 [55] 1 1 3 1 1 2 3 1 3 1 2 1 3 3 2 3 1 3 2 2 1 2 3 3 1 3 1 3 2 2 3 2 2 2 2 2 2 3 3 3 1 3 1 2 1 1

Within cluster sum of squares by cluster:
[1]  63.30585  54.25311 114.05169
 (between_SS / total_SS =  84.5 %)

Available components:

[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss" "betweenss"    "size"
[8] "iter"         "ifault"
```
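Each entry under "Available components" can be read from the result with `$`. The sketch below shows a few of them. It rebuilds a data set first so that it runs on its own; the seed and the way the data is generated are assumptions, so the values will differ from the output above.

```r
# Assumed data set: 100 points in 3 groups (not the article's original data)
set.seed(101)
which <- sample(1:3, 100, replace = TRUE)
xmean <- matrix(rnorm(3 * 2, sd = 4), 3, 2)
xclustered <- matrix(rnorm(100 * 2), 100, 2) + xmean[which, ]

km.out <- kmeans(xclustered, 3, nstart = 15)

km.out$centers        # one row per cluster: the coordinates of its mean
km.out$size           # number of points in each cluster
km.out$withinss       # within-cluster sum of squares, one value per cluster
km.out$tot.withinss   # sum of the values in withinss

# The percentage printed as between_SS / total_SS
ratio <- km.out$betweenss / km.out$totss
round(100 * ratio, 1)
```

The ratio is the share of the total sum of squares that is explained by the clustering, so a value close to 100 % means compact, well-separated clusters.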

### 3.3 - Plot

Plot the data:

• with the colours of the kmeans cluster output (`km.out$cluster`)
• as empty circles (`pch=1`)
• magnified two times (`cex=2`), so that the points of the original generated grouping can be drawn inside them

```r
plot(xclustered,col=km.out$cluster,pch=1,cex=2)
```

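Add the points coloured by their original grouping:

```r
points(xclustered,col=which,pch=19)
```

The cluster numbers assigned by kmeans are arbitrary, so the colours of the original groups may not match those of the clusters. Reordering the colours fixes the mismatch:

```r
points(xclustered,col=c(1,3,2)[which],pch=19)
```

where `col=c(1,3,2)[which]` maps the colour of each original group to the colour of its matching cluster (group 1 to colour 1, group 2 to colour 3, group 3 to colour 2).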
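A mapping such as `c(1,3,2)` is found by eye and changes from run to run. As a possible alternative, it can be derived from a cross-tabulation of the original groups against the clusters. This is a sketch under assumptions: the data generation and the name `group_to_cluster` are not part of the original article, and the approach only works well when each group falls mostly into a single cluster.

```r
# Assumed data set: 100 points in 3 groups (not the article's original data)
set.seed(101)
which <- sample(1:3, 100, replace = TRUE)
xmean <- matrix(rnorm(3 * 2, sd = 4), 3, 2)
xclustered <- matrix(rnorm(100 * 2), 100, 2) + xmean[which, ]

km.out <- kmeans(xclustered, 3, nstart = 15)

# Rows: original groups, columns: kmeans clusters
tab <- table(group = which, cluster = km.out$cluster)
print(tab)

# For each original group, the cluster that holds most of its points
# (group_to_cluster is a hypothetical helper name)
group_to_cluster <- apply(tab, 1, which.max)

# Empty circles: kmeans clusters; filled points: original groups, recoloured
plot(xclustered, col = km.out$cluster, pch = 1, cex = 2)
points(xclustered, col = group_to_cluster[which], pch = 19)
```

A filled point whose colour differs from its surrounding circle is a point that kmeans assigned to a different cluster than the rest of its original group.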