# Data Mining - Cosine Similarity (Measure of Angle)

Cosine similarity measures the angle between two vectors, independent of their magnitudes. It is the dot product of the two vectors divided by the product of their magnitudes.

## 3 - Formula

Equating the algebraic and geometric definitions of the dot product, $$a \cdot b = \sum a_i b_i = \|a\| \|b\| \cos \theta$$ and solving for the cosine gives the cosine similarity, which is a normalized dot product of the two vectors: $$\text{similarity} = \cos \theta = \frac{a \cdot b}{\|a\| \|b\|} = \frac{ \sum a_i b_i }{ \sqrt{\sum a_i^2} \sqrt{\sum b_i^2} }$$
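
As a small worked example of the formula, take two illustrative vectors (chosen here only for easy arithmetic, not taken from any dataset), $$a = (1, 2, 2), \quad b = (2, 1, 2)$$ The numerator and the two magnitudes are $$a \cdot b = 1 \cdot 2 + 2 \cdot 1 + 2 \cdot 2 = 8, \quad \|a\| = \sqrt{1 + 4 + 4} = 3, \quad \|b\| = \sqrt{4 + 1 + 4} = 3$$ so that $$\cos \theta = \frac{8}{3 \cdot 3} = \frac{8}{9} \approx 0.889$$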

• If the angle is small (the two documents share many tokens), the cosine is large, close to 1.
• If the angle is large (the two documents share few tokens), the cosine is small, close to 0 for token-count vectors.
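
The formula and the token interpretation above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `cosine_similarity` and `token_cosine` are made up for this example, and the tokenization (lowercasing and splitting on whitespace) is an assumption, since the text does not say how tokens are produced.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Numerator: the dot product, sum of a_i * b_i.
    dot = sum(x * y for x, y in zip(a, b))
    # Denominator: the product of the two Euclidean magnitudes.
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def token_cosine(text_a, text_b):
    """Compare two texts by their token counts (assumed tokenization:
    lowercase, split on whitespace)."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Build both count vectors over the same shared vocabulary.
    vocab = sorted(set(counts_a) | set(counts_b))
    vec_a = [counts_a[token] for token in vocab]
    vec_b = [counts_b[token] for token in vocab]
    return cosine_similarity(vec_a, vec_b)


if __name__ == "__main__":
    print(token_cosine("the cat sat", "the cat ran"))  # two of three tokens shared
    print(token_cosine("the cat sat", "a dog barked"))  # no tokens shared
```

With this sketch, two texts that share two of their three tokens score 2/3, while two texts with no tokens in common score 0, matching the two cases in the list above. A zero vector (for example, an empty text) has no defined angle, so the sketch raises an error instead of returning a value.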