Data Mining - High Dimension (Curse of Dimensionality)


About

High dimension

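In high dimensions, it is very difficult to stay local: a neighborhood that captures even a small fraction of the data must span most of the range of each variable.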

In high dimensions, all cases are edge cases

Sam Ross
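
To make the locality claim concrete, here is a minimal sketch. It assumes the data are spread uniformly over a unit hypercube, which is an illustrative assumption and not something stated above. The helper name `edge_length` is hypothetical. Under that assumption, a sub-cube holding a fraction r of the data in p dimensions needs an edge of length r^(1/p).

```python
def edge_length(fraction: float, dims: int) -> float:
    """Edge of a sub-cube that holds `fraction` of the volume of a unit hypercube.

    Assumes points are uniformly distributed, so "fraction of the data"
    equals "fraction of the volume".
    """
    return fraction ** (1.0 / dims)


# How far along each axis must we reach to capture 10% of the data?
for p in (1, 2, 10, 100):
    print(f"p={p:>3}: edge length = {edge_length(0.10, p):.3f}")
```

With p = 10, the "local" neighborhood already covers roughly 80% of the range of every variable, so it is no longer local in any useful sense.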

Example

See the interactive R Shiny app on the Curse of Dimensionality wiki page.

Circle example: a circle of radius 1 inscribed in a square of side 2 fills most of the square's area: exactly <math>\pi</math> out of 4, or about 78.5%. In three dimensions, the unit sphere inscribed in a cube of side 2 takes up a smaller share: <math>\frac{4\pi}{3}</math> out of a total volume of 8, which is just over 52%. The ratio keeps shrinking as the dimension grows, so more and more of the cube's volume lies in its corners.
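
The same ratio can be computed for any dimension. The sketch below uses the standard formula for the volume of a d-dimensional unit ball, π^(d/2) / Γ(d/2 + 1), and divides it by the volume 2^d of the enclosing cube of side 2. The function name `ball_to_cube_ratio` is a hypothetical name chosen for illustration.

```python
import math


def ball_to_cube_ratio(d: int) -> float:
    """Volume of the unit d-ball divided by the volume of the cube [-1, 1]^d."""
    ball = math.pi ** (d / 2) / math.gamma(d / 2 + 1)  # unit-ball volume
    cube = 2.0 ** d                                    # cube of side 2
    return ball / cube


# d=2 and d=3 reproduce the circle and sphere figures from the text.
for d in (2, 3, 5, 10, 20):
    print(f"d={d:>2}: ratio = {ball_to_cube_ratio(d):.6f}")
```

For d = 2 and d = 3 this gives π/4 and π/6, matching the percentages above. By d = 10 the ball occupies well under 1% of the cube.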







