About

This section talks about the term Distribution, also known as Probability distribution, where you get:

  • on the y axis, the probability
  • on the x axis, the event

The events can be seen as the possible outcomes of a single experiment.

The term "Probability" asserts that each value in the set of possible values can have a different probability of being seen when reading a random variable.

A probability distribution is a mathematical function that provides the probabilities of occurrence of different possible outcomes in an experiment.
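As a minimal sketch of this definition, the R snippet below builds the distribution of the number of heads in 10 tosses of a fair coin. The experiment (10 tosses, probability 0.5) and the variable names are assumptions chosen for illustration; `dbinom` is the binomial probability mass function from base R.

```r
# Possible outcomes (x axis): number of heads in 10 tosses of a fair coin
outcomes <- 0:10

# Probability of each outcome (y axis), from the binomial pmf
probabilities <- dbinom(outcomes, size = 10, prob = 0.5)

# The distribution as a table: one probability per possible outcome
heads_distribution <- data.frame(outcome = outcomes, probability = probabilities)
print(heads_distribution)

# The probabilities of all possible outcomes add up to 1
total_probability <- sum(probabilities)
```

Calling `barplot(probabilities, names.arg = outcomes)` on this table would draw the events on the x axis and their probabilities on the y axis.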

In more technical terms, a probability distribution is a mathematical description of a random phenomenon (a random variable) in terms of the probabilities of events.

Many distributions are approximately normal, but not all of them. A histogram can help to identify the type of distribution.

A box plot is a good summary of a distribution.
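The two plots mentioned above can be sketched in base R as follows. The sample is hypothetical: 1000 draws from an exponential distribution, chosen only because its skewed shape is easy to spot; any numeric vector could be used instead.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Hypothetical sample: 1000 draws from a right-skewed (exponential) distribution
sample_values <- rexp(1000, rate = 1)

# Histogram: the shape of the bars hints at the type of distribution
hist(sample_values, breaks = 30, main = "Histogram of the sample")

# Box plot: summarises the distribution with its median, quartiles and outliers
boxplot(sample_values, horizontal = TRUE, main = "Box plot of the sample")

# Tukey's five-number summary: minimum, lower hinge, median, upper hinge, maximum
five_numbers <- fivenum(sample_values)
```

For a right-skewed sample like this one, the histogram shows a long right tail and the box plot shows the median sitting closer to the lower quartile than to the upper one.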

Discrete / Continuous

Discrete

There are two representations of a discrete distribution:

  • the Bayesian representation: a discrete distribution maps discrete values to probabilities such that the probabilities add up to 1.
  • the frequentist representation: an infinite list such that, as n gets larger, sampling n elements from the list and counting the frequency of each element approximates the Bayesian representation of the distribution.
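The link between the two representations can be illustrated with a fair six-sided die. This is only a sketch: the die, the sample sizes and the helper `relative_frequencies` are hypothetical names introduced here, not part of the original text.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# "Bayesian" representation: each face of a fair die mapped to its probability
faces <- 1:6
die_pmf <- rep(1 / 6, 6)

# "Frequentist" representation: draw n values and count relative frequencies
relative_frequencies <- function(n) {
  draws <- sample(faces, size = n, replace = TRUE)
  as.numeric(table(factor(draws, levels = faces))) / n
}

# Print the largest gap between frequencies and probabilities for growing n
for (n in c(100, 10000, 1000000)) {
  gap <- max(abs(relative_frequencies(n) - die_pmf))
  cat("n =", n, " largest gap =", gap, "\n")
}
```

The printed gap is expected to shrink as n grows, which is the sense in which the frequencies approximate the probabilities.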

Continuous

Standard continuous distributions include the Gaussian, beta, and uniform distributions. The binomial distribution, often listed alongside them, is discrete.

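Some pairs of distributions have convenient algebraic properties, described by the notion of conjugate priors: a prior is conjugate to a likelihood when the resulting posterior belongs to the same family as the prior. For example, a uniform prior combined with a binomial likelihood results in a beta posterior.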
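The uniform/binomial example can be checked numerically. The sketch below assumes hypothetical data (7 successes in 10 trials) and compares a grid approximation of the posterior with the closed form Beta(1 + successes, 1 + failures), which holds because the uniform prior on [0, 1] is the Beta(1, 1) distribution.

```r
# Hypothetical data: 7 successes observed in 10 trials
successes <- 7
trials <- 10

# Grid of candidate values for the success probability
theta <- seq(0.001, 0.999, length.out = 999)
step <- theta[2] - theta[1]

# Uniform prior on [0, 1] times binomial likelihood, normalised numerically
prior <- dunif(theta, min = 0, max = 1)
likelihood <- dbinom(successes, size = trials, prob = theta)
posterior_grid <- prior * likelihood
posterior_grid <- posterior_grid / sum(posterior_grid * step)

# Closed form given by conjugacy: Beta(1 + successes, 1 + failures)
posterior_beta <- dbeta(theta, shape1 = 1 + successes, shape2 = 1 + trials - successes)

# The two curves agree up to the numerical error of the grid
largest_difference <- max(abs(posterior_grid - posterior_beta))
```

The benefit of conjugacy is that the posterior is available directly from `dbeta`, without the numerical normalisation step.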

Function

A distribution can be specified by supplying:

Possible duplicate: Mathematics - Probability distribution function

Characteristics

  • Mode: for a discrete random variable, the value with highest probability (the location at which the probability mass function has its peak); for a continuous random variable, the location at which the probability density function has its peak.
  • Support: the smallest closed set whose complement has probability zero.
  • Head: the range of values where the pmf or pdf is relatively high.
  • Tail: the complement of the head within the support; the large set of values where the pmf or pdf is relatively low.
  • Expected value or mean: the weighted average of the possible values, using their probabilities as their weights; or the continuous analog thereof.
  • Median: the value such that the set of values less than the median has a probability of one-half.
  • Variance (dispersion, mean square): the second moment of the pmf or pdf about the mean; an important measure of the dispersion of the distribution.
  • Standard deviation: the square root of the variance, and hence another measure of dispersion.
  • Symmetry: a property of some distributions in which the portion of the distribution to the left of a specific value is a mirror image of the portion to its right.
  • Skewness: a measure of the extent to which a pmf or pdf “leans” to one side of its mean.
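Several of these characteristics can be computed directly from a pmf. The sketch below uses a hypothetical distribution (number of heads in 10 tosses of a coin with probability 0.3 of heads). Two conventions are assumptions of this sketch: the median is taken as the smallest value whose cumulative probability reaches one-half, and skewness is the standardised third moment about the mean.

```r
# Hypothetical discrete distribution: number of heads in 10 tosses of a biased coin
values <- 0:10
pmf <- dbinom(values, size = 10, prob = 0.3)

# Mode: the value with the highest probability
mode_value <- values[which.max(pmf)]

# Expected value: average of the values weighted by their probabilities
mean_value <- sum(values * pmf)

# Variance: second moment about the mean; standard deviation: its square root
variance <- sum((values - mean_value)^2 * pmf)
std_dev <- sqrt(variance)

# Median: smallest value whose cumulative probability reaches one-half
median_value <- values[which(cumsum(pmf) >= 0.5)[1]]

# Skewness: standardised third moment about the mean (positive = longer right tail)
skewness <- sum((values - mean_value)^3 * pmf) / std_dev^3
```

For a continuous distribution, the sums would be replaced by integrals over the pdf, for example with `integrate`.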

Type

Management

Comparison

A Q-Q (quantile-quantile) plot compares two distributions by plotting their quantiles against each other.

Example with ggplot (see current/stat_qq.html):

library(ggplot2)
# One Q-Q point series per presentation; columns are referenced by name inside aes()
ggplot(res_succes, aes(sample = TOTAL_TIME_SEC, colour = factor(PRESENTATION_NAME))) +
  geom_point(stat = "qq", size = 0.75)
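To show what such a plot compares, here is a minimal base R sketch that pairs the quantiles of two samples by hand. Both samples are hypothetical (normal draws with different mean and standard deviation), and the correlation at the end is only a rough numeric check added for illustration, not a standard test.

```r
set.seed(7)  # arbitrary seed, only for reproducibility

# Two hypothetical samples: same shape (normal) but different location and scale
sample_a <- rnorm(500, mean = 0, sd = 1)
sample_b <- rnorm(500, mean = 10, sd = 2)

# A Q-Q plot pairs the same quantiles of both samples
probs <- seq(0.01, 0.99, by = 0.01)
quantiles_a <- quantile(sample_a, probs)
quantiles_b <- quantile(sample_b, probs)

# Points close to a straight line suggest distributions of the same shape
plot(quantiles_a, quantiles_b,
     xlab = "Quantiles of sample A", ylab = "Quantiles of sample B")

# Correlation of the paired quantiles as a rough numeric check of linearity
quantile_correlation <- cor(quantiles_a, quantiles_b)
```

Base R also provides `qqplot(sample_a, sample_b)`, which draws this kind of plot directly.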

Visualization

Track

Monitoring Metrics - Distribution Summary

Documentation / Reference