Statistics

About

Statistics is a scientific discipline devoted to the study of data.

Statistics is the art of extracting information from data.

From Data to Information to Knowledge.

No learning.

There are three kinds of lies:

  • lies,
  • damned lies,
  • and statistics.

Facts are stubborn things, but statistics are more pliable.

Concept

Statisticians refer to the entire group that is being studied as a population.

Each member of the population is called a unit.

A statistician studying a population is interested in collecting information about different characteristics of its units. Those characteristics are called variables.

Most of the time, it is extremely difficult or very costly to collect all the information about a population. Because of this, it is common to use a smaller, representative group from the population called a sample.

In statistics, a number that describes the whole population (such as the actual number of units in it) is called a parameter.

The number of units in the sample, or any other number that describes the individuals in the sample (like their length, weight, or age), is called a statistic. In general, each statistic is an estimate of a parameter, whose value is not known exactly.

In general, the potential difference between the true parameter and the statistic obtained from using a sample is called sampling error.
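The relationship between parameter, statistic, and sampling error can be sketched with a small simulation. This is only an illustration under stated assumptions: the population of 10,000 weights, its mean of 50, and the sample size of 100 are all made up for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: the weights (kg) of 10,000 units.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Parameter: computed from every unit in the population.
parameter = statistics.mean(population)

# Statistic: the same quantity computed from a random sample of 100 units.
sample = random.sample(population, 100)
statistic = statistics.mean(sample)

# Sampling error: the gap between the statistic and the parameter.
sampling_error = statistic - parameter
print(f"parameter={parameter:.2f} statistic={statistic:.2f} error={sampling_error:.2f}")
```

In practice the parameter is unknown, which is why the sampling error can usually only be estimated, not computed directly as it is here.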

The sample could have been chosen in an area where a large number of tortoises tend to congregate (near a food or water source, perhaps). If this sample were used to estimate the number of tortoises in all locations, it could lead to a population estimate that is too high. This type of systematic error in sampling is called bias.
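The tortoise scenario can be sketched as follows. All numbers are hypothetical: 100 locations, 10 of them near water, with invented tortoise counts. The helper `estimate_total` is an illustrative name, not part of any library.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical survey area: 100 locations, 10 of them near a water source.
# Counts are made up: 15 to 25 tortoises near water, 0 to 4 elsewhere.
near_water = [random.randint(15, 25) for _ in range(10)]
elsewhere = [random.randint(0, 4) for _ in range(90)]
locations = near_water + elsewhere
true_total = sum(locations)

def estimate_total(sampled_counts, n_locations):
    """Scale the mean count per sampled location up to all locations."""
    return statistics.mean(sampled_counts) * n_locations

# Random sample: every location has the same chance of being picked.
random_estimate = estimate_total(random.sample(locations, 10), len(locations))

# Biased sample: only locations near water are visited.
biased_estimate = estimate_total(near_water, len(locations))

print(f"true={true_total} random={random_estimate:.0f} biased={biased_estimate:.0f}")
```

Because every sampled location in the biased case holds at least 15 tortoises, the biased estimate is always far above the true total, no matter how many times the survey is repeated. That is what distinguishes bias from ordinary sampling error.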

Type

Descriptive

Descriptive statistics: procedures used to summarize, organize, and simplify data

E.g., Median – describes data but can’t be generalized beyond that
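As a minimal sketch, the median of a small, made-up list of ages can be computed with Python's standard `statistics` module. The data values are hypothetical.

```python
import statistics

# Hypothetical sample: the ages of seven units.
ages = [3, 5, 8, 12, 12, 40, 41]

# The median is the middle value of the sorted data.
median_age = statistics.median(ages)
mean_age = statistics.mean(ages)

print(f"median={median_age} mean={mean_age:.2f}")
```

Both numbers summarize these seven ages only. On their own, they say nothing about units outside the data.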

Inferential

Inferential statistics: procedures that allow for generalizations about population parameters based on sample statistics

E.g., t-test – enables inferences about population beyond our data
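The text names the t-test without saying which variant. As a sketch, assuming a one-sample t-test, the test statistic can be computed by hand from the sample mean, the sample standard deviation, and the sample size. The measurements and the hypothesized mean `mu0` are made up for the example.

```python
import math
import statistics

# Hypothetical sample of eight measurements.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.5]
mu0 = 5.0  # hypothesized population mean (null hypothesis), an assumption

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# One-sample t statistic: distance of the sample mean from mu0,
# measured in standard errors.
standard_error = sample_sd / math.sqrt(n)
t_statistic = (sample_mean - mu0) / standard_error
degrees_of_freedom = n - 1

print(f"t={t_statistic:.3f} df={degrees_of_freedom}")
```

The t statistic is then compared with a critical value from the t distribution with n - 1 degrees of freedom (2.365 for a two-sided test at the 5% level with 7 degrees of freedom) to decide whether the data are compatible with a population mean of `mu0`.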

Parametric

Statistics - (Non) Parametrics (method|statistics)

Method

Approach

Approach to Statistics

Frequentist

P(D|H) 

Probability of seeing this data, given the (null) hypothesis

Bayesian

P(H|D)

Probability of a given hypothesis, given this data
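The two quantities are linked by Bayes' theorem: P(H|D) = P(D|H) × P(H) / P(D). A minimal numeric sketch follows. The prior and the two likelihoods are invented for illustration only.

```python
# Hypothetical inputs (assumptions, not taken from any real study).
p_h = 0.01              # prior P(H): hypothesis is true before seeing data
p_d_given_h = 0.95      # likelihood P(D|H)
p_d_given_not_h = 0.05  # P(D|not H)

# Total probability of the data, P(D), over both cases.
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)

# Bayes' theorem: posterior P(H|D).
p_h_given_d = p_d_given_h * p_h / p_d

print(f"P(D|H)={p_d_given_h:.2f}  P(H|D)={p_h_given_d:.3f}")
```

With these made-up numbers, P(D|H) is 0.95 while P(H|D) is only about 0.16, which shows that the two conditional probabilities are not interchangeable.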

Data Analysis Techniques

Data analysis techniques such as:

Type of study

Statisticians and researchers use two main techniques to form important conclusions about the relationships between variables.

  • An observational study is when a researcher observes the subjects in the real world without manipulating them. A longitudinal study is a long-term observational study in which the same group of subjects is observed over a very long period of time.
  • An experiment is an effort to establish cause-and-effect relationships where the researcher imposes a treatment on a group of subjects.

Documentation / Reference
