Statistics - Sampling Error

1 - About

The sampling error is the inaccuracy that results from estimating using a sample, rather than the entire population.

More precisely, the sampling error is the difference between a sample statistic (such as the sample mean) and the corresponding population parameter.

Whenever a sample is used instead of the entire population, the results are merely estimates and therefore have some chance of being incorrect. This is called sampling error.

Sampling error implies that statistics will vary from one study to the next.

Sampling error is due to chance.

3 - Example

  • Even a histogram of randomly generated data does not match its theoretical distribution perfectly; there is some fluctuation due to sampling error.

  • If you draw several samples from the same population, their statistics fluctuate a little purely by chance. That fluctuation is sampling error. For example, three samples might give:
    • a mean of 12.03
    • a mean of 12.9
    • a mean of 14.13
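
The fluctuation described above can be reproduced with a small simulation. This is only a sketch: the population (10,000 values drawn from a normal distribution with mean 13 and standard deviation 3), the sample size of 10, and the helper name `sample_means` are all assumptions chosen for illustration, not values from the original example.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples (without replacement within a sample)
    and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population: the parameters below are illustrative assumptions.
population_rng = random.Random(42)
population = [population_rng.gauss(13, 3) for _ in range(10_000)]
population_mean = statistics.mean(population)

means = sample_means(population, sample_size=10, n_samples=3)
for m in means:
    # The sampling error of each sample is its mean minus the population mean.
    print(f"sample mean = {m:.2f}, sampling error = {m - population_mean:+.2f}")
```

Each run of `sample_means` with a different seed gives slightly different means, even though the population never changes.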

4 - Calculation

Sampling error depends on:

  • the size of the sample, relative to the size of the population. As sample size increases, sampling error decreases.
  • the variance in the population. As variance increases, sampling error increases.

Sampling error is estimated by calculating the standard error.
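
For the mean, the usual estimate of the standard error is the sample standard deviation divided by the square root of the sample size. A minimal sketch follows; the function name `standard_error` and the five data values are made up for illustration.

```python
import math
import statistics


def standard_error(sample):
    """Estimated standard error of the mean: s / sqrt(n),
    where s is the sample standard deviation (n - 1 denominator)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(sample) / math.sqrt(n)


# Hypothetical sample, chosen only to keep the arithmetic easy to follow.
sample = [12, 15, 11, 14, 13]
se = standard_error(sample)
print(f"mean = {statistics.mean(sample)}, standard error = {se:.4f}")
```

Both factors listed above appear in the formula: a larger spread increases the numerator, and a larger sample increases the denominator.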

5 - Sample Size

Sampling error, and therefore standard error, is largely determined by sample size.

Small samples drawn from large populations have a large degree of sampling error.

The sampling distribution when the sample size is only 10 (the first distribution, in the upper left corner) is much wider than when the sample is really large (such as 1000, for the last distribution, in the lower corner).

That's because standard error is the standard deviation of this sampling distribution.

As the samples get larger, the sample means squeeze more tightly around the population mean and the standard error becomes very small.
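
The narrowing of the sampling distribution can be checked numerically. The sketch below assumes a standard normal population (mean 0, standard deviation 1) and 500 repeated samples per sample size; these choices and the function name `empirical_standard_error` are illustrative assumptions. It measures the standard deviation of the sample means and compares it with the theoretical value, sigma / sqrt(n).

```python
import math
import random
import statistics


def empirical_standard_error(sample_size, n_samples=500, mu=0.0, sigma=1.0, seed=0):
    """Standard deviation of the means of n_samples simulated samples,
    i.e. an empirical estimate of the standard error of the mean."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]
    return statistics.stdev(means)


results = {n: empirical_standard_error(n) for n in (10, 100, 1000)}
for n, se in results.items():
    # With sigma = 1, the theoretical standard error is 1 / sqrt(n).
    print(f"n = {n:4d}  empirical SE = {se:.4f}  theoretical SE = {1 / math.sqrt(n):.4f}")
```

Multiplying the sample size by 100 divides the standard error by about 10, which is why the distribution for 1000 observations is so much narrower than the one for 10.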

This points to the problem of NHST (null hypothesis significance testing) being biased by sample size.

Sample size drives and determines sampling error, and so influences the t-value and the p-value.
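
To see the effect on a test statistic, the sketch below holds the observed difference and the standard deviation fixed and changes only the sample size. The numbers (a sample mean of 12.5 against a null mean of 12.0, with a standard deviation of 3.0) and both function names are hypothetical. The p-value uses a normal approximation to stay within the standard library, so for small samples it understates the p-value that a t distribution would give.

```python
import math


def one_sample_t(sample_mean, null_mean, sample_sd, n):
    """One-sample t statistic: observed difference divided by the standard error."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error


def approx_two_sided_p(t):
    """Two-sided p-value from the standard normal distribution.
    This is an approximation: the exact test uses a t distribution with n - 1 df."""
    return math.erfc(abs(t) / math.sqrt(2))


# Same effect and same spread every time; only the sample size changes.
for n in (10, 100, 1000):
    t = one_sample_t(sample_mean=12.5, null_mean=12.0, sample_sd=3.0, n=n)
    print(f"n = {n:4d}  t = {t:.2f}  approximate p = {approx_two_sided_p(t):.4f}")
```

The same half-point difference moves from unremarkable to highly significant as the sample grows, purely because the standard error in the denominator shrinks.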

data_mining/sampling_error.txt · Last modified: 2017/09/13 16:04 by gerardnico