Statistics - Assumptions underlying correlation and regression analysis (Never trust summary statistics alone)
1 - About
The magnitude of a correlation depends upon many factors, including:
2 - Articles Related
3 - Anscombe's quartet
The scatter plots below have nearly the same means, variances and correlation coefficient (about 0.82), as well as the same fitted regression line:
<MATH> Y = 3 + 0.5 X </MATH>
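The shared statistics can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the data values are the published Anscombe (1973) quartet, while the names `QUARTET` and `summarize` are illustrative and not part of the original article.

```python
import numpy as np

# Anscombe's quartet. Sets I to III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return the mean of x and y, Pearson's r and the least-squares line."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # polyfit with degree 1 returns the coefficients as [slope, intercept].
    slope, intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return {
        "mean_x": float(x.mean()),
        "mean_y": float(y.mean()),
        "r": float(r),
        "slope": float(slope),
        "intercept": float(intercept),
    }


SUMMARIES = {name: summarize(x, y) for name, (x, y) in QUARTET.items()}

for name, s in SUMMARIES.items():
    print(f"{name:>3}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} "
          f"r={s['r']:.3f} line: Y = {s['intercept']:.2f} + {s['slope']:.2f} X")
```

The four printed lines should agree to about two decimal places, which is exactly why the summary statistics alone cannot tell the four datasets apart.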
Only the first one, on the upper left, satisfies the assumptions underlying a correlation and regression analysis.
4 - Datasaurus: Never trust summary statistics alone; always visualize your data
The Datasaurus Dozen. While different in appearance, each dataset has the same summary statistics (mean, standard deviation, and Pearson's correlation) to two decimal places.
5 - How to
5.1 - test the assumptions in a regression analysis?
If the assumptions hold, a plot of the residuals against X must show:
- no relationship between X and the residuals. They must be independent, so their correlation coefficient must be zero.
- residuals scattered randomly around zero, some above and some below, with a roughly constant spread across X. This indicates homoscedasticity.
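The checks above can be sketched in code. This is only an illustration, assuming NumPy and reusing sets I and II of Anscombe's quartet; the function name `residual_checks` and the three indicators it returns are hypothetical choices, not a standard test. One caveat worth keeping in mind: for a least-squares fit with an intercept, the linear correlation between X and the residuals is zero by construction, so the sketch also looks at curvature (correlation of the residuals with the squared, centered X) and at a trend in spread (correlation of the absolute residuals with X).

```python
import numpy as np

# Anscombe's sets I and II share the same x values.
X = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
Y_LINEAR = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]  # set I
Y_CURVED = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]   # set II


def residual_checks(x, y):
    """Fit Y = a + b X by least squares and return simple residual indicators."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (intercept + slope * x)
    centered = x - x.mean()
    return {
        # Zero (up to rounding) for any least-squares fit with an intercept.
        "corr_x_resid": float(np.corrcoef(x, residuals)[0, 1]),
        # Far from zero when the residuals follow a curve.
        "corr_x2_resid": float(np.corrcoef(centered ** 2, residuals)[0, 1]),
        # Far from zero when the spread of the residuals grows or shrinks with X.
        "corr_x_abs_resid": float(np.corrcoef(x, np.abs(residuals))[0, 1]),
        # Number of residuals on each side of zero.
        "n_above": int(np.sum(residuals > 0)),
        "n_below": int(np.sum(residuals < 0)),
    }


checks_linear = residual_checks(X, Y_LINEAR)
checks_curved = residual_checks(X, Y_CURVED)

for label, checks in (("set I", checks_linear), ("set II", checks_curved)):
    print(label, {key: round(value, 3) for key, value in checks.items()})
```

Both sets have residuals on either side of zero and a zero linear correlation with X, yet the curvature indicator for set II is close to -1. The numbers are only a hint, so plotting the residuals against X remains the more reliable check.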