Statistical Tests

Statistical tests are conducted to accept or reject hypothesis. They can be divided into tests catering to specific distributions of data.

  • Parametric Test : used for normally distributed data (assess group means)
  • Non-Parametric Test : used for skewed data (assess group medians)

The tests are also classified according to their tails.

  • One Tailed : uni-directional (typing speed increases with more typing)
  • Two tailed: bi-directional (typing speed can increase or decrease with more typing)

โš  P-value is the probability of the sample means coming up equal to or even further away from the hypothesized population mean.

z-Test (sample size > 30)

Statistical calculation used to compare sample mean to population mean.
Most useful when standard deviation and the sample size is known.

The z score tells how far, in standard deviations, a data-point is from the mean.

t-Test (all sample size also referred as sample size < 30 )

To determine if there is a statistically significant difference between two sample group means.
Used when population standard deviation is unknown.

The t-score is a ratio between the difference between the two groups and the difference within the groups

(A large t-score tells you that the groups are different. A small t-score tells you that the groups are similar)

- Paired samples t-Test is used when there is a before - after ideology (matched)
- Independent samples t-Test (Equal Variances/Unequal Variances) is also known as a regular t-Test

ANOVA

It is basically an extension to t test used when more than 2 samples are to be compared.
It is used to determine whether there are any statistically significant differences between the means of three or more independent groups.
ANOVA uses categorical independent variables and a continuous dependent variable

H0 : all the means of the groups are NOT statistically different (All samples are from the same population)
HA : at least two group means are statistically different from each other. (At least one sample comes from a different population)
F = (Between Group Variability) รท (Within Group Variability)

Chi-square

To test dependency or independency of categorical variables. Are 2 categorical variables independent
Anuj says it checks the difference in means + variance
Observed vs expected

F test

It’s a significance of variance test used for model evaluation. It is a one way ANOVA.
It is basically the ratio of 2 chi square tests.

ย 


| When to use what test | Cheatsheet Python tests |