Preprocessing

Clean and merge dataset

Bonferroni T-test

Welch T-test

Welch's t-test is run first; the Bonferroni correction is then applied to the resulting p-values.

It is used to assess whether the difference between the means of two populations is statistically significant.

Welch's t-test is a two-sample location test of the hypothesis that two populations have equal means. It is an adaptation of Student's t-test and is more reliable when the two samples have unequal variances and/or unequal sample sizes, as is the case for our data.

These tests are often referred to as "unpaired" or "independent samples" t-tests, as they are typically applied when the statistical units underlying the two samples being compared are non-overlapping.

$$ t = \frac{\bar{X_1} - \bar{X_2}}{\sqrt{\frac{s^2_1}{N_1} + \frac{s^2_2}{N_2}}} $$

Unlike Student's t-test, the denominator is not based on a pooled variance estimate.
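As a minimal sketch of the statistic above, the following computes $t$ from two samples using only the Python standard library. The function name `welch_t` and the sample values are illustrative assumptions, not part of the original analysis; $s^2_1$ and $s^2_2$ are taken to be the unbiased sample variances.

```python
import math
import statistics


def welch_t(sample_1, sample_2):
    """Welch's t statistic for two independent samples (hypothetical helper)."""
    n1, n2 = len(sample_1), len(sample_2)
    mean_1, mean_2 = statistics.mean(sample_1), statistics.mean(sample_2)
    # statistics.variance returns the unbiased (n - 1) sample variance
    var_1, var_2 = statistics.variance(sample_1), statistics.variance(sample_2)
    # Each variance is divided by its own sample size: no pooled estimate
    standard_error = math.sqrt(var_1 / n1 + var_2 / n2)
    return (mean_1 - mean_2) / standard_error


# Made-up samples with unequal sizes and unequal variances
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10, 12]
t_statistic = welch_t(group_a, group_b)
print(t_statistic)
```

If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs Welch's test and also returns the p-value, which is what the Bonferroni step needs.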

Bonferroni

Let $H_1, \dots, H_m$ be a family of hypotheses and $p_1, \dots, p_m$ their corresponding $p$-values. Let $m$ be the total number of null hypotheses and $m_0$ the number of true null hypotheses. The family-wise error rate (FWER) is the probability of rejecting at least one true $H_i$, that is, of making at least one type I error (false positive).
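To see why the FWER matters, consider a small worked example. It assumes, purely for illustration, that all $m$ null hypotheses are true and that the tests are independent, in which case testing each at level $\alpha$ without any correction gives $FWER = 1 - (1 - \alpha)^m$. The values of `alpha` and `m` below are made up.

```python
# Illustrative assumption: m independent tests, all null hypotheses true,
# each tested at level alpha with no multiple-testing correction.
alpha = 0.05
m = 10

# Probability of at least one false positive among the m tests
fwer_uncorrected = 1 - (1 - alpha) ** m
print(f"Uncorrected FWER for m={m}: {fwer_uncorrected:.3f}")
```

Under these assumptions the chance of at least one false positive is roughly 40%, far above the nominal 5%.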

The Bonferroni correction rejects the null hypothesis for each

$$ p_i \le \frac{\alpha}{m} $$

thereby controlling the family-wise error rate at $FWER \le \alpha$.
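The rejection rule above can be sketched in a few lines of Python. The function name `bonferroni_reject` and the example p-values are hypothetical and are not taken from the original analysis.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True if it is rejected."""
    m = len(p_values)
    if m == 0:
        return []
    # Each p-value is compared against the corrected threshold alpha / m
    threshold = alpha / m
    return [p <= threshold for p in p_values]


# Made-up p-values, e.g. one per Welch t-test
example_p_values = [0.001, 0.02, 0.04, 0.2]
decisions = bonferroni_reject(example_p_values, alpha=0.05)
print(decisions)
```

With four hypotheses and $\alpha = 0.05$, the corrected threshold is $0.0125$, so only the first p-value leads to a rejection, even though three of the four fall below the uncorrected $0.05$.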