Bonferroni Correction

From GM-RKB

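A Bonferroni Correction is a post-hoc multiple comparison procedure that controls the familywise error rate by testing each of [math]\displaystyle{ m }[/math] hypotheses at significance level [math]\displaystyle{ \alpha/m }[/math] (equivalently, by constructing each of [math]\displaystyle{ m }[/math] simultaneous confidence intervals at confidence level [math]\displaystyle{ 1 - \alpha/m }[/math]).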



References

2016

[...] The Bonferroni correction is based on the idea that if an experimenter is testing [math]\displaystyle{ m }[/math] hypotheses, then one way of maintaining the familywise error rate (FWER) is to test each individual hypothesis at a statistical significance level of [math]\displaystyle{ 1/m }[/math] times the desired maximum overall level.
If the desired significance level for the whole family of tests is [math]\displaystyle{ \alpha }[/math], then the Bonferroni correction would test each individual hypothesis at a significance level of [math]\displaystyle{ \alpha/m }[/math]. For example, if a trial is testing [math]\displaystyle{ m = 8 }[/math] hypotheses with a desired [math]\displaystyle{ \alpha = 0.05 }[/math], then the Bonferroni correction would test each individual hypothesis at [math]\displaystyle{ \alpha/m = 0.05/8 = 0.00625 }[/math].
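The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `bonferroni_reject` and the p-values are hypothetical and chosen only to reproduce the [math]\displaystyle{ m = 8 }[/math], [math]\displaystyle{ \alpha = 0.05 }[/math] example.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return the per-test threshold alpha/m and one reject flag per p-value."""
    m = len(p_values)
    threshold = alpha / m  # each hypothesis is tested at alpha/m
    return threshold, [p <= threshold for p in p_values]


# Hypothetical p-values for m = 8 tests (illustrative only, not real data).
p_values = [0.001, 0.004, 0.006, 0.01, 0.03, 0.04, 0.20, 0.70]
threshold, rejected = bonferroni_reject(p_values, alpha=0.05)
print(threshold)
print(rejected)
```

With these inputs only the first three hypotheses fall at or below the per-test threshold of 0.05/8, even though six of the eight p-values are below the uncorrected level of 0.05.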
[...] Let [math]\displaystyle{ H_{1},...,H_{m} }[/math] be a family of hypotheses and [math]\displaystyle{ p_{1},...,p_{m} }[/math] their corresponding p-values. The familywise error rate (FWER) is the probability of rejecting at least one true [math]\displaystyle{ H_{i} }[/math]; that is, of making at least one type I error. The Bonferroni correction states that rejecting the null hypothesis for all [math]\displaystyle{ p_{i}\leq\frac{\alpha}{m} }[/math] controls the FWER. The proof follows from Boole's inequality, where [math]\displaystyle{ m_{0} }[/math] is the number of true null hypotheses, taken without loss of generality to be [math]\displaystyle{ H_{1},...,H_{m_0} }[/math]:
[math]\displaystyle{ \mathrm{FWER} = P\left\{ \bigcup_{i=1}^{m_0}\left(p_{i}\leq\frac{\alpha}{m}\right)\right\} \leq\sum_{i=1}^{m_0}\left\{P\left(p_{i}\leq\frac{\alpha}{m}\right)\right\}\leq m_{0}\frac{\alpha}{m}\leq m\frac{\alpha}{m}=\alpha }[/math]
This control does not require any assumptions about dependence among the p-values.
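As a rough numerical check of this bound, the sketch below evaluates the FWER in two extreme dependence settings and adds a small seeded simulation. The function names are hypothetical, and the true-null p-values are assumed to be Uniform(0, 1). That assumption is a common idealisation, not something the bound itself requires.

```python
import random


def fwer_independent(m, m0, alpha):
    """Exact FWER when the m0 true-null p-values are independent Uniform(0, 1)."""
    return 1.0 - (1.0 - alpha / m) ** m0


def fwer_identical(m, m0, alpha):
    """Exact FWER when all m0 true nulls share a single Uniform(0, 1) p-value."""
    return alpha / m if m0 > 0 else 0.0


def fwer_monte_carlo(m, m0, alpha, trials=20000, seed=0):
    """Simulated FWER for independent Uniform(0, 1) true-null p-values."""
    rng = random.Random(seed)
    threshold = alpha / m
    hits = 0
    for _ in range(trials):
        # A familywise error occurs if any true null is rejected.
        if any(rng.random() <= threshold for _ in range(m0)):
            hits += 1
    return hits / trials


m, alpha = 8, 0.05
for m0 in (1, 4, 8):
    print(m0, fwer_independent(m, m0, alpha), fwer_identical(m, m0, alpha))
print(fwer_monte_carlo(m, m, alpha))
```

Both closed-form values stay at or below [math]\displaystyle{ \alpha }[/math] for every [math]\displaystyle{ m_0 \leq m }[/math]. The perfectly dependent case shows how conservative the correction can become when the tests are strongly correlated.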

1959

  • (Dunn, 1959) ⇒ Dunn, O. J. (1959). Estimation of the medians for dependent variables. The Annals of Mathematical Statistics, 192-197. doi:10.1214/aoms/1177706374
    • The problem considered in this paper is that of using a non-parametric method to estimate by a confidence set the unknown medians of two dependent variables. In various types of research, it is convenient to consider a sample of n individuals and to take measurements on the same n individuals at two different times or at two different levels of treatment. The two measurements on the same individual cannot be assumed to be independent, so that it is appropriate to consider the 2n measurements as a sample of size n from a bivariate distribution. Let the two variables [math]\displaystyle{ y_1,\;y_2 }[/math] with medians [math]\displaystyle{ \nu_1,\;\nu_2 }[/math] have the c.d.f. (cumulative distribution function) [math]\displaystyle{ F(y_1,y_2) }[/math]. By a set of simultaneous confidence intervals of bounded confidence level [math]\displaystyle{ 1 - \alpha }[/math] for [math]\displaystyle{ \nu_1,\;\nu_2 }[/math] is meant a set of four functions of the sample values, say [math]\displaystyle{ g_1,g_2, h_1,h_2 }[/math] such that
[math]\displaystyle{ P(g_1 \lt \nu_1 \lt h_1, g_2 \lt \nu_2 \lt h_2) \geq 1 - \alpha }[/math]
The probability relationship must hold for all underlying distributions in a specified set of distributions. In this paper the specified set will consist of all bivariate distributions whose marginals have continuous c.d.f.'s. The method used in this paper to obtain confidence intervals uses order statistics and requires only the assumption that the marginal distributions be continuous.
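
One way to obtain intervals of this kind is sketched below. It is a Bonferroni-style construction consistent with the abstract, not necessarily the exact procedure of the paper. Each marginal median gets an order-statistic interval at level [math]\displaystyle{ 1 - \alpha/2 }[/math], so the two intervals hold jointly with probability at least [math]\displaystyle{ 1 - \alpha }[/math] whatever the dependence between [math]\displaystyle{ y_1 }[/math] and [math]\displaystyle{ y_2 }[/math]. The function names and the sample data are hypothetical, and continuous marginals (no ties at the median) are assumed.

```python
from math import comb


def median_ci_rank(n, level):
    """Largest k such that (X_(k), X_(n-k+1)) covers the median with prob >= level.

    For a continuous distribution the coverage of (X_(k), X_(n-k+1)) is
    1 - 2 * P(Binomial(n, 1/2) <= k - 1). Returns None if even k = 1 fails.
    """
    best = None
    tail = 0.0  # running value of P(Binomial(n, 1/2) <= k - 1)
    for k in range(1, n // 2 + 1):
        tail += comb(n, k - 1) / 2 ** n
        if 1.0 - 2.0 * tail >= level:
            best = k
        else:
            break
    return best


def simultaneous_median_cis(pairs, alpha=0.05):
    """Joint intervals for both marginal medians, each at level 1 - alpha/2."""
    n = len(pairs)
    k = median_ci_rank(n, 1.0 - alpha / 2.0)  # Bonferroni split over 2 intervals
    if k is None:
        raise ValueError("sample too small for the requested confidence level")
    y1 = sorted(p[0] for p in pairs)
    y2 = sorted(p[1] for p in pairs)
    # Order statistics X_(k) and X_(n-k+1) are at 0-based indices k-1 and n-k.
    return (y1[k - 1], y1[n - k]), (y2[k - 1], y2[n - k])


# Hypothetical paired measurements on n = 20 individuals (illustrative only).
pairs = [(i, 2 * i) for i in range(1, 21)]
print(median_ci_rank(20, 0.975))
print(simultaneous_median_cis(pairs, alpha=0.05))
```

For [math]\displaystyle{ n = 20 }[/math] and [math]\displaystyle{ \alpha = 0.05 }[/math] the rank returned is 5, so each interval runs from the 5th to the 16th order statistic of its marginal sample. The split of [math]\displaystyle{ \alpha }[/math] into two equal halves is a design choice of this sketch; any split summing to [math]\displaystyle{ \alpha }[/math] gives the same joint guarantee.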