Statistical Significance Measure


A Statistical Significance Measure is a measure of how unlikely it is that an observed result in an empirical distribution arose by chance under a null hypothesis.



  • (Wikipedia, 2011) ⇒
    • In statistics, a result is called statistically significant if it is unlikely to have occurred by chance. The phrase test of significance was coined by Ronald Fisher.[1]

      As used in statistics, significant does not mean important or meaningful, as it does in everyday speech. For example, a study that included tens of thousands of participants might be able to say with great confidence that residents of one city were more intelligent than residents of another city by 1/20 of an IQ point. This result would be statistically significant, but the difference is small enough to be utterly unimportant. Conversely, not everything which is meaningful will be statistically significant. Research analysts who focus solely on significant results may miss important response patterns which individually may fall under the threshold set for tests of significance. Many researchers urge that tests of significance should always be accompanied by effect-size statistics, which approximate the size and thus the practical importance of the difference.
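The distinction above between statistical significance and practical importance can be illustrated numerically. The sketch below (the sample size and the simple two-sample z-test are illustrative assumptions, not from the source) shows that a 1/20-of-an-IQ-point difference (with the conventional IQ standard deviation of 15) becomes highly significant with enough participants, even though the effect size stays negligible:

```python
import math

def two_sample_z_p_value(diff, sd, n):
    """Two-sided p-value for a difference in means `diff` between two
    groups of size n each, assuming a known common standard deviation
    `sd` (a simple z-test sketch, not a full t-test)."""
    se = sd * math.sqrt(2.0 / n)          # standard error of the difference
    z = diff / se
    return math.erfc(abs(z) / math.sqrt(2))

# A 0.05-IQ-point difference is meaningless in practice, yet with a
# (hypothetical) ten million participants per city it is highly significant:
p_large = two_sample_z_p_value(0.05, 15, 10_000_000)

# The effect size (Cohen's d) stays tiny no matter how large n grows:
d = 0.05 / 15
```

Reporting `d` alongside the p-value, as the text recommends, makes the practical irrelevance of the difference immediately visible.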

      The amount of evidence required to accept that an event is unlikely to have arisen by chance is known as the significance level or critical p-value: in traditional Fisherian statistical hypothesis testing, the p-value is the probability of observing data at least as extreme as that observed, given that the null hypothesis is true. If the obtained p-value is small, then either the null hypothesis is false or an unusual event has occurred. It is worth stressing that p-values do not have any repeat sampling interpretation.
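The Fisherian p-value described above can be computed exactly for a simple case. The following sketch (the fair-coin example is an illustrative assumption, not from the source) sums the probabilities of all outcomes at least as extreme as, i.e. no more probable than, the observed one:

```python
import math

def binomial_p_value(k, n, p0=0.5):
    """Two-sided exact p-value for observing k heads in n flips under
    the null hypothesis that the success probability is p0.
    Sums the probabilities of every outcome whose probability is no
    greater than that of the observed outcome."""
    def pmf(i):
        return math.comb(n, i) * p0**i * (1 - p0)**(n - i)
    p_obs = pmf(k)
    # Small tolerance guards against floating-point ties.
    return sum(pmf(i) for i in range(n + 1) if pmf(i) <= p_obs + 1e-12)

# Example: 60 heads in 100 flips of a putatively fair coin.
p = binomial_p_value(60, 100)   # ~0.057: not significant at the 0.05 level
```

A small p-value would then mean, per the text, either that the null hypothesis is false or that an unusual event has occurred.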

      An alternative statistical hypothesis testing framework is the Neyman–Pearson frequentist school which requires both a null and an alternative hypothesis to be defined and investigates the repeat sampling properties of the procedure, i.e. the probability that a decision to reject the null hypothesis will be made when it is in fact true and should not have been rejected (this is called a "false positive" or Type I error) and the probability that a decision will be made to accept the null hypothesis when it is in fact false (Type II error).
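The Neyman–Pearson error rates described above can be estimated by simulation. This sketch uses an illustrative decision rule of my own choosing (reject the null of a fair coin when the head count in 100 flips falls outside [40, 60]); the specific rule and alternative hypothesis (p = 0.65) are assumptions, not from the source:

```python
import random

random.seed(0)

def rejection_rate(p_true, n_flips=100, lo=40, hi=60, trials=10_000):
    """Fraction of simulated experiments in which the decision rule
    rejects the null hypothesis (heads outside [lo, hi])."""
    rejections = 0
    for _ in range(trials):
        heads = sum(random.random() < p_true for _ in range(n_flips))
        if heads < lo or heads > hi:
            rejections += 1
    return rejections / trials

# Type I error rate: probability of rejecting when the null (p = 0.5) is true.
alpha = rejection_rate(0.5)

# Type II error rate: probability of accepting the null when the
# alternative (p = 0.65) is true; its complement is the test's power.
beta = 1 - rejection_rate(0.65)
```

These are exactly the repeat-sampling properties the Neyman–Pearson framework investigates: the procedure's long-run false-positive and false-negative rates, rather than an evidential measure for a single data set.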

      More typically, the significance level of a test is chosen so that the probability of mistakenly rejecting the null hypothesis is no more than the stated probability. This allows the test to be performed using non-significant statistics, which has the advantage of reducing the computational burden while wasting some information.

  1. "Critical tests of this kind may be called tests of significance, and when such tests are available we may discover whether a second sample is or is not significantly different from the first." — R. A. Fisher (1925). Statistical Methods for Research Workers. Edinburgh: Oliver and Boyd, p. 43.