Wilcoxon Signed-Rank Test


A Wilcoxon Signed-Rank Test is a non-parametric test that serves as an alternative to the one-sample t-test and the matched-pair t-test.



  • (ITL-SED, 2017) ⇒ Retrieved 2017-01-08 from NIST (National Institute of Standards and Technology, US) website http://www.itl.nist.gov/div898//software/dataplot/refman1/auxillar/signrank.htm
    • The t-test is the standard test for testing that the population means of two paired samples are equal. If the populations are non-normal, particularly for small samples, then the t-test may not be valid. The signed rank test is an alternative that can be applied when distributional assumptions are suspect. However, it is not as powerful as the t-test when the distributional assumptions are in fact valid.
The signed rank test is also commonly called the Wilcoxon signed rank test or simply the Wilcoxon test.
To form the signed rank test, compute [math]d_i = X_i - Y_i[/math] where [math]X[/math] and [math]Y[/math] are the two samples. Rank the [math]d_i[/math] without regard to sign. Tied values are not included in the Wilcoxon test. After ranking, restore the sign (plus or minus) to the ranks. Then compute W+ and W- as the sums of the positive and negative ranks respectively. If the two population means are in fact equal, then the sums of the ranks should also be nearly equal. If the difference between the sum of the ranks is too great, we reject the null hypothesis that the population means are equal.
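The ranking procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `signed_rank_sums` is hypothetical, "tied values are not included" is read here as dropping pairs whose difference is zero, and absolute differences that tie with each other receive the average of the ranks they span (a common convention that the quoted text does not spell out).

```python
def signed_rank_sums(x, y):
    """Return (W_plus, W_minus) for paired samples x and y.

    Assumptions of this sketch: zero differences are dropped, and tied
    absolute differences share the average of the ranks they occupy.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    # d_i = X_i - Y_i, excluding pairs with no difference.
    d = [a - b for a, b in zip(x, y)]
    d = [v for v in d if v != 0]

    # Rank the differences without regard to sign.
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # Restore the signs and sum the positive and negative ranks separately.
    w_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    w_minus = sum(r for r, v in zip(ranks, d) if v < 0)
    return w_plus, w_minus
```

For example, with X = (10, 12, 9, 15, 11) and Y = (8, 13, 9, 11, 8) the differences are 2, -1, 0, 4, 3; the zero is dropped, the remaining absolute values get ranks 2, 1, 4, 3, and the function returns W+ = 9 and W- = 1.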
Significance levels are based on the fact that if there is no difference in the population means, then there are [math]2^n[/math] equally likely ways for the n ranks to receive signs.
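To make the counting argument concrete, the sketch below enumerates all [math]2^n[/math] sign assignments and reports the fraction for which MIN(W-, W+) is at most an observed value. The function name `exact_two_sided_p` is hypothetical, and the sketch assumes n non-zero differences with no ties, so that the ranks are exactly 1, ..., n. Brute-force enumeration is only practical for small n.

```python
from itertools import product


def exact_two_sided_p(w_obs, n):
    """Fraction of the 2**n sign assignments with min(W+, W-) <= w_obs.

    Assumes the ranks are the integers 1..n (no zero or tied differences).
    """
    total = n * (n + 1) // 2  # W+ + W- is always the sum of all ranks
    count = 0
    for signs in product((0, 1), repeat=n):
        # A 1 marks a rank that receives a plus sign.
        w_plus = sum(rank for rank, s in zip(range(1, n + 1), signs) if s)
        if min(w_plus, total - w_plus) <= w_obs:
            count += 1
    return count / 2 ** n
```

With n = 4 and an observed statistic of 1, only four of the sixteen assignments (W+ equal to 0, 1, 9 or 10) are at least as extreme, so the function returns 0.25.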
More formally, the hypothesis test is defined as follows.
[math]H_0:\quad \mu_1=\mu_2[/math]
[math]H_a:\quad \mu_1 \ne \mu_2[/math]
Test Statistic: W=MIN(W-,W+) where the computation of W- and W+ is discussed above.
Significance Level: [math]\alpha[/math] (typically set to .05). Due to the discreteness of the ranks, the actual significance level will not in most cases be exact.
Critical Region: For small samples (N ≤ 30), the critical regions have been tabulated. For N > 30, the test statistic W approaches a normal distribution with a mean of
[math]\mu_w = \frac{N(N+1)}{4}[/math]
and a standard deviation of
[math]\sigma_w = \sqrt{\frac{N(N+1)(2N+1)}{24}}[/math]
The critical regions are thus based on the normal percent point function. That is, for a 2-sided test,
[math]\mu_w-\sigma_w\phi^{-1}(\alpha/2) \lt W \lt \mu_w+\sigma_w\phi^{-1}(\alpha/2)[/math]
where [math]\mu_w[/math] and [math]\sigma_w[/math] are the mean and standard deviation of W as described above and [math]\phi^{-1}[/math] is the normal percent point function.
Conclusion: Reject the null hypothesis if the test statistic is in the critical region.
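The large-sample procedure can be sketched as follows, using the standard moments of the signed rank statistic, N(N+1)/4 for the mean and the square root of N(N+1)(2N+1)/24 for the standard deviation. The function name `normal_approx` and its return layout are hypothetical, and the sketch applies no continuity or tie correction. Because W = MIN(W-, W+) can only be unusually small, the sketch reports a lower cutoff below which a two-sided test at level alpha would reject.

```python
from math import sqrt
from statistics import NormalDist


def normal_approx(w, n, alpha=0.05):
    """Normal approximation for the signed rank statistic.

    Returns (mu_w, sigma_w, z, two_sided_p, lower_cutoff).
    No continuity or tie correction is applied in this sketch.
    """
    mu_w = n * (n + 1) / 4
    sigma_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mu_w) / sigma_w
    std_normal = NormalDist()
    # Two-sided p-value from the standard normal distribution.
    two_sided_p = 2 * std_normal.cdf(-abs(z))
    # inv_cdf(alpha / 2) is negative, so the cutoff lies below mu_w.
    lower_cutoff = mu_w + sigma_w * std_normal.inv_cdf(alpha / 2)
    return mu_w, sigma_w, z, two_sided_p, lower_cutoff
```

For N = 40 this gives a mean of 410 and a standard deviation of about 74.4, so at the 0.05 level the approximate test would reject when W falls below roughly 264.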
Although the above discussion was in terms of a paired two sample test, it can easily be adapted to the following additional cases:



(1) the random variable X is continuous
(2) the probability density function of X is symmetric
Then, upon taking a random sample [math]X_1, X_2, \cdots, X_n,[/math] we are interested in testing the null hypothesis: [math]H_0: m=m_0 [/math] against any of the possible alternative hypotheses: [math]H_A: \;\; m \;\gt \;m_0 \text{ or } H_A:\;\;m\;\lt \;m_0 \text{ or } H_A:\;\;m\;\ne\;m_0[/math]
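For the one-sample case, a rough sketch is to subtract the hypothesized median [math]m_0[/math] from each observation and rank the absolute deviations. The helper names `one_sample_w_plus` and `upper_tail_p` are hypothetical. The sketch assumes no tied absolute deviations, drops observations equal to [math]m_0[/math], and computes the exact upper-tail probability for the alternative [math]m \gt m_0[/math] by counting how many subsets of the ranks 1, ..., n reach each possible sum.

```python
def one_sample_w_plus(xs, m0):
    """Return (W_plus, n) for deviations x_i - m0, with zeros dropped."""
    d = [x - m0 for x in xs if x != m0]
    abs_sorted = sorted(abs(v) for v in d)
    if len(set(abs_sorted)) != len(abs_sorted):
        raise ValueError("tied absolute deviations are not handled in this sketch")
    rank = {a: i + 1 for i, a in enumerate(abs_sorted)}
    w_plus = sum(rank[abs(v)] for v in d if v > 0)
    return w_plus, len(d)


def upper_tail_p(w_plus, n):
    """P(W+ >= w_plus) under H0 when the ranks are exactly 1..n."""
    total = n * (n + 1) // 2
    # counts[s] = number of subsets of {1..n} whose ranks sum to s.
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]
    return sum(counts[w_plus:]) / 2 ** n
```

With observations 51, 63, 48, 70, 61 and [math]m_0 = 50[/math], the deviations 1, 13, -2, 20, 11 receive ranks 1, 4, 2, 5, 3, so W+ = 13 with n = 5, and the upper-tail probability is 3/32 = 0.09375.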


Null hypothesis - The null hypothesis is that the median difference between pairs of observations is zero. Note that this is different from the null hypothesis of the paired t-test, which is that the mean difference between pairs is zero, or the null hypothesis of the sign test, which is that the numbers of differences in each direction are equal.


  • (Rosie Shier, 2004) ⇒ Statistics: 2.2 The Wilcoxon signed rank sum test http://www.statstutor.ac.uk/resources/uploaded/wilcoxonsignedranktest.pdf
    • The Wilcoxon signed rank sum test is another example of a non-parametric or distribution free test (see 2.1 The Sign Test). As for the sign test, the Wilcoxon signed rank sum test is used to test the null hypothesis that the median of a distribution is equal to some value. It can be used a) in place of a one-sample t-test b) in place of a paired t-test or c) for ordered categorical data where a numerical scale is inappropriate but where it is possible to rank the observations.