Matched-Pair t-Statistic


A Matched-Pair t-Statistic is a t-statistic that is computed from paired samples, i.e. each observation in one sample is matched with exactly one observation in the other, so the sampled datasets are not independent.

[math]\displaystyle{ t = \frac{\bar{d} - D}{s_d/\sqrt{n}} }[/math]
where [math]\displaystyle{ \bar{d} }[/math] is the mean difference between matched pairs in the sample, [math]\displaystyle{ D }[/math] is the hypothesized difference between population means, [math]\displaystyle{ s_d }[/math] is the sample standard deviation of the differences between matched pairs, and [math]\displaystyle{ n }[/math] is the number of matched pairs.
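
As a rough illustration of the formula, the following Python sketch computes the statistic from two equal-length lists of paired observations using only the standard library. The function name `paired_t_statistic`, its parameter names, and the before/after numbers are made up for this example; they do not come from any cited source.

```python
import math
import statistics


def paired_t_statistic(x, y, hypothesized_diff=0.0):
    """Return (t, degrees_of_freedom) for matched pairs (x[i], y[i]).

    hypothesized_diff plays the role of D in the formula (hypothetical name).
    """
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of observations")
    n = len(x)
    if n < 2:
        raise ValueError("at least two pairs are needed to estimate s_d")

    # Per-pair differences d_i = x_i - y_i
    d = [xi - yi for xi, yi in zip(x, y)]
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)  # sample standard deviation (n - 1 denominator)
    if s_d == 0:
        raise ValueError("all differences are identical, so t is undefined")

    t = (d_bar - hypothesized_diff) / (s_d / math.sqrt(n))
    return t, n - 1


# Made-up before/after measurements on five subjects (illustration only)
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [12.0, 15.0, 10.0, 14.0, 15.0]
t, df = paired_t_statistic(after, before)
print(f"t = {t:.3f}, df = {df}")
```

Note that only the differences enter the calculation, so the same value would result from a one-sample t-statistic applied to the list of differences.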


References

2016

  • (PSU, 2016) ⇒ https://onlinecourses.science.psu.edu/stat414/node/271 Retrieved 2016-10-16
    • QUOTE: (...) we can "remove" the dependence between X and Y by subtracting the two measurements [math]\displaystyle{ X_i }[/math] and [math]\displaystyle{ Y_i }[/math] for each pair of twins i, that is, by considering the independent measurements
[math]\displaystyle{ D_i=X_i-Y_i }[/math]
Then, our null hypothesis involves just a single mean, which we'll denote [math]\displaystyle{ \mu_D }[/math], the mean of the differences:

[math]\displaystyle{ H_0:\mu_D=\mu_X-\mu_Y=0 }[/math]

(...) our measurements are differences [math]\displaystyle{ d_i }[/math] whose mean is [math]\displaystyle{ \bar{d} }[/math] and standard deviation is [math]\displaystyle{ s_D }[/math]. That is, when testing the null hypothesis [math]\displaystyle{ H_0:\mu_D=\mu_0 }[/math] against any of the alternative hypotheses [math]\displaystyle{ H_A:\mu_D\ne \mu_0 \;, H_A:\mu_D\lt \mu_0 }[/math] and [math]\displaystyle{ H_A:\mu_D\gt \mu_0 }[/math], we compare the test statistic:
[math]\displaystyle{ t=\frac{\bar{d}-\mu_0}{s_D/\sqrt{n}} }[/math]
to a t-distribution with n−1 degrees of freedom.
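
The comparison against a t-distribution with n−1 degrees of freedom can be sketched in plain Python as follows. This is only an illustrative approximation: the helper names `t_pdf`, `t_cdf` and `paired_p_value` are hypothetical, and the cumulative distribution is obtained by Simpson-rule integration of the Student t density rather than by a statistics library, which would normally be the better choice in practice.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    # math.gamma overflows for very large df; adequate for small samples.
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) by composite Simpson integration (steps must be even)."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    upper = 0.5 + total * h / 3  # P(T <= |x|), using symmetry about zero
    return upper if x > 0 else 1.0 - upper


def paired_p_value(t, n, alternative="two-sided"):
    """p-value for an observed paired t-statistic from n matched pairs."""
    df = n - 1
    if alternative == "two-sided":  # H_A: mu_D != mu_0
        return 2 * (1 - t_cdf(abs(t), df))
    if alternative == "less":       # H_A: mu_D < mu_0
        return t_cdf(t, df)
    if alternative == "greater":    # H_A: mu_D > mu_0
        return 1 - t_cdf(t, df)
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")


# Hypothetical observed statistic from n = 5 pairs (4 degrees of freedom).
# 2.776 is the tabulated two-sided 5% critical value for 4 degrees of freedom,
# so the resulting p-value should be close to 0.05.
p = paired_p_value(2.776, 5)
print(f"two-sided p = {p:.4f}")
```

The null hypothesis is rejected at significance level α when the returned p-value is below α; the three values of `alternative` correspond to the three alternative hypotheses listed in the quote.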


The matched-pair t-test (also called the paired t-test, paired-samples t-test, or dependent t-test) is used when the data from the two groups can be presented in pairs, for example when the same people are measured in a before-and-after comparison, or when the same group is given two different tests at different times (e.g., rating the pleasantness of two different types of chocolate).


  • The sampling method for each sample is simple random sampling.
  • The test is conducted on paired data. (As a result, the data sets are not independent.)
  • The sampling distribution is approximately normal, which is generally true if any of the following conditions apply:
    • The population distribution is normal.
    • The population data are symmetric, unimodal, without outliers, and the sample size is 15 or less.
    • The population data are slightly skewed, unimodal, without outliers, and the sample size is 16 to 40.
    • The sample size is greater than 40, without outliers.