Independent Two-Sample t-Test Task

From GM-RKB

An Independent Two-Sample t-Test Task is a statistical hypothesis testing task that tests whether the means of two independent samples differ, using an independent two-sample t-test.

  • Sample 1/Group 1: the female_viq dataset corresponds to the VIQ values for Gender labelled as "Female",
  • Sample 2/Group 2: the male_viq dataset corresponds to the VIQ values for Gender labelled as "Male".
The example then applies an independent two-sample t-test algorithm, stats.ttest_ind(female_viq, male_viq), to calculate the independent two-sample t-test statistic, [math]\displaystyle{ t=-0.77261617232 }[/math], and the p-value, [math]\displaystyle{ p=0.4445287677858 }[/math]. At a significance level of [math]\displaystyle{ \alpha=0.05 }[/math], the p-value exceeds [math]\displaystyle{ \alpha }[/math], so the test fails to reject the null hypothesis that the two group means are equal.
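The call above can be sketched as follows. This is a minimal illustration that assumes SciPy and NumPy are installed; the two arrays are hypothetical values made up for the sketch, not the dataset behind the t and p values quoted above, so the numbers it prints will differ.

```python
import numpy as np
from scipy import stats

# Hypothetical VIQ scores, invented for illustration only.
# They are NOT the data that produced the t and p values quoted in the text.
female_viq = np.array([132, 132, 90, 136, 90, 129, 120, 100, 71, 132], dtype=float)
male_viq = np.array([150, 123, 129, 93, 114, 150, 129, 96, 77, 145], dtype=float)

# ttest_ind assumes equal population variances by default (equal_var=True).
result = stats.ttest_ind(female_viq, male_viq)
t_stat, p_value = result.statistic, result.pvalue

# Decision rule: reject the null hypothesis of equal means when p < alpha.
alpha = 0.05
reject_null = p_value < alpha
print(f"t = {t_stat:.4f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

The sign of t only reflects the order of the arguments: a negative value means the first sample has the smaller mean.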


References

2017a

The equal-sample-size, equal-variance form of the test applies when:
  • the two sample sizes (that is, the number, n, of participants of each group) are equal;
  • it can be assumed that the two distributions have the same variance;
Violations of these assumptions are discussed below.
The t statistic to test whether the means are different can be calculated as follows:
[math]\displaystyle{ t = \frac{\bar {X}_1 - \bar{X}_2}{s_p \sqrt{2/n}} }[/math]
where
[math]\displaystyle{ \ s_p = \sqrt{\frac{s_{X_1}^2+s_{X_2}^2}{2}} }[/math]
Here [math]\displaystyle{ s_p }[/math] is the pooled standard deviation for n=n1=n2 and [math]\displaystyle{ s_{X_1}^2 }[/math] and [math]\displaystyle{ s_{X_2}^2 }[/math] are the unbiased estimators of the variances of the two samples. The denominator of t is the standard error of the difference between two means.
For significance testing, the degrees of freedom for this test is 2n − 2 where n is the number of participants in each group.
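The formulas above can be checked numerically. The sketch below is one possible implementation, assuming NumPy and SciPy; the function name equal_n_t_statistic and the two small samples are hypothetical and exist only for this illustration. It computes t, the 2n − 2 degrees of freedom, and a two-sided p-value, then compares them with scipy.stats.ttest_ind.

```python
import numpy as np
from scipy import stats

def equal_n_t_statistic(x1, x2):
    """Pooled two-sample t statistic for equal sample sizes (hypothetical helper)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    if x1.size != x2.size:
        raise ValueError("this formula assumes equal sample sizes")
    n = x1.size
    # Pooled standard deviation: square root of the mean of the two
    # unbiased sample variances (ddof=1).
    s_p = np.sqrt((x1.var(ddof=1) + x2.var(ddof=1)) / 2.0)
    # The denominator is the standard error of the difference between the means.
    t = (x1.mean() - x2.mean()) / (s_p * np.sqrt(2.0 / n))
    df = 2 * n - 2
    # Two-sided p-value from the t distribution with df degrees of freedom.
    p = 2.0 * stats.t.sf(abs(t), df)
    return t, df, p

# Made-up measurements for two groups of equal size.
group_1 = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5]
group_2 = [4.8, 5.0, 5.3, 4.6, 5.1, 4.9]

t_manual, df_manual, p_manual = equal_n_t_statistic(group_1, group_2)
reference = stats.ttest_ind(group_1, group_2)  # pooled-variance test by default
print(t_manual, df_manual, p_manual)
print(reference.statistic, reference.pvalue)
```

With equal sample sizes the general pooled-variance estimate reduces to the average of the two sample variances, which is why the hand-computed values agree with the library result.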

2017b

  • (Stattrek, 2017) ⇒ http://stattrek.com/hypothesis-test/difference-in-means.aspx?Tutorial=AP
    • This lesson explains how to conduct a hypothesis test for the difference between two means. The test procedure, called the two-sample t-test, is appropriate when the following conditions are met:
      • The sampling method for each sample is simple random sampling.
      • The samples are independent.
      • Each population is at least 20 times larger than its respective sample.
      • The sampling distribution is approximately normal, which is generally the case if any of the following conditions apply.
        • The population distribution is normal.
        • The population data are symmetric, unimodal, without outliers, and the sample size is 15 or less.
        • The population data are slightly skewed, unimodal, without outliers, and the sample size is 16 to 40.
        • The sample size is greater than 40, without outliers.

2017c

Assumptions:
  • Within each sample, the values are independent, and identically normally distributed (same mean and variance).
  • The two samples are independent of each other.
  • For the usual two-sample t test, the two different samples are assumed to come from populations with the same variance, allowing for a pooled estimate of the variance. However, if the two sample variances are clearly different, a variant test, the Welch-Satterthwaite t test, is used to test whether the means are different.
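The difference between the pooled test and the Welch-Satterthwaite variant can be sketched as follows. This assumes SciPy, where passing equal_var=False to scipy.stats.ttest_ind selects the Welch test; the simulated samples, the seed, and the variable names are arbitrary choices for the illustration.

```python
import numpy as np
from scipy import stats

# Simulated samples with the same mean but clearly different variances
# and different sizes (arbitrary values chosen for this sketch).
rng = np.random.default_rng(0)
sample_a = rng.normal(loc=100.0, scale=5.0, size=30)
sample_b = rng.normal(loc=100.0, scale=20.0, size=12)

pooled = stats.ttest_ind(sample_a, sample_b, equal_var=True)   # usual t test
welch = stats.ttest_ind(sample_a, sample_b, equal_var=False)   # Welch variant

# The same Welch statistic by hand: each sample contributes its own
# variance of the mean instead of a pooled estimate.
va = sample_a.var(ddof=1) / sample_a.size
vb = sample_b.var(ddof=1) / sample_b.size
t_welch = (sample_a.mean() - sample_b.mean()) / np.sqrt(va + vb)

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch = (va + vb) ** 2 / (
    va ** 2 / (sample_a.size - 1) + vb ** 2 / (sample_b.size - 1)
)
p_welch = 2.0 * stats.t.sf(abs(t_welch), df_welch)

print("pooled:", pooled.statistic, pooled.pvalue)
print("welch: ", welch.statistic, welch.pvalue, "df ~", df_welch)
```

The approximate degrees of freedom never exceed the pooled value n1 + n2 − 2, so the Welch test is the more conservative choice when the variances differ.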

2017d

2014

Introduction - There are several statistical tests that use the t-distribution and can be called a t–test. One of the most common is Student's t–test for two samples. Other t–tests include the one-sample t–test, which compares a sample mean to a theoretical mean, and the paired t–test.
Student's t–test for two samples is mathematically identical to a one-way anova with two categories; because comparing the means of two samples is such a common experimental design, and because the t–test is familiar to many more people than anova, I treat the two-sample t–test separately.
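The equivalence with a one-way anova on two categories can be illustrated numerically: the F statistic equals the square of the pooled-variance t statistic, and the two p-values coincide. The sketch below assumes SciPy and uses made-up measurements; the names treated and untreated are hypothetical labels for the two categories.

```python
import numpy as np
from scipy import stats

# Made-up measurements for two categories of one nominal variable.
treated = np.array([12.1, 13.4, 11.8, 14.0, 12.9, 13.3, 12.5])
untreated = np.array([11.2, 12.0, 10.9, 11.7, 12.4, 11.1])

# Student's two-sample t test (pooled variance) and one-way anova.
t_res = stats.ttest_ind(treated, untreated, equal_var=True)
f_res = stats.f_oneway(treated, untreated)

print("t^2 =", t_res.statistic ** 2, " F =", f_res.statistic)
print("p (t test) =", t_res.pvalue, " p (anova) =", f_res.pvalue)
```

The equivalence holds for the pooled-variance test only; the Welch variant does not match the standard one-way anova.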
When to use it - Use the two-sample t–test when you have one nominal variable and one measurement variable, and you want to compare the mean values of the measurement variable. The nominal variable must have only two values, such as "male" and "female" or "treated" and "untreated."