Hypothesis Testing · Intermediate

Paired T-Test (Before and After Comparison)

Q: When should I use a paired t-test vs. an independent two-sample t-test?

Use a paired t-test when each observation in one group can be naturally paired with an observation in the other group (same subject before/after, matched pairs, twins, etc.). Use an independent two-sample t-test when the two groups contain different, unrelated subjects.

Q: What if the differences are not normally distributed?

For small samples with clearly non-normal differences, use the Wilcoxon signed-rank test as a nonparametric alternative. For larger samples (roughly n >= 30, where n is the number of pairs), the Central Limit Theorem makes the sampling distribution of d-bar approximately normal, so the paired t-test is robust to moderate departures from normality.

Conduct a paired t-test to determine whether a training program significantly improves employee productivity scores in a before-and-after study.

Problem Scenario

A company implements a new training program and measures the productivity scores of 8 employees before and after the training. Before: 45, 52, 48, 55, 40, 50, 47, 53. After: 50, 58, 52, 60, 46, 55, 51, 59. At alpha = 0.05, test whether the training significantly increased productivity scores.

Given Data

Before scores: 45, 52, 48, 55, 40, 50, 47, 53

After scores: 50, 58, 52, 60, 46, 55, 51, 59

Number of pairs (n): 8

Significance level (alpha): 0.05

Test direction: One-tailed (improvement expected)

Requirements

Calculate the differences (After - Before) for each pair
Compute the mean and standard deviation of the differences and the t-statistic
Determine the p-value and draw a conclusion

Solution

Step 1:

Calculate the differences d_i = After - Before for each employee: 5, 6, 4, 5, 6, 5, 4, 6.
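As a quick check on Step 1, the differences can be reproduced in a few lines of Python. This is a minimal sketch; the variable names are illustrative and not part of the original problem.

```python
# Scores from the problem scenario, listed in the same employee order.
before = [45, 52, 48, 55, 40, 50, 47, 53]
after = [50, 58, 52, 60, 46, 55, 51, 59]

# Subtract in one fixed direction (After - Before) for every pair.
differences = [a - b for b, a in zip(before, after)]
print(differences)
```

Keeping the subtraction direction fixed here is what makes a positive mean difference correspond to improvement in the later steps.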

Step 2:

Calculate the mean of differences: d-bar = (5 + 6 + 4 + 5 + 6 + 5 + 4 + 6) / 8 = 41 / 8 = 5.125.

Step 3:

Calculate the standard deviation of differences: Deviations from d-bar: -0.125, 0.875, -1.125, -0.125, 0.875, -0.125, -1.125, 0.875. Squared deviations: 0.01563, 0.76563, 1.26563, 0.01563, 0.76563, 0.01563, 1.26563, 0.76563. Sum = 4.875. s_d = sqrt(4.875 / 7) = sqrt(0.6964) = 0.8345.
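The mean and sample standard deviation from Steps 2 and 3 can be verified with Python's standard `statistics` module. This is a sketch for checking the hand arithmetic; `statistics.stdev` uses the n - 1 divisor, matching the calculation above.

```python
import statistics

# Differences (After - Before) from Step 1.
differences = [5, 6, 4, 5, 6, 5, 4, 6]

d_bar = statistics.mean(differences)

# Sum of squared deviations from the mean (the numerator of the variance).
sum_sq = sum((d - d_bar) ** 2 for d in differences)

# Sample standard deviation (divides by n - 1).
s_d = statistics.stdev(differences)

print(d_bar, sum_sq, round(s_d, 4))
```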

Step 4:

Calculate the t-statistic: t = d-bar / (s_d / sqrt(n)) = 5.125 / (0.8345 / sqrt(8)) = 5.125 / (0.8345 / 2.8284) = 5.125 / 0.2951 = 17.37.
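The same t-statistic can be computed directly from the differences. The sketch below follows the formula in Step 4, using only the standard library; the variable names are illustrative.

```python
import math
import statistics

# Differences (After - Before) from Step 1.
differences = [5, 6, 4, 5, 6, 5, 4, 6]
n = len(differences)

d_bar = statistics.mean(differences)
s_d = statistics.stdev(differences)  # sample SD, n - 1 divisor

# Standard error of the mean difference, then the paired t-statistic.
standard_error = s_d / math.sqrt(n)
t_stat = d_bar / standard_error

print(round(standard_error, 4), round(t_stat, 2))
```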

Step 5:

State the hypotheses and find the p-value. H_0: mu_d <= 0 (no improvement, or a decrease). H_a: mu_d > 0 (improvement). With t = 17.37 and df = n - 1 = 7, the one-tailed p-value is < 0.0001. Equivalently, the one-tailed critical value is t(0.05, 7) = 1.895; since 17.37 far exceeds 1.895, we reject H_0.
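To see where the p-value comes from without a t-table, the upper-tail area of the t distribution can be approximated numerically. The sketch below integrates the Student t density with Simpson's rule using only the standard library. The helper names `t_pdf` and `upper_tail` are hypothetical, and the approach assumes t >= 0. If SciPy is installed, `scipy.stats.ttest_rel(after, before, alternative='greater')` is the more usual route.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=20000):
    """Approximate P(T > t) for t >= 0 as 0.5 minus the area on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

df = 7
p_value = upper_tail(17.37, df)       # one-tailed p-value for the observed t
alpha_check = upper_tail(1.895, df)   # should be close to 0.05

print(p_value < 0.0001, round(alpha_check, 3))
```

The second call is a sanity check on the critical value: the area to the right of 1.895 with 7 degrees of freedom should come out near 0.05.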

Step 6:

Conclusion: At the 0.05 significance level, there is very strong evidence that mean productivity scores increased after the training program. The average improvement was 5.125 points.

Final Answer

t = 17.37, df = 7, p-value < 0.0001. We reject H_0. The training program produced a statistically significant increase in productivity scores, with an average improvement of 5.125 points per employee.

Key Takeaways

✓ A paired t-test is used when observations come in natural pairs (e.g., before/after on the same subject). It controls for individual differences by analyzing the within-pair differences.
✓ The paired t-test is typically more powerful than the independent two-sample t-test when subjects vary widely and the paired measurements are positively correlated, because it removes between-subject variability.
✓ Always check whether the differences are approximately normally distributed, especially with small sample sizes. A histogram or normal probability plot of the differences can help.

Common Errors to Avoid

✗ Using an independent two-sample t-test instead of a paired t-test when the data are matched. This ignores the pairing structure and can lead to incorrect conclusions.
✗ Subtracting in the wrong direction and getting confused about the sign of the t-statistic. Define differences consistently (e.g., After - Before) and match the hypothesis direction.
✗ Forgetting that the degrees of freedom for a paired t-test are n - 1 (number of pairs minus one), not 2n - 2.
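To illustrate the first and third errors, the sketch below runs both analyses on the data from this problem. The pooled-variance two-sample statistic is shown only for comparison and is the wrong test for matched data. Variable names are illustrative.

```python
import math
import statistics

before = [45, 52, 48, 55, 40, 50, 47, 53]
after = [50, 58, 52, 60, 46, 55, 51, 59]
n = len(before)

# Correct analysis: paired t-test on within-employee differences, df = n - 1.
diffs = [a - b for b, a in zip(before, after)]
t_paired = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
df_paired = n - 1

# Incorrect for matched data: pooled two-sample t-test, df = 2n - 2.
pooled_var = ((n - 1) * statistics.variance(before)
              + (n - 1) * statistics.variance(after)) / (2 * n - 2)
mean_gap = statistics.mean(after) - statistics.mean(before)
t_indep = mean_gap / math.sqrt(pooled_var * (2 / n))
df_indep = 2 * n - 2

print(round(t_paired, 2), df_paired)
print(round(t_indep, 2), df_indep)
```

Both tests see the same mean difference of 5.125, but the independent version divides by a standard error inflated by between-employee variability, giving a t of roughly 2.1 on 14 degrees of freedom instead of about 17.4 on 7.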
