Introduction to Hypothesis Testing
A beginner-friendly guide to the logic and mechanics of hypothesis testing. Learn how to formulate hypotheses, calculate test statistics, interpret p-values, and draw conclusions.
What You'll Learn
- Understand the logical framework behind hypothesis testing.
- Learn to set up null and alternative hypotheses correctly.
- Interpret p-values and make proper conclusions about statistical significance.
1. The Logic of Hypothesis Testing
Hypothesis testing follows a probabilistic analogue of proof by contradiction. We assume the null hypothesis is true and then evaluate whether the observed data are consistent with that assumption. If data at least as extreme as ours would be very unlikely under the null, we reject it.
Key Points
- The null hypothesis (H0) represents no effect or no difference; the alternative (Ha) represents the claim being tested.
- We never prove the alternative; we only find enough evidence to reject the null or fail to reject it.
- Failing to reject the null does not mean the null is true; it means we lack sufficient evidence against it.
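The reasoning above can be sketched with a coin-flip example. This is only an illustration under stated assumptions: the data (60 heads in 100 flips) are hypothetical, the helper name binomial_upper_tail is made up for this sketch, and the alternative is taken to be one-sided (the coin favors heads). We assume H0 (a fair coin) and ask how likely a result at least this extreme would be.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical data: 60 heads in 100 flips.
# H0: the coin is fair (p = 0.5). Ha: the coin favors heads (p > 0.5).
tail = binomial_upper_tail(n=100, k=60)
print(f"P(at least 60 heads | fair coin) = {tail:.4f}")
```

If this probability is small, the observed data are hard to reconcile with a fair coin, which is the sense in which the data "contradict" H0. How small is small enough is decided by the significance level.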
2. P-values and Significance
The p-value quantifies how surprising the observed data would be if the null hypothesis were true. A small p-value suggests the data are inconsistent with the null. The significance level alpha is the threshold for making a decision.
Key Points
- The p-value is the probability of obtaining results at least as extreme as those observed, assuming H0 is true.
- If p-value <= alpha, reject H0; if p-value > alpha, fail to reject H0.
- The most common significance level is alpha = 0.05, but this is a convention, not a universal rule.
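As a rough sketch of the decision rule above, the following computes a two-sided p-value for a one-sample z-test and compares it with alpha. The z-test is chosen here only because it needs nothing beyond the standard library; it assumes the population standard deviation is known. The function names and the numbers (sample mean 103, hypothesized mean 100, sigma 15, n = 100) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, mu0, sigma, n):
    """Return (z, p_value) for H0: mean == mu0, with known sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Two-sided: probability of a |Z| at least this large under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 when p_value <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# Hypothetical numbers, for illustration only.
z, p = two_sided_z_test(sample_mean=103, mu0=100, sigma=15, n=100)
print(f"z = {z:.2f}, p-value = {p:.4f}, decision: {decide(p)}")
```

Changing alpha changes only the last step: the p-value comes from the data and the null model, while alpha is chosen by the analyst beforehand.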
3. Drawing Conclusions
A proper conclusion states the decision in the context of the problem. Statistical significance does not always imply practical significance, and the size of the effect matters as much as whether it is statistically detectable.
Key Points
- Always state conclusions in the context of the original research question, not just in terms of rejecting or failing to reject.
- Statistical significance depends on sample size; a trivially small effect can be significant with a huge sample.
- Report effect sizes and confidence intervals alongside p-values for a fuller picture.
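To illustrate the sample-size point, the sketch below holds a tiny effect fixed and varies only n. It assumes a z-test with known sigma; the function name summarize and all numbers (a mean difference of 0.1 against a standard deviation of 10) are hypothetical. It reports the p-value together with a standardized effect size (mean difference divided by sigma) and a 95% confidence interval.

```python
from math import erfc, sqrt
from statistics import NormalDist


def summarize(mean_diff, sigma, n, confidence=0.95):
    """Return (p_value, effect_size, ci) for a z-test of H0: difference == 0."""
    se = sigma / sqrt(n)
    z = mean_diff / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) and stays accurate
    # for very large |z|, where 1 - cdf would round to zero.
    p_value = erfc(abs(z) / sqrt(2))
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    ci = (mean_diff - z_crit * se, mean_diff + z_crit * se)
    effect_size = mean_diff / sigma  # standardized mean difference
    return p_value, effect_size, ci


# Same tiny effect, three hypothetical sample sizes.
for n in (100, 10_000, 1_000_000):
    p, d, (lo, hi) = summarize(mean_diff=0.1, sigma=10.0, n=n)
    print(f"n={n:>9,}  p={p:.3g}  effect size={d:.3f}  95% CI=({lo:.3f}, {hi:.3f})")
```

The effect size is the same in every row; only the p-value and the width of the interval change with n. That is why a small p-value alone says little about whether the effect matters in practice.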
Key Takeaways
- A p-value of 0.03 means there is a 3% chance of observing data at least this extreme if the null hypothesis is true.
- Rejecting the null at alpha = 0.05 does not mean there is only a 5% chance the null is true.
- The power of a test is the probability of correctly rejecting a false null hypothesis.
- Increasing sample size increases power without changing the significance level.
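The last two takeaways can be checked numerically. The sketch below computes the power of an upper-tailed one-sample z-test, assuming a known sigma and a true shift delta above the null value. The function name and the inputs are hypothetical, and the formula used is power = 1 - Phi(z_crit - delta * sqrt(n) / sigma), where z_crit is the critical value for the chosen alpha.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(delta, sigma, n, alpha=0.05):
    """Power of an upper-tailed z-test when the true mean is mu0 + delta."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # rejection cutoff under H0
    shift = delta * sqrt(n) / sigma         # how far the true mean moves z
    return 1 - std_normal.cdf(z_crit - shift)


# Hypothetical true effect: half a standard deviation, alpha fixed at 0.05.
for n in (10, 20, 40, 80):
    print(f"n={n:>3}  power={power_one_sided_z(delta=0.5, sigma=1.0, n=n):.3f}")
```

Alpha stays at 0.05 throughout, so the false-positive rate under H0 is unchanged; only the chance of detecting a real effect grows with n. When delta is 0, the function returns alpha itself, which is the rejection rate under a true null.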
Practice Questions
1. A test yields a p-value of 0.08. At alpha = 0.05, what is the conclusion?
2. What is the difference between statistical significance and practical significance?
FAQs
Common questions about this topic
Is the p-value the probability that the null hypothesis is true?
No. The p-value is the probability of data at least as extreme as those observed, given that the null hypothesis is true. It is not the probability that the null hypothesis is true or false. Bayesian methods are needed to make probability statements about hypotheses.
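Why is 0.05 the usual significance level?
The 0.05 threshold is a convention popularized by Ronald Fisher. It corresponds to a 1-in-20 chance of a false positive when the null hypothesis is true, which was deemed a reasonable balance between being too strict and too lenient. Some fields use stricter thresholds: 0.01 is common in some disciplines, and particle physics traditionally requires five sigma (a p-value of roughly 3 in 10 million) before claiming a discovery.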