Fundamentals Ā· Intermediate Ā· 25-30 min

Chi-Square Tests Explained: Goodness of Fit and Test of Independence

A clear walkthrough of both chi-square tests — goodness of fit (does a distribution match what we expected?) and test of independence (are two categorical variables related?) — with worked examples, assumptions, and interpretation guidelines.

What You'll Learn

  • Explain when to use a chi-square goodness of fit test vs. a test of independence
  • Calculate the chi-square test statistic from observed and expected frequencies
  • Determine degrees of freedom and interpret the result using a chi-square table or software
  • Check the assumptions required for valid chi-square inference

1. When to Use Chi-Square Tests

Chi-square tests are your go-to tool whenever you are working with categorical data — data that falls into distinct categories rather than continuous measurements. You cannot run a t-test on whether people prefer chocolate, vanilla, or strawberry, but you can run a chi-square test. There are two main flavors.

The goodness of fit test asks: does the distribution of a single categorical variable match what we expected? For example, if a die is fair, we expect each face to come up roughly 1/6 of the time. The goodness of fit test checks whether your observed results are close enough to that expectation or suspiciously far off.

The test of independence asks: are two categorical variables related, or are they independent? For example, is there a relationship between gender and political party preference? The test of independence examines whether the pattern in your two-way table could have occurred by chance if the variables were truly unrelated.

Both tests use the same core formula but set up the expected values differently. The key decision: one variable means goodness of fit, two variables means test of independence.

Key Points

  • Chi-square tests work with categorical (count) data, not continuous measurements
  • Goodness of fit: one categorical variable tested against an expected distribution
  • Test of independence: two categorical variables tested for association in a contingency table
  • Both use the same chi-square statistic formula but differ in how expected values are calculated

2. The Chi-Square Statistic: How It Works

The chi-square statistic measures the overall discrepancy between what you observed and what you would expect under the null hypothesis. The formula is:

X² = Ī£ (O - E)² / E

where O is each observed count and E is each expected count. You calculate (O - E)² / E for every cell in your table and sum them all up.

The logic is intuitive once you see it. If observed counts are close to expected counts, each (O - E)² / E term is small and the total X² is small — consistent with the null hypothesis. If observed counts are far from expected, the terms get large and X² gets large — evidence against the null.

Dividing by E is important because it standardizes the contribution of each cell. A difference of 10 between observed and expected matters much more if the expected count is 20 (a 50% discrepancy) than if the expected count is 500 (only a 2% discrepancy). Without dividing by E, cells with large expected counts would dominate the statistic even if their proportional discrepancy were tiny.

The chi-square statistic is always non-negative (because you square the differences), and it follows a chi-square distribution under the null hypothesis. Larger values provide stronger evidence against the null.

Key Points

  • X² = Ī£ (O - E)² / E — sum over all categories or cells
  • Small X² means observed data is close to expected (consistent with null hypothesis)
  • Dividing by E standardizes each cell's contribution so large cells do not automatically dominate
  • The chi-square statistic is always zero or positive and follows a chi-square distribution under H0
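
The formula above is a one-liner in code. A minimal sketch in Python (the helper name `chi_square_stat` is ours, not a library function), reusing the E = 20 vs. E = 500 illustration from this section:

```python
def chi_square_stat(observed, expected):
    # Sum of (O - E)^2 / E over all categories or cells.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Same absolute difference of 10, very different contributions:
print(chi_square_stat([30], [20]))    # 100/20  = 5.0  (50% discrepancy)
print(chi_square_stat([510], [500]))  # 100/500 = 0.2  (2% discrepancy)
```

Dividing by E is what makes the 50% discrepancy weigh 25 times more than the 2% one, even though both cells miss their expectation by exactly 10.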

3. Goodness of Fit: Worked Example

Suppose you roll a die 300 times and want to test whether it is fair. Under the null hypothesis (fair die), you expect each face to appear 300/6 = 50 times. Your observed counts are: 1 = 42, 2 = 55, 3 = 48, 4 = 63, 5 = 47, 6 = 45.

Calculate X²:

X² = (42-50)²/50 + (55-50)²/50 + (48-50)²/50 + (63-50)²/50 + (47-50)²/50 + (45-50)²/50
   = 64/50 + 25/50 + 4/50 + 169/50 + 9/50 + 25/50
   = 1.28 + 0.50 + 0.08 + 3.38 + 0.18 + 0.50
   = 5.92

Degrees of freedom = number of categories minus 1 = 6 - 1 = 5. Looking up X² = 5.92 with df = 5 in a chi-square table, the critical value at alpha = 0.05 is 11.07. Since 5.92 < 11.07, we fail to reject the null. The die is consistent with being fair — the deviations we observed could reasonably occur by chance.

Notice that the face showing 4 contributed the most to X² (3.38 out of 5.92). If we had observed even more 4s, that single cell could push the total past the critical value. Chi-square tests are sensitive to where the deviation occurs, not just the overall magnitude.

Key Points

  • Expected counts for goodness of fit come from the hypothesized distribution (e.g., 1/6 for a fair die)
  • Degrees of freedom = number of categories - 1
  • Compare X² to the critical value from the chi-square table at your chosen alpha level
  • Individual cell contributions to X² tell you which categories are driving any overall deviation
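
The die example can be checked end to end in a few lines of Python; the critical value 11.07 is the table value quoted above (alpha = 0.05, df = 5):

```python
observed = [42, 55, 48, 63, 47, 45]     # counts for faces 1-6 over 300 rolls
expected = [sum(observed) / 6] * 6      # fair die: 300/6 = 50 per face

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                  # 6 categories - 1 = 5
critical = 11.07                        # chi-square table, alpha = 0.05, df = 5

print(round(chi_sq, 2))                 # 5.92
print("reject H0" if chi_sq > critical else "fail to reject H0")
```

If SciPy is available, `scipy.stats.chisquare(observed)` returns the same statistic along with a p-value, so no printed table is needed.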

4. Test of Independence: Worked Example

Now suppose you survey 200 students and record both their year (freshman, sophomore) and whether they prefer online or in-person classes. The contingency table is:

|           | Online | In-Person | Total |
|-----------|--------|-----------|-------|
| Freshman  | 45     | 55        | 100   |
| Sophomore | 65     | 35        | 100   |
| Total     | 110    | 90        | 200   |

The null hypothesis is that class year and format preference are independent. Expected counts are calculated as E = (row total Ɨ column total) / grand total. For Freshman-Online: (100 Ɨ 110) / 200 = 55. For Freshman-In-Person: (100 Ɨ 90) / 200 = 45. Sophomore-Online: 55. Sophomore-In-Person: 45.

X² = (45-55)²/55 + (55-45)²/45 + (65-55)²/55 + (35-45)²/45
   = 100/55 + 100/45 + 100/55 + 100/45
   = 1.82 + 2.22 + 1.82 + 2.22
   = 8.08

Degrees of freedom = (rows - 1) Ɨ (columns - 1) = (2-1)(2-1) = 1. The critical value at alpha = 0.05 with df = 1 is 3.84. Since 8.08 > 3.84, we reject the null hypothesis. There is statistically significant evidence that class year and format preference are related — sophomores appear to prefer online classes more than freshmen do.

StatsIQ generates practice problems with contingency tables of different sizes so you can build fluency with the expected count formula and the degrees of freedom calculation.

Key Points

  • Expected counts for independence: E = (row total Ɨ column total) / grand total
  • Degrees of freedom = (number of rows - 1) Ɨ (number of columns - 1)
  • Rejecting the null means the two variables are associated — but the test does not tell you the direction or strength, only that independence is unlikely
  • For a 2Ɨ2 table, the chi-square test of independence is equivalent to comparing two proportions
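
The same computation can be sketched in Python, building each expected count from the row and column totals exactly as the formula E = (row total Ɨ column total) / grand total prescribes:

```python
table = [[45, 55],    # freshman:  online, in-person
         [65, 35]]    # sophomore: online, in-person

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand = sum(row_totals)

chi_sq = 0.0
for i in range(len(table)):
    for j in range(len(table[0])):
        e = row_totals[i] * col_totals[j] / grand   # expected count for cell (i, j)
        chi_sq += (table[i][j] - e) ** 2 / e

df = (len(table) - 1) * (len(table[0]) - 1)         # (2-1)(2-1) = 1
print(round(chi_sq, 2), df)                          # 8.08 1
```

If SciPy is available, `scipy.stats.chi2_contingency(table, correction=False)` reproduces this value; note that SciPy applies the Yates continuity correction to 2Ɨ2 tables by default, which yields a slightly smaller statistic than the hand calculation.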

5. Assumptions and When Chi-Square Fails

Chi-square tests require several conditions to produce valid results.

The most important is the expected count condition: every expected cell count should be at least 5. When expected counts are too small, the chi-square distribution is a poor approximation and the test becomes unreliable. If you have cells with expected counts below 5, you can either combine categories (merge small groups) or use Fisher's exact test (for 2Ɨ2 tables), which does not rely on the chi-square approximation.

Second, the observations must be independent. Each subject or observation should contribute to exactly one cell in the table. If the same person appears in multiple categories (repeated measures), the chi-square test is not appropriate.

Third, the data must be counts (frequencies), not percentages or proportions. You cannot run a chi-square test on a table of percentages — you need the raw counts. This seems obvious but is a common mistake when students work from summary tables in published papers.

Finally, chi-square tests only detect association — they do not measure its strength or direction. A significant result tells you the variables are probably not independent, but not how strong the relationship is. For that, you need additional measures like Cramer's V (which ranges from 0 to 1 and quantifies effect size), or an examination of the standardized residuals to see which cells are driving the departure from independence.

Key Points

  • All expected cell counts must be at least 5 for the chi-square approximation to be valid
  • Observations must be independent — each subject contributes to exactly one cell
  • Data must be raw counts, not percentages or proportions
  • Chi-square detects association but not strength or direction — use Cramer's V for effect size
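
A short Python sketch ties the section together: it checks the expected count condition and computes Cramer's V for the worked 2Ɨ2 example, using the standard (bias-uncorrected) formula V = sqrt(X² / (n Ā· min(rows-1, cols-1))):

```python
import math

table = [[45, 55], [65, 35]]
row_totals = [sum(r) for r in table]
col_totals = [sum(c) for c in zip(*table)]
n = sum(row_totals)

expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

# Assumption check: every expected cell count should be at least 5.
assumption_ok = all(e >= 5 for row in expected for e in row)

chi_sq = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
             for i in range(len(table)) for j in range(len(table[0])))

# Cramer's V: effect size on a 0-to-1 scale.
k = min(len(table), len(table[0])) - 1
cramers_v = math.sqrt(chi_sq / (n * k))

print(assumption_ok, round(cramers_v, 2))   # True 0.2
```

A V around 0.2 is conventionally read as a small-to-moderate association: the significant test says the variables are probably not independent, and V adds that the relationship is real but not dramatic.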

Key Takeaways

  • X² = Ī£ (O - E)² / E — the fundamental chi-square formula for both test types
  • Goodness of fit df = categories - 1; independence df = (rows - 1)(columns - 1)
  • All expected counts must be at least 5 for valid inference
  • Chi-square is always a right-tailed test — only large values provide evidence against H0
  • For 2Ɨ2 tables with small samples, use Fisher's exact test instead of chi-square

Practice Questions

1. A candy bag claims 25% red, 25% blue, 25% green, 25% yellow. You count 40 red, 30 blue, 35 green, 45 yellow out of 150 candies. Calculate X² and determine if the distribution matches the claim at alpha = 0.05.

Answer: Expected: 150 Ɨ 0.25 = 37.5 for each color. X² = (40-37.5)²/37.5 + (30-37.5)²/37.5 + (35-37.5)²/37.5 + (45-37.5)²/37.5 = 0.167 + 1.500 + 0.167 + 1.500 = 3.333. With df = 3, the critical value at 0.05 is 7.815. Since 3.333 < 7.815, fail to reject — the distribution is consistent with the claimed proportions.
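
The candy calculation can be verified with the same pattern used in the worked examples; a quick Python check:

```python
observed = [40, 30, 35, 45]             # red, blue, green, yellow out of 150
expected = [sum(observed) * 0.25] * 4   # 150 x 0.25 = 37.5 per color

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                  # 4 categories - 1 = 3
critical = 7.815                        # chi-square table, alpha = 0.05, df = 3

print(round(chi_sq, 3), chi_sq < critical)   # 3.333 True
```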
2. In a 3Ɨ2 contingency table with 300 observations, what are the degrees of freedom for a test of independence?

Answer: df = (3-1)(2-1) = 2 Ɨ 1 = 2.


FAQs

Common questions about this topic

When should I use a chi-square test instead of a z-test for proportions?

For comparing two proportions (a 2Ɨ2 table), they are mathematically equivalent — the chi-square statistic equals the square of the z-statistic. Chi-square generalizes to larger tables (3+ categories or 3+ groups) where a single z-test cannot be applied. Use chi-square when you have more than two groups or more than two categories.
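
The equivalence is easy to demonstrate numerically. A sketch in Python using the student survey table from section 4 (45/100 freshmen vs. 65/100 sophomores preferring online):

```python
import math

# Two-proportion z-statistic with a pooled proportion.
x1, n1, x2, n2 = 45, 100, 65, 100
p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)
z = (p1 - p2) / math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

# Chi-square statistic on the same 2x2 table (no continuity correction).
table = [[x1, n1 - x1], [x2, n2 - x2]]
row_t = [sum(r) for r in table]
col_t = [sum(c) for c in zip(*table)]
grand = sum(row_t)
chi_sq = sum((table[i][j] - row_t[i] * col_t[j] / grand) ** 2
             / (row_t[i] * col_t[j] / grand)
             for i in range(2) for j in range(2))

print(round(z ** 2, 4), round(chi_sq, 4))   # 8.0808 8.0808
```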

Can StatsIQ generate practice problems for chi-square tests?

Yes. StatsIQ generates chi-square problems for both goodness of fit and independence tests, including calculating expected counts, finding the test statistic, determining degrees of freedom, and interpreting results in context.
