Chi-Square Statistic

Q: How do I calculate expected frequencies for a test of independence?

For each cell in a contingency table, the expected frequency is E = (row total × column total) / grand total. This formula assumes the two variables are independent, which is the null hypothesis being tested.
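The expected-frequency formula can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `expected_frequencies` and the 2×3 table of counts are hypothetical and chosen only to show the arithmetic.

```python
def expected_frequencies(observed):
    """Return expected counts under independence for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # E = (row total * column total) / grand total, computed for every cell
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 2x3 contingency table (illustrative counts only)
observed = [
    [20, 30, 50],
    [30, 20, 50],
]
expected = expected_frequencies(observed)
print(expected)  # [[25.0, 25.0, 50.0], [25.0, 25.0, 50.0]]
```

Each row of the result sums to the same row total as the observed table, and each column to the same column total, which is a quick sanity check on the computation.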

Q: What are the assumptions of the chi-square test?

The data must consist of independent observations, every observation must fall into exactly one category, and, as a rule of thumb, the expected frequency in each cell should be at least 5. When expected frequencies are too small, consider combining categories or using Fisher's exact test.
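One way to check the expected-frequency assumption before running the test is to list the cells that fall below the threshold. The sketch below assumes the expected counts have already been computed; the helper name `small_expected_cells` and the example table are hypothetical.

```python
def small_expected_cells(expected, minimum=5):
    """Return (row, column, value) for every expected count below `minimum`."""
    return [
        (i, j, e)
        for i, row in enumerate(expected)
        for j, e in enumerate(row)
        if e < minimum
    ]


# Hypothetical expected counts for a 2x2 table
# (row totals 10 and 30, column totals 12 and 28, grand total 40)
expected = [
    [3.0, 7.0],
    [9.0, 21.0],
]
flagged = small_expected_cells(expected)
print(flagged)  # [(0, 0, 3.0)]
```

An empty list means every cell meets the rule of thumb; any flagged cell is a cue to combine categories or switch to an exact test.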

χ² = Σ [(O − E)² / E]

The chi-square statistic measures the discrepancy between observed and expected frequencies. It is used in goodness-of-fit tests (does data follow a hypothesized distribution?) and tests of independence (are two categorical variables related?). Larger values of χ² indicate greater deviation from what was expected.

Variables

χ²=Chi-Square Statistic

The test statistic measuring overall discrepancy between observed and expected counts

O=Observed Frequency

The actual count in each category from the data

E=Expected Frequency

The count expected under the null hypothesis

Example Calculation

Scenario

A die is rolled 60 times. The observed frequencies for faces 1 through 6 are: 8, 12, 10, 14, 7, 9. Test whether the die is fair at the 0.05 significance level.

Given Data

O:8, 12, 10, 14, 7, 9

E:10 for each face (60/6 = 10)

Calculation

χ² = (8-10)²/10 + (12-10)²/10 + (10-10)²/10 + (14-10)²/10 + (7-10)²/10 + (9-10)²/10 = 0.4 + 0.4 + 0 + 1.6 + 0.9 + 0.1

Result

χ² = 3.4 with df = 5 (6 categories − 1)

Interpretation

With χ² = 3.4 and 5 degrees of freedom, the p-value is approximately 0.64. Since p > 0.05, we fail to reject the null hypothesis. There is no significant evidence that the die is unfair.
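The die example can be reproduced with the Python standard library alone. This is a sketch under stated assumptions: the function names are hypothetical, and the p-value comes from a closed-form recurrence for the chi-square upper tail that is valid only for integer degrees of freedom. A statistics library would normally be used instead.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_sf(stat, df):
    """Upper-tail probability P(X >= stat) for chi-square with integer df >= 1."""
    if df < 1 or stat < 0:
        raise ValueError("df must be >= 1 and stat must be >= 0")
    x = stat / 2.0
    # Starting value: Q(1, x) for even df, Q(1/2, x) for odd df
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    # Recurrence: Q(a + 1, x) = Q(a, x) + x^a * exp(-x) / Gamma(a + 1)
    while a < df / 2.0:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q


observed = [8, 12, 10, 14, 7, 9]
expected = [sum(observed) / len(observed)] * len(observed)  # 10 per face

stat = chi_square_statistic(observed, expected)
df = len(observed) - 1  # goodness-of-fit: k - 1
p_value = chi_square_sf(stat, df)
print(f"chi2 = {stat:.1f}, df = {df}, p = {p_value:.2f}")
# chi2 = 3.4, df = 5, p = 0.64
```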

When to Use This Formula

✓Testing whether observed categorical data fits an expected distribution (goodness-of-fit)
✓Testing whether two categorical variables are independent (test of independence)
✓Analyzing survey responses across categories
✓Comparing proportions across multiple groups (test of homogeneity)

Common Mistakes

✗Using raw proportions or percentages instead of counts in the formula
✗Applying the chi-square test when expected frequencies are too small (generally E < 5)
✗Confusing degrees of freedom for goodness-of-fit (k - 1) versus independence ((r-1)(c-1))
✗Reading the direction of an association from a large chi-square value (the statistic only shows that a discrepancy exists; examine the residuals to see where and in which direction)
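To see which categories drive the statistic and in which direction, one common choice is the Pearson residual, (O − E) / √E, whose squares sum to χ². The sketch below applies it to the die data from the example above; the function name `pearson_residuals` is hypothetical.

```python
import math


def pearson_residuals(observed, expected):
    """Signed per-category contributions: (O - E) / sqrt(E)."""
    return [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]


observed = [8, 12, 10, 14, 7, 9]
expected = [10.0] * 6

residuals = pearson_residuals(observed, expected)
for face, r in enumerate(residuals, start=1):
    # Positive: more rolls than expected; negative: fewer than expected
    print(f"face {face}: {r:+.2f}")

# The squared residuals add up to the chi-square statistic
print(round(sum(r * r for r in residuals), 1))  # 3.4
```

Here face 4 has the largest positive residual and face 5 the largest negative one, information the single χ² value does not convey.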
