Fundamentals · Beginner · 25 min

Mean vs Median: When to Use Each Measure of Central Tendency (Worked Examples)

The mean and median both describe the center of a distribution, but they give different answers on skewed data, sometimes dramatically different. This guide walks through when to use each, how outliers affect them, and worked examples from income, test scores, housing prices, and clinical data.

What You'll Learn

  • Calculate the mean and median for any dataset
  • Determine which measure is appropriate based on distribution shape and outliers
  • Recognize how skewness affects the relationship between mean and median
  • Apply the correct measure to real-world scenarios (income, prices, test scores)
  • Avoid common misinterpretations of averages in reporting and analysis

1. Direct Answer: Mean vs Median in One Paragraph

The mean (arithmetic average) is the sum of all values divided by the count. It uses every data point and is mathematically convenient, but it is sensitive to outliers. The median is the middle value when the data is sorted: half of the values are below it, half above. It ignores the magnitude of outliers and describes the typical observation more robustly when the distribution is skewed.

Use the mean when:
- the data is approximately symmetric and unimodal (normal-ish)
- you need the measure for further calculations like variance or regression
- you want the arithmetic center of mass
- outliers are legitimate data (not errors) and you want them to count proportionally

Use the median when:
- the data is skewed
- there are outliers that aren't representative of the typical case
- you want the "typical" value rather than the arithmetic center
- the data is ordinal (ranked but not evenly spaced)
- you're reporting to a general audience and want an intuitive measure

For symmetric data, mean and median are approximately equal. For right-skewed data (long tail on the right, like income), mean > median. For left-skewed data (long tail on the left), mean < median. This makes the comparison a useful diagnostic: comparing mean to median tells you about the distribution's shape.

Classic example: US household income in 2024 has median ≈ $80,000 and mean ≈ $115,000. The difference ($35,000) reflects the long right tail of high earners pulling the mean up. Reporting "the average income" could mean either, and the choice matters a lot for how the distribution is perceived.
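The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper name `median_by_hand` and the five-value dataset are made up for the example, and the standard library's `statistics` module is used only to cross-check the results.

```python
from statistics import mean, median


def median_by_hand(values):
    """Middle value of the sorted data; average of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical data: four similar values plus one large outlier.
values = [1, 2, 3, 4, 100]

mean_value = sum(values) / len(values)  # 110 / 5 = 22.0
median_value = median_by_hand(values)   # middle of [1, 2, 3, 4, 100] is 3

print(f"mean={mean_value}, median={median_value}")
# The standard library agrees with the hand-written versions.
print(mean(values) == mean_value, median(values) == median_value)
```

The single outlier drags the mean to 22 while the median stays at 3, which is the sensitivity difference described above.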

Key Points

  • Mean = sum / count; Median = middle value when sorted
  • Mean sensitive to outliers; median robust to them
  • Right-skewed: mean > median; left-skewed: mean < median
  • Symmetric: mean ≈ median
  • Default to median for skewed or outlier-prone data

2. Worked Example 1: Income Data (Right-Skewed)

Consider a small town with 10 households reporting annual income:

$38,000, $42,000, $45,000, $48,000, $51,000, $55,000, $60,000, $65,000, $72,000, $2,800,000

The last value is a tech founder who moved in. The others are typical middle-class households.

Mean calculation:
Sum = 38,000 + 42,000 + 45,000 + 48,000 + 51,000 + 55,000 + 60,000 + 65,000 + 72,000 + 2,800,000 = 3,276,000
Mean = 3,276,000 / 10 = $327,600

Median calculation:
Sort the data: 38,000, 42,000, 45,000, 48,000, 51,000, 55,000, 60,000, 65,000, 72,000, 2,800,000 (already sorted)
With 10 values (even count), the median is the average of the 5th and 6th values.
Median = (51,000 + 55,000) / 2 = $53,000

The mean says the "average" income is $327,600. This is technically correct but misleading: nobody except the tech founder earns anywhere near that. The median of $53,000 captures what the typical household actually earns.

This is why government agencies, when reporting household income distributions, almost always report the median. If you said "the average American household earns $X" using the mean, the typical household would feel the number doesn't match their experience, because it doesn't.

Rule of thumb: whenever you see "average income" in news reporting, ask whether it's mean or median. The difference can be $30K-$50K+ for typical distributions.

When would the mean be the right choice here? If you're calculating total tax revenue, total consumption, or total economic activity. The mean multiplied by the population gives the total, which is what governments need for budgeting. For describing the typical household, use the median.
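The income calculation can be reproduced with Python's standard `statistics` module. The sketch below uses the ten incomes from the example; the "without the founder" comparison is an extra illustration added here to show how much a single value moves each measure.

```python
from statistics import mean, median

incomes = [38_000, 42_000, 45_000, 48_000, 51_000,
           55_000, 60_000, 65_000, 72_000, 2_800_000]

income_mean = mean(incomes)      # 3,276,000 / 10 = 327,600
income_median = median(incomes)  # (51,000 + 55,000) / 2 = 53,000

# Same town without the founder: the mean collapses, the median barely moves.
typical = incomes[:-1]
mean_without = mean(typical)      # 476,000 / 9, about 52,889
median_without = median(typical)  # 5th of 9 sorted values = 51,000

# The mean is the right tool for totals: mean * count recovers the sum.
total_income = income_mean * len(incomes)

print(income_mean, income_median)
print(round(mean_without), median_without)
print(total_income == sum(incomes))
```

Removing one household changes the mean by roughly $275,000 but the median by only $2,000.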

Key Points

  • With 10 values: median = average of 5th and 6th sorted values
  • Right-skewed data (income): mean > median, often substantially
  • Mean includes outliers proportionally; can mislead when describing typical
  • Median describes the "typical" observation
  • Use mean for totals (revenue, consumption); median for typical value

3. Worked Example 2: Test Scores (Symmetric)

A class of 20 students takes a 100-point exam:

62, 68, 71, 73, 74, 76, 78, 79, 80, 81, 82, 83, 84, 86, 87, 89, 91, 93, 95, 97

This looks roughly symmetric and bell-shaped.

Mean calculation:
Sum = 62 + 68 + 71 + 73 + 74 + 76 + 78 + 79 + 80 + 81 + 82 + 83 + 84 + 86 + 87 + 89 + 91 + 93 + 95 + 97 = 1629
Mean = 1629 / 20 = 81.45

Median calculation:
With 20 values (even count), the median is the average of the 10th and 11th values.
Median = (81 + 82) / 2 = 81.5

Mean and median are nearly identical (81.45 vs 81.5). This is typical for symmetric distributions: either measure gives approximately the same answer.

For this data, you can use either measure. In practice, test score reporting often uses the mean because it combines with the standard deviation for letter-grade curves, z-scores, and parametric statistical tests.

This example illustrates an important principle: when distributions are symmetric, the choice between mean and median doesn't matter much, so pick based on what calculations you need to do. When distributions are skewed, the choice matters a lot, so pick based on the question you are answering (the typical value, or the total).

How to check for symmetry:
- Plot a histogram of the data
- Check for a symmetric bell shape
- Calculate skewness (statistical measure)
- Compare mean to median (if close, likely symmetric)
- Look for outliers or long tails

Rule of thumb: if |mean - median| / median > 0.1 (10%), consider using the median instead of the mean for descriptive reporting.
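The 10% rule of thumb is easy to apply in code. The following is a sketch under the assumption that the median is nonzero; `relative_gap` is a hypothetical helper name, and the 0.10 cutoff is the guide's heuristic rather than a formal statistical test.

```python
from statistics import mean, median

scores = [62, 68, 71, 73, 74, 76, 78, 79, 80, 81,
          82, 83, 84, 86, 87, 89, 91, 93, 95, 97]


def relative_gap(values):
    """|mean - median| / median, the rule-of-thumb ratio (requires a nonzero median)."""
    m, med = mean(values), median(values)
    return abs(m - med) / med


gap = relative_gap(scores)
suggestion = "median" if gap > 0.10 else "either"

print(f"mean={mean(scores):.2f}, median={median(scores)}, gap={gap:.4f}, use: {suggestion}")
```

For the test scores the gap is well under 1%, so either measure describes the class equally well.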

Key Points

  • Symmetric data: mean ≈ median
  • Test scores often approximately symmetric
  • When |mean - median| / median < 10%, either works
  • Mean convenient for parametric tests and curves
  • Plot histogram to check symmetry visually

4. Worked Example 3: Housing Prices (Heavily Skewed)

Home sale prices in a neighborhood over one month:

$285,000, $298,000, $312,000, $325,000, $340,000, $355,000, $368,000, $380,000, $395,000, $410,000, $425,000, $440,000, $1,850,000 (a mansion)

13 sales total.

Mean calculation:
Sum = 285 + 298 + 312 + 325 + 340 + 355 + 368 + 380 + 395 + 410 + 425 + 440 + 1,850 = 6,183 (in thousands)
Mean = 6,183 / 13 ≈ $475,615

Median calculation:
With 13 values (odd count), the median is the 7th value.
Median = $368,000

The mean ($475,615) is about $108,000 higher than the median ($368,000). The mansion pulled the mean up substantially. A typical house in this neighborhood sold for around $368K, but the "average" sale price is $475K.

This matters for:
1. Real estate marketing: realtors must choose which number to report. "Average sale price $475K" sounds more impressive but doesn't reflect the typical buyer's experience.
2. Appraisals: these should lean on the median of recent comparable sales, not the mean, because appraisers are valuing a typical property.
3. Tax assessments: these should use the median of comparable sales to avoid overvaluing typical properties based on outlier luxury sales.
4. Market trend analysis: comparing median to median across time periods is more robust than comparing mean to mean. A single luxury sale in one month can make the mean swing dramatically.

Median home prices, such as those published by the NAR, FRED, and the Census Bureau, are generally more stable from month to month and track trends more faithfully than mean prices. When you see real estate news reporting "average home price," it could be either, but the most reliable sources use the median.

The same logic applies to:
- Salary data (median more robust than mean for typical compensation)
- Regional income (median income says more about the typical resident than GDP per capita, which is itself a mean)
- Response times (median request time more useful than mean for user experience)
- Medical outcomes (median survival time more useful than mean for typical patient outcome)
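To see how much one luxury sale moves each measure, the sketch below recomputes both statistics with and without the mansion, using the thirteen prices from the example (in thousands of dollars). The "without" scenario is an added illustration, not part of the original dataset description.

```python
from statistics import mean, median

# Sale prices in thousands of dollars; the last entry is the mansion.
prices = [285, 298, 312, 325, 340, 355, 368, 380, 395, 410, 425, 440, 1850]

mean_with = mean(prices)         # 6,183 / 13, about 475.6
median_with = median(prices)     # 7th of 13 sorted values = 368

no_mansion = prices[:-1]
mean_without = mean(no_mansion)      # 4,333 / 12, about 361.1
median_without = median(no_mansion)  # (355 + 368) / 2 = 361.5

print(f"mean shift:   {mean_with - mean_without:.1f}K")
print(f"median shift: {median_with - median_without:.1f}K")
```

One sale shifts the mean by roughly $114.5K and the median by $6.5K, which is why month-over-month comparisons of medians are steadier.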

Key Points

  • Housing prices heavily right-skewed due to luxury outliers
  • Real estate, salaries, regional income: median preferred
  • Mean can swing dramatically with few outliers
  • Median more stable across time periods
  • Government and professional sources typically report medians

5. How Skewness Reveals Itself Through Mean vs Median

The relationship between mean and median is a direct signal about the distribution's shape.

Symmetric distribution: mean ≈ median
- Example: normal (Gaussian) distribution, centered around some value with balanced tails

Right-skewed (positive skew): mean > median
- Examples: income, house prices, startup valuations, hospital stay durations
- Characteristic: long tail on the right, most values clustered on the left
- Large positive values pull the mean up while the median stays centered

Left-skewed (negative skew): mean < median
- Examples: age of retirement, test scores on an easy exam, age at death in developed countries
- Characteristic: long tail on the left, most values clustered on the right
- Unusually low values pull the mean down while the median stays centered

A simple test: subtract the median from the mean.
- If mean - median > 0 (mean > median): right-skewed
- If mean - median < 0 (mean < median): left-skewed
- If mean - median ≈ 0: approximately symmetric

Standardized skewness measure (Pearson's second coefficient): skewness = 3 × (mean - median) / standard deviation. As a rough guide:
- Values > 0.5: right-skewed
- Values < -0.5: left-skewed
- Values between -0.5 and 0.5: approximately symmetric

For statistical tests that assume symmetry (t-tests, ANOVA, regression), check skewness before applying. If the data is heavily skewed, consider:
- Data transformation (log, square root)
- Non-parametric tests (Mann-Whitney, Wilcoxon)
- Reporting median with IQR instead of mean with SD

For descriptive reporting:
- Symmetric: report mean with SD (or median with IQR)
- Skewed: report median with IQR (interquartile range)
- Always specify which measure you're using

Common examples of right-skewed data:
- Income and wealth
- Reaction times
- Company revenues
- Request latencies
- Healthcare costs
- Hospital length of stay
- File sizes
- Text message lengths
- Number of errors found in code

Common examples of left-skewed data:
- Age at death in developed countries
- Retirement age
- Time to failure of very reliable equipment
- Age when students drop out
- Grades on an easy exam

When you see real-world data on counts, durations, or amounts, the default assumption should be right-skewed. Most such quantities have a lower bound (they can't go below zero) but no practical upper bound.
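The median-based skewness check can be sketched as follows. This uses Pearson's second skewness coefficient, 3 × (mean - median) / SD, with the sample standard deviation from `statistics.stdev`. The function names and the ±0.5 cutoff are assumptions for illustration; the cutoff is a rough rule of thumb, not a formal test.

```python
from statistics import mean, median, stdev


def pearson_median_skewness(values):
    """Pearson's second skewness coefficient: 3 * (mean - median) / sample SD."""
    return 3 * (mean(values) - median(values)) / stdev(values)


def describe_shape(values, threshold=0.5):
    """Label the shape using an assumed +/- threshold rule of thumb."""
    skew = pearson_median_skewness(values)
    if skew > threshold:
        return "right-skewed"
    if skew < -threshold:
        return "left-skewed"
    return "approximately symmetric"


incomes = [38_000, 42_000, 45_000, 48_000, 51_000,
           55_000, 60_000, 65_000, 72_000, 2_800_000]
scores = [62, 68, 71, 73, 74, 76, 78, 79, 80, 81,
          82, 83, 84, 86, 87, 89, 91, 93, 95, 97]

print(describe_shape(incomes))  # income example from section 2
print(describe_shape(scores))   # test-score example from section 3
```

With very small samples the coefficient is noisy, so it works best as a prompt to plot a histogram rather than as a verdict.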

Key Points

  • Mean > median: right-skewed (long right tail)
  • Mean < median: left-skewed (long left tail)
  • Mean ≈ median: approximately symmetric
  • Pearson's second skewness coefficient = 3 × (mean - median) / SD
  • Most real-world data is right-skewed by default

6. Other Measures and When to Use Them

Beyond mean and median, other measures of central tendency serve specific purposes:

Mode: the most frequent value
- Useful for categorical data (most common shoe size, most common browser)
- Useful for multimodal distributions (two peaks)
- Not informative for continuous data without grouping
- Example: "The mode of favorite ice cream flavors in the survey is chocolate"

Geometric mean: nth root of the product of n values
- Use for growth rates, ratios, percentages over time
- Less affected by very large values than the arithmetic mean
- Example: average annual growth rate over 5 years of investment returns

Harmonic mean: n divided by the sum of reciprocals
- Use for rates where the numerator quantity is fixed and the denominator varies
- Example: average speed over equal distances (not equal times)
- Financial metrics: average P/E ratio across a portfolio

Trimmed mean: mean after removing the top and bottom X% of values
- Robust compromise between mean and median
- Common: 10% trimmed mean (removes top 10% and bottom 10%)
- Used in competitive judging (diving, gymnastics), where the highest and lowest scores are dropped
- Also used in financial reporting to reduce outlier impact while retaining some mean properties

Winsorized mean: values beyond a threshold are replaced (not removed)
- Top and bottom X% replaced with the threshold values
- Preserves the sample size but reduces outlier impact
- Common in biostatistics

Midrange: (min + max) / 2
- Simple measure but highly affected by extremes
- Used in meteorology for daily temperature
- Rarely appropriate for other data

When to use which:
- Categorical data (eye color, car brand) → Mode
- Bell-curve-ish continuous data (heights, scores) → Mean
- Skewed continuous data (income, home prices) → Median
- Growth rates, compound returns → Geometric mean
- Average speeds over equal distances, average rates over equal amounts → Harmonic mean
- Data with extreme outliers you want to dampen → Trimmed mean or Winsorized mean
- Quick visual summary of extremes → Midrange

For basic reporting, mean and median cover the large majority of cases. The other measures fill specific niches. Know they exist, and consult them when your data has specific characteristics that call for them.
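Three of these alternatives can be tried directly in Python. The sketch below uses `geometric_mean` and `harmonic_mean` from the standard `statistics` module (Python 3.8+); `trimmed_mean` is a hypothetical helper written for this example, and the growth factors and speeds are made-up illustrative numbers.

```python
from statistics import geometric_mean, harmonic_mean, mean


def trimmed_mean(values, proportion):
    """Mean after dropping the lowest and highest `proportion` of values."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of values cut from each end
    return mean(ordered[k:len(ordered) - k])


# Growth factors for +50% then -50%: the arithmetic mean of the returns is 0%,
# yet the investment ends at 75% of its starting value.
avg_factor = geometric_mean([1.5, 0.5])  # sqrt(0.75), about 0.866 per period

# Equal distances driven at 30 and 60 mph: total distance / total time.
avg_speed = harmonic_mean([30, 60])      # 40 mph, not the arithmetic 45

incomes = [38_000, 42_000, 45_000, 48_000, 51_000,
           55_000, 60_000, 65_000, 72_000, 2_800_000]
robust_income = trimmed_mean(incomes, 0.10)  # drops 38,000 and 2,800,000

print(round(avg_factor, 3), avg_speed, robust_income)
```

The 10% trimmed mean of the income data is $54,750, close to the $53,000 median, while still averaging the eight values it keeps.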

Key Points

  • Mode: most frequent value; useful for categorical data
  • Geometric mean: for growth rates and ratios
  • Harmonic mean: for rates with a fixed numerator (e.g., speed over equal distances)
  • Trimmed/Winsorized mean: compromise between mean and median
  • Match the measure to the data structure and your question

7. Common Misinterpretations and Pitfalls

Pitfall 1: Reporting "average" without specifying mean or median. Most people hear "average" and think mean, but the median is sometimes reported the same way. Always specify, and be explicit in your own writing: "median household income," "mean test score."

Pitfall 2: Using the mean for skewed data in descriptive reporting. "Average customer spends $X" computed as a mean on right-skewed data misrepresents the typical customer. Use the median instead, or report both.

Pitfall 3: Inferring typical behavior from the mean alone. "The average response time was 250ms" sounds good, but if 1% of requests take 10 seconds, the mean can be about 250ms while most users experience 150ms and a few experience terrible performance. Report the median plus percentiles (p50, p95, p99) for latency.

Pitfall 4: Comparing means across differently-sized groups without normalization. "Group A had a higher average than Group B" may reflect different sample sizes or compositions rather than group differences. Consider distributions, not just centers.

Pitfall 5: Assuming the mean represents anyone specifically. "The average person in this country is 5'6" and weighs 160 lbs" likely describes nobody in particular. Individual people are not averages; populations have averages.

Pitfall 6: Using the mean in small samples with outliers. With n = 10 and one outlier, the mean can be wildly unrepresentative. Use the median or report both.

Pitfall 7: Summary statistics without visualization. Datasets can share the same mean and median yet have wildly different distributions (one concentrated near the mean, one bimodal, one with high variance). Always look at histograms or scatter plots alongside summary stats.

Pitfall 8: Anchoring on the mean when the median is more appropriate. If you're buying a house and someone tells you the "average price" in the neighborhood, that could be misleading. Ask for the median, or for a price distribution.

Pitfall 9: Using the mean for ordinal data. Likert scale responses (Strongly Disagree to Strongly Agree, 1-5) are technically ordinal. Computing a mean assumes the distance from 1 to 2 equals the distance from 4 to 5, which is not necessarily true. The median is often safer, though the mean is widely used in practice for Likert data.

Pitfall 10: Ignoring distribution shape. Mean and median together tell you about distribution shape. If you only report one, you miss information. In serious analysis, report both, along with standard deviation, quartiles, and ideally a visualization.

Good practice when reporting central tendency:
- State which measure you're using
- Report alongside a measure of spread (SD, IQR, range)
- Include the sample size
- Consider reporting percentiles (25th, 50th, 75th, 95th, 99th)
- For a non-technical audience, include a one-sentence explanation
- For a technical audience, include a visualization (histogram, boxplot)
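The latency pitfall is easy to reproduce. The sketch below uses made-up numbers (98 requests at 150 ms and 2 at 5,000 ms, chosen so the mean lands near 250 ms) and a hypothetical `percentile_nearest_rank` helper that implements the nearest-rank definition of a percentile; other percentile definitions interpolate and can give different values.

```python
from statistics import mean, median


def percentile_nearest_rank(values, p):
    """Nearest-rank percentile for an integer p between 1 and 100."""
    ordered = sorted(values)
    rank = -(-p * len(ordered) // 100)  # ceiling of p * n / 100 via integer math
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: mostly fast, with a slow 2% tail.
latencies_ms = [150] * 98 + [5000] * 2

report = {
    "mean": mean(latencies_ms),                         # 247
    "p50": median(latencies_ms),                        # 150
    "p95": percentile_nearest_rank(latencies_ms, 95),   # 150
    "p99": percentile_nearest_rank(latencies_ms, 99),   # 5000
}
print(report)
```

The mean matches no request anyone actually experienced; the p50/p99 pair shows both the typical case and the tail.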

Key Points

  • Always specify "mean" or "median" rather than "average"
  • Report percentiles for latency and user-facing data
  • Don't infer individual behavior from population averages
  • Always visualize alongside summary statistics
  • Report both mean and median for skewed data

Key Takeaways

  • Mean = sum / count; Median = middle value when sorted
  • Mean sensitive to outliers; median robust
  • Right-skewed: mean > median; left-skewed: mean < median
  • Symmetric: mean ≈ median
  • Pearson's second skewness coefficient = 3 × (mean - median) / SD
  • Use median for income, home prices, response times
  • Use mean for totals, further parametric calculations
  • Report both mean and median for skewed data
  • Always include measure of spread (SD or IQR) with center
  • Visualize distribution alongside summary statistics

Practice Questions

1. A class of 15 students has test scores: 58, 62, 65, 68, 70, 72, 73, 75, 78, 80, 82, 84, 85, 88, 90. Calculate mean and median.
Sum = 58+62+65+68+70+72+73+75+78+80+82+84+85+88+90 = 1130. Mean = 1130/15 = 75.33. Median (odd count, middle 8th value) = 75. Close to mean, suggesting approximate symmetry.
2. Startup salaries at a tech company (in thousands): 85, 92, 95, 98, 102, 105, 110, 115, 120, 780 (CEO). Calculate mean and median. Which is more representative?
Sum = 1,702. Mean = 170.2. Median (even count, average of 5th and 6th) = (102 + 105)/2 = 103.5. Median ($103.5K) is more representative of typical employee salary; mean ($170K) is skewed upward by CEO outlier.
3. Housing sale prices in a neighborhood over one month: $320K, $335K, $348K, $362K, $375K, $390K, $410K (no outliers). Skewness?
Sum = 2,540K. Mean = 362.86K. Median (odd count, 4th value) = 362K. Mean ≈ median, so the distribution is approximately symmetric. Either measure is appropriate for description. Use the mean for parametric tests; use the median or report both for non-technical audiences.
4. Customer service response times (in minutes): 1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 12, 25, 90, 180. What's the typical response time?
The data is heavily right-skewed. Sum = 348, so mean = 348 / 15 = 23.2 minutes. Median (odd count, 8th value) = 4 minutes. The typical customer waits about 4 minutes, but a small number of slow responses pull the mean up to about 23 minutes. Report the median (4 min) as the "typical response time" and include an upper percentile (e.g., the 90th percentile by the nearest-rank method is 90 min) to capture the worst-case experience.
5. When should you specifically use geometric mean instead of arithmetic mean?
Use the geometric mean for multiplicative quantities: growth rates (investment returns, population growth), ratios (P/E ratios), and percentages over time. Example: 5-year investment returns of 10%, -5%, 15%, 8%, 3%. Arithmetic mean = 6.2%; geometric mean = (1.10 × 0.95 × 1.15 × 1.08 × 1.03)^(1/5) - 1 ≈ 6.0%. The two are close here, but the geometric mean is the correct one for compound growth.
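The arithmetic in questions 4 and 5 can be checked with a short script. This is a sketch using the standard `statistics` module; the variable names are chosen for this example only.

```python
from statistics import geometric_mean, mean, median

# Question 4: customer service response times in minutes.
times = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 12, 25, 90, 180]
times_mean = mean(times)      # 348 / 15 = 23.2
times_median = median(times)  # 8th of 15 sorted values = 4

# Question 5: yearly returns converted to growth factors (1 + return).
returns = [0.10, -0.05, 0.15, 0.08, 0.03]
arithmetic_return = mean(returns)                                 # 0.062
compound_return = geometric_mean([1 + r for r in returns]) - 1    # about 0.0598

print(f"Q4: mean={times_mean:.1f} min, median={times_median} min")
print(f"Q5: arithmetic={arithmetic_return:.3%}, geometric={compound_return:.3%}")
```

Converting returns to growth factors before taking the geometric mean matters: the geometric mean is undefined for negative inputs such as -0.05.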


FAQs

Common questions about this topic

In everyday language, "average" usually means the arithmetic mean: the sum divided by the count. In statistics, "average" can refer to any measure of central tendency (mean, median, mode). To be precise, always specify which measure you're using. Journalists and general writers often say "average" without indicating which measure they mean, so always ask or check.

Use median when: (1) the data is skewed (long tail on one side); (2) there are outliers that aren't typical; (3) you want the "typical" case; (4) the data is ordinal (ranked but not on an evenly-spaced scale); (5) you're reporting to a general audience where intuition matters; (6) you want a robust measure resistant to extreme values.

Mean and median are exactly equal for perfectly symmetric distributions. In practice, with real data, they are rarely exactly equal but are close for approximately symmetric distributions. The converse is weaker: equal mean and median is consistent with symmetry but does not prove it, so check a histogram as well. If they differ noticeably, the distribution is usually skewed in the direction of the mean (mean > median suggests right skew).

Adding a constant to every value shifts both the mean and the median by that constant. If you add $5,000 to every salary in your dataset, both increase by $5,000 and the gap between them is unchanged. Multiplying every value by a positive constant scales both by that factor. These linear transformations preserve which of the two is larger; multiplying by a negative constant reverses it.

The median is the middle value (by position in sorted order). The mode is the most frequent value. For continuous data, the mode may not be meaningful (each value often appears once). For categorical data (eye color, car brand), the mode is the primary measure. For numerical data, the median is usually more useful than the mode. A distribution may have multiple modes (bimodal, multimodal).

