
Confidence vs Prediction vs Tolerance Intervals: Which to Use and Worked Examples

Three different intervals quantify different types of uncertainty: confidence intervals for parameter estimation, prediction intervals for single future observations, and tolerance intervals for proportions of the population. This guide walks through each with worked examples and shows when to choose which.

What You'll Learn

  • Distinguish confidence, prediction, and tolerance intervals by purpose
  • Calculate each interval for a normal distribution
  • Choose the appropriate interval type for different statistical questions
  • Recognize when to use each in research, quality control, and forecasting
  • Avoid common misinterpretations of these intervals

1. Direct Answer: Three Intervals, Three Questions

Three types of intervals quantify different kinds of statistical uncertainty:

1. Confidence interval (CI): quantifies uncertainty about a POPULATION PARAMETER (like the mean). It tells you: "We are 95% confident the true population mean is between X and Y."
2. Prediction interval (PI): quantifies uncertainty about a SINGLE FUTURE OBSERVATION. It tells you: "We are 95% confident that the next individual observation will fall between X and Y."
3. Tolerance interval (TI): quantifies a PROPORTION OF THE POPULATION. It tells you: "We are 95% confident that 95% of the population falls between X and Y."

All three are intervals with a lower and upper bound, and all typically use a confidence level (often 95%). But they answer fundamentally different questions and have different widths.

Width comparison (from narrowest to widest):

  • Confidence interval: narrowest; uncertainty is only about the estimated parameter
  • Prediction interval: wider; includes both parameter uncertainty AND individual observation variability
  • Tolerance interval: widest; includes parameter uncertainty AND captures a proportion of the population

When to use each:

  • Use CI when: estimating a parameter (mean, proportion, regression coefficient). Common in research, A/B tests, meta-analyses.
  • Use PI when: predicting individual future values. Common in forecasting, quality control for single units, weather prediction.
  • Use TI when: setting specifications for a proportion of the population. Common in manufacturing quality standards, regulatory compliance, capability studies.

The key insight: don't use a confidence interval when you need a prediction interval. A 95% CI on the mean doesn't tell you where 95% of individual values fall, only where the true mean probably is. These are very different things.

Key Points

  • CI: uncertainty about a population parameter
  • PI: uncertainty about a single future observation
  • TI: proportion of the population
  • CI narrowest; PI wider; TI widest
  • Choose based on what question you're answering

2. Worked Example 1: Confidence Interval for the Mean

Context: A lab measures iron concentration in 25 water samples and wants to estimate the true average iron concentration in the water system.

Data summary:

  • Sample size n = 25
  • Sample mean x̄ = 15.2 mg/L
  • Sample standard deviation s = 3.1 mg/L

95% confidence interval for the mean:

CI = x̄ ± t_{0.025, n-1} × (s / √n)

Step 1: Find the critical t-value. With df = n - 1 = 24 and two-tailed α = 0.05 (so α/2 = 0.025 in each tail), t_{0.025, 24} = 2.064.

Step 2: Calculate the standard error. SE = s / √n = 3.1 / √25 = 3.1 / 5 = 0.62

Step 3: Calculate the margin of error. ME = t × SE = 2.064 × 0.62 = 1.28

Step 4: Calculate the bounds. Lower bound = 15.2 - 1.28 = 13.92; upper bound = 15.2 + 1.28 = 16.48

95% CI: (13.92, 16.48) mg/L

Interpretation: "We are 95% confident that the true mean iron concentration in this water system is between 13.92 and 16.48 mg/L."

Technical note: "95% confident" doesn't mean 95% probability the interval contains the true mean. It means: if we repeated this sampling many times and computed intervals each time, 95% of those intervals would contain the true mean.

This CI tells you nothing about individual samples. It only tells you about the TRUE POPULATION MEAN.

Why the width of this CI doesn't say where samples fall: individual sample variability (with SD = 3.1 mg/L) is ignored. A 95% CI could be as narrow as ±0.1 with a huge sample size, but individual samples would still vary with a standard deviation of about 3.1 mg/L. The CI only addresses estimation uncertainty, not sample variability.

If you want to know where individual future samples will fall, you need a prediction interval, not a confidence interval.
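The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `mean_confidence_interval` is hypothetical, and the critical value 2.064 is passed in from the t-table lookup in Step 1 instead of being computed.

```python
import math

def mean_confidence_interval(xbar, s, n, t_crit):
    """Two-sided CI for the mean: xbar ± t_crit * s / sqrt(n)."""
    se = s / math.sqrt(n)      # Step 2: standard error of the mean
    margin = t_crit * se       # Step 3: margin of error
    return xbar - margin, xbar + margin  # Step 4: bounds

# Values from the worked example; t_crit = t_{0.025, 24} comes from a t-table.
lower, upper = mean_confidence_interval(xbar=15.2, s=3.1, n=25, t_crit=2.064)
print(f"95% CI: ({lower:.2f}, {upper:.2f}) mg/L")  # 95% CI: (13.92, 16.48) mg/L
```

Passing the critical value as an argument keeps the sketch dependency-free; a statistics library could supply the t-quantile instead.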

Key Points

  • CI formula: x̄ ± t × (s / √n)
  • t-value from Student t-distribution with n-1 df
  • Interpretation about population mean, not individual values
  • Width decreases with sample size (goes to 0 as n → ∞)
  • Doesn't tell you where samples fall

3. Worked Example 2: Prediction Interval for a Single Observation

Same data context: 25 iron concentration samples, x̄ = 15.2 mg/L, s = 3.1 mg/L.

Question: "Where would we expect the NEXT individual water sample to fall with 95% confidence?"

95% prediction interval for a single future observation:

PI = x̄ ± t_{0.025, n-1} × s × √(1 + 1/n)

Notice the √(1 + 1/n) term instead of just √(1/n). This accounts for both:

  • Uncertainty about the true mean (the 1/n part)
  • Individual observation variability (the 1 part)

Step 1: t-value. t_{0.025, 24} = 2.064 (same as before).

Step 2: Multiplier. √(1 + 1/25) = √(1.04) = 1.0198

Step 3: Margin of error. ME = 2.064 × 3.1 × 1.0198 = 6.53

Step 4: Bounds. Lower = 15.2 - 6.53 = 8.67; upper = 15.2 + 6.53 = 21.73

95% PI: (8.67, 21.73) mg/L

Interpretation: "We are 95% confident that the next individual water sample will have iron concentration between 8.67 and 21.73 mg/L."

Compare to the CI above: the CI is (13.92, 16.48), width 2.56. The PI is (8.67, 21.73), width 13.06. The PI is about 5× as wide; with these formulas the ratio is exactly √(n + 1) = √26 ≈ 5.1.

Why is the PI so much wider? Because individual samples vary by about ±3.1 mg/L (the standard deviation), while the sample mean varies by only ±0.62 mg/L (the standard error). A single future observation could be anywhere within the variability of individual samples, not just within the uncertainty of the mean estimate.

As sample size grows very large (n → ∞):

  • CI shrinks to zero width (you know the mean perfectly)
  • PI shrinks to about ±1.96σ (the normal approximation for a single observation)
  • CI never gets as wide as PI even with small samples

Key insight: you cannot use a CI to predict where individual observations will fall. The CI only tells you about the estimated parameter.

When PI is most useful:

  • Weather forecasting (predicting temperature tomorrow)
  • Quality control (predicting measurement on next unit)
  • Clinical prediction (predicting individual patient outcome)
  • Forecasting (predicting individual future values, not the average)
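As a sketch of the same calculation in Python (the function name `prediction_interval` is hypothetical, and the t-value is again taken from the table rather than computed), the only change from the CI is the √(1 + 1/n) multiplier:

```python
import math

def prediction_interval(xbar, s, n, t_crit):
    """Two-sided PI for one future observation: xbar ± t_crit * s * sqrt(1 + 1/n)."""
    margin = t_crit * s * math.sqrt(1 + 1 / n)
    return xbar - margin, xbar + margin

# Values from the worked example; t_crit = t_{0.025, 24} from a t-table.
pi_lower, pi_upper = prediction_interval(xbar=15.2, s=3.1, n=25, t_crit=2.064)
print(f"95% PI: ({pi_lower:.2f}, {pi_upper:.2f}) mg/L")  # 95% PI: (8.67, 21.73) mg/L

# With the same t-value and s, PI width / CI width reduces to sqrt(n + 1).
width_ratio = math.sqrt(25 + 1)
print(f"PI is {width_ratio:.2f}x as wide as the CI")  # PI is 5.10x as wide as the CI
```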

Key Points

  • PI formula: x̄ ± t × s × √(1 + 1/n)
  • Extra term accounts for individual observation variability
  • Width is √(n + 1) times the CI width (about 5× for n = 25)
  • As n increases, PI approaches ±1.96σ (for 95% normal)
  • Use for predicting single future observations

4. Worked Example 3: Tolerance Interval for Population Proportion

Same data context: 25 iron samples, x̄ = 15.2 mg/L, s = 3.1 mg/L.

Question: "What range will contain 95% of all water samples (not just one future sample, but 95% of the population)?"

A 95%/95% tolerance interval: we want to be 95% confident that the interval contains 95% of the population.

Tolerance interval formula (for normal distribution):

TI = x̄ ± k × s

Where k is a factor from tolerance interval tables. For a 95%/95% two-sided TI with n = 25, k ≈ 2.63 (standard tolerance factor tables list 2.631; the exact value depends on the table and the approximation used).

Step 1: k-factor from table. k_{0.95, 0.95, 25} ≈ 2.631

Step 2: Calculate bounds. Lower = 15.2 - 2.631 × 3.1 = 15.2 - 8.16 = 7.04; upper = 15.2 + 2.631 × 3.1 = 15.2 + 8.16 = 23.36

95%/95% TI: (7.04, 23.36) mg/L

Interpretation: "We are 95% confident that 95% of water samples fall between 7.04 and 23.36 mg/L."

This is the widest of the three intervals:

  • CI: (13.92, 16.48), width 2.56
  • PI: (8.67, 21.73), width 13.06
  • TI: (7.04, 23.36), width 16.32

Why is the TI widest? Because it must cover 95% of the entire population, not just one sample. The interval must be wide enough to capture the variability of most individual samples, while also accounting for our uncertainty about the true mean and variance.

As sample size increases, the TI approaches the true population interval (e.g., μ ± 1.96σ for a 95%/95% normal TI), which is narrower than the small-sample tolerance interval because the k-factor decreases toward 1.96 as n grows.

When TI is most useful:

  • Manufacturing tolerance standards: a machine produces parts with some variation. You want to specify limits that 99% of parts will fall within, with 95% confidence. That's a 95%/99% TI.
  • Regulatory limits: environmental standards often require that 95% of measurements fall within certain limits. TIs help set these limits based on sampling data.
  • Capability studies: in Six Sigma quality improvement, tolerance intervals quantify process capability.
  • Clinical reference ranges: labs establish "normal ranges" for blood tests so that 95% of healthy individuals fall within the range. These are tolerance intervals.

Key insight: a TI is useful when you need to say "this range covers a proportion of the population," not just the mean (CI) or a single prediction (PI).
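The guide reads k from a table. As a cross-check, the sketch below uses Howe's approximation for the two-sided normal tolerance factor, which is a swapped-in technique and not the table lookup described above. The function name `howe_k_factor` is hypothetical, and the chi-square quantile (lower 5% point with 24 degrees of freedom, about 13.848) is hard-coded from a chi-square table as an assumption so that only the standard library is needed.

```python
import math
from statistics import NormalDist

def howe_k_factor(n, coverage, chi2_lower_quantile):
    """Howe's approximation to the two-sided normal tolerance factor.

    chi2_lower_quantile is the (1 - confidence) quantile of a chi-square
    distribution with n - 1 degrees of freedom, supplied by the caller.
    """
    df = n - 1
    z = NormalDist().inv_cdf((1 + coverage) / 2)  # 1.96 for 95% coverage
    return z * math.sqrt(df * (1 + 1 / n) / chi2_lower_quantile)

# 95% confidence, 95% coverage, n = 25; 13.848 is a table value (assumption).
k = howe_k_factor(n=25, coverage=0.95, chi2_lower_quantile=13.848)
ti_lower = 15.2 - k * 3.1
ti_upper = 15.2 + k * 3.1
print(f"k = {k:.3f}, 95%/95% TI: ({ti_lower:.2f}, {ti_upper:.2f}) mg/L")
```

For these inputs the approximation gives k of about 2.63, which can be compared against whichever tolerance factor table is being used.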

Key Points

  • TI formula: x̄ ± k × s (where k is from tolerance tables)
  • k depends on confidence level, coverage proportion, and sample size
  • Widest of the three intervals
  • Used for process capability, reference ranges, quality standards
  • Approaches μ ± 1.96σ (for 95%/95%) as n → ∞

5. Choosing the Right Interval Type

Decision framework (question you're asking → interval type):

  • "What's the true average [X] in the population?" → CI
  • "What range contains the next individual [X]?" → PI
  • "What range contains most (e.g., 95%) of the population?" → TI

Common scenarios and the appropriate interval:

  • Research study: comparing two group means → CI on the mean difference
  • A/B test: comparing conversion rates → CI on the difference in proportions
  • Weather forecast: predicting tomorrow's high temperature → PI
  • Individual patient outcome prediction → PI
  • Quality control: is the next part within spec? → PI (for single unit) or process control chart
  • Environmental monitoring: are 95% of samples below pollution limit? → TI (to verify compliance with regulatory limits)
  • Blood test reference range for "normal" → TI
  • Six Sigma capability analysis → TI
  • Forecasting: what will next quarter's revenue be? → PI
  • Survey research: estimating population percentage → CI
  • Meta-analysis: combining studies → CI (for meta-analyzed effect estimate)
  • Policy analysis: estimating impact of intervention → CI

Misused intervals:

Using CI when you need PI:

  • Reporting a 95% CI on the mean and calling it a "prediction range"
  • A physician estimating individual patient prognosis from a CI
  • Saying "95% of future values will fall in this CI" (incorrect)

Using PI when you need TI:

  • Setting manufacturing specifications using a PI (doesn't account for population coverage)
  • Establishing clinical reference ranges using a PI (misses population variability)

Using TI when you need PI:

  • Predicting a single future observation using a TI (unnecessarily wide)
  • Setting a prediction target using a TI

The price of using the wrong interval:

  • CI instead of PI → interval too narrow → actual observations fall outside it more often than stated → false confidence
  • PI instead of TI → interval doesn't correctly represent population coverage → may fail regulatory or safety requirements
  • TI instead of CI → interval too wide → inefficient hypothesis tests → lower statistical power

If you're unsure: ask "is my question about a parameter (CI), a single observation (PI), or a proportion of the population (TI)?" The answer determines the interval type.
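The cost of substituting a CI for a PI can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is taken to be exactly normal with mean 15.2 and SD 3.1 (the sample values from the worked examples, used here as if they were the true parameters), and the function name, trial count, and seed are arbitrary choices.

```python
import math
import random
import statistics

def coverage_of_next_observation(trials=5000, n=25, t_crit=2.064, seed=0):
    """Estimate how often the NEXT observation lands inside a 95% CI vs a 95% PI."""
    rng = random.Random(seed)
    ci_hits = 0
    pi_hits = 0
    for _ in range(trials):
        sample = [rng.gauss(15.2, 3.1) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)
        new_obs = rng.gauss(15.2, 3.1)  # one future observation
        ci_margin = t_crit * s / math.sqrt(n)
        pi_margin = t_crit * s * math.sqrt(1 + 1 / n)
        ci_hits += int(abs(new_obs - xbar) <= ci_margin)
        pi_hits += int(abs(new_obs - xbar) <= pi_margin)
    return ci_hits / trials, pi_hits / trials

ci_cov, pi_cov = coverage_of_next_observation()
print(f"next observation inside CI: {ci_cov:.1%}, inside PI: {pi_cov:.1%}")
```

Under these assumptions the PI should capture the next observation close to 95% of the time, while the CI captures it far less often (on the order of a third of the time), which is the "false confidence" described above.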

Key Points

  • Parameter question → CI
  • Single observation question → PI
  • Proportion of population question → TI
  • Using wrong interval misrepresents uncertainty
  • When unsure, explicitly state what question you're answering

6. Common Confusions and Practical Tips

Confusion 1: "95% CI on mean" misinterpreted as "95% of values fall in this range." This is the most common error. A 95% CI on the mean tells you about the mean, not individual values. If someone wants to know where individual samples will fall, give them a PI or TI.

Confusion 2: CI and PI levels misinterpreted. "95% confidence" doesn't mean the next observation has a 95% chance of falling in this particular interval. It means the method used to compute the interval has 95% long-run coverage. This is a frequentist concept that's non-intuitive.

Confusion 3: Using normal-distribution formulas for non-normal data. The formulas shown assume a normal (or close to normal) underlying distribution. For heavily skewed data, use non-parametric versions or data transformations.

Confusion 4: Forgetting the degrees of freedom. The t-distribution critical value depends on degrees of freedom. Don't use Z = 1.96 for small samples (n < 30). Use t-values from t-distribution tables.

Confusion 5: Confidence level vs. coverage proportion for TI. A TI has two parameters:

  • Confidence level (γ): how confident you are (e.g., 95%)
  • Coverage (p): what proportion of the population you want to cover (e.g., 95%)

A 95%/95% TI: 95% confident that 95% of the population is within the interval. A 99%/95% TI: 99% confident that 95% of the population is within the interval. A 95%/99% TI: 95% confident that 99% of the population is within the interval. These are different. State both.

Confusion 6: Sample size effects.

  • CI width: shrinks in proportion to 1/√n (goes to 0 as n → ∞)
  • PI width: approaches a finite value (±1.96σ) as n → ∞
  • TI width: approaches a finite value but slowly (depends on coverage)

You can shrink a CI with more data, but you can never shrink a PI below the natural variability of individual observations. Collecting 1,000,000 water samples won't let you predict a single future sample more precisely than ±1.96σ.

Confusion 7: One-sided vs. two-sided intervals. All intervals can be one-sided or two-sided. Use one-sided when you only care about one direction (e.g., "95% confident that population mean is below X"). Two-sided is more common by default.

Confusion 8: Interpretation in Bayesian vs. Frequentist frameworks. In Bayesian analysis, "credible intervals" replace confidence intervals. Bayesian intervals have a more intuitive interpretation: "95% probability the parameter is in this range given the data." Frequentist CIs don't have this interpretation.

Practical tips:

1. Always specify the interval type (CI, PI, TI) in reporting.
2. Always specify the confidence level (and coverage for TI).
3. For small samples, use t-based (not Z-based) formulas.
4. Check the normality assumption before applying these formulas.
5. For non-normal data, use bootstrap methods or non-parametric alternatives.
6. Report intervals alongside point estimates and sample sizes.
7. Use visualizations (forest plots, confidence bands) to communicate uncertainty.
8. Explain in plain language what the interval means for your audience.

This content is for educational purposes only and does not constitute statistical advice.
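The sample-size effect in Confusion 6 can be tabulated with a short sketch. It assumes large samples, so the normal quantile z ≈ 1.96 stands in for the t-value and σ is treated as known; half-widths are expressed in units of σ. The function name `half_widths_in_sigma` is hypothetical.

```python
import math
from statistics import NormalDist

# Large-sample assumption: use the normal quantile in place of the t-value.
z = NormalDist().inv_cdf(0.975)

def half_widths_in_sigma(n):
    """Return (CI half-width, PI half-width) in units of sigma for sample size n."""
    ci = z / math.sqrt(n)          # shrinks toward 0 as n grows
    pi = z * math.sqrt(1 + 1 / n)  # levels off near 1.96
    return ci, pi

for n in (100, 10_000, 1_000_000):
    ci, pi = half_widths_in_sigma(n)
    print(f"n={n:>9,}: CI ±{ci:.4f}σ  PI ±{pi:.4f}σ  PI/CI ratio {pi / ci:.0f}")
```

The CI column keeps shrinking while the PI column stays just above 1.96, so the PI/CI ratio grows like √(n + 1).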

Key Points

  • Never confuse "95% CI on mean" with "where 95% of samples fall"
  • TI has both confidence level AND coverage proportion
  • CI width shrinks like 1/√n; PI and TI approach finite widths
  • For small samples, use t-distribution (not Z)
  • Bayesian credible intervals differ in interpretation from frequentist CIs

Key Takeaways

  • CI: uncertainty about parameter; PI: uncertainty about single observation; TI: proportion of population
  • CI formula: x̄ ± t × (s / √n)
  • PI formula: x̄ ± t × s × √(1 + 1/n)
  • TI formula: x̄ ± k × s (k from tolerance tables)
  • CI narrowest; PI wider; TI widest
  • As n → ∞: CI shrinks to zero; PI approaches ±1.96σ
  • TI has two parameters: confidence level AND coverage proportion
  • Choose based on question: parameter, observation, or proportion
  • Normal distribution assumption required for basic formulas
  • Bayesian alternatives have different interpretation

Practice Questions

1. A research study estimates mean blood pressure from 30 patients: x̄ = 120 mmHg, s = 15 mmHg. Calculate the 95% CI for the population mean.
SE = 15/√30 = 2.74. t_{0.025, 29} ≈ 2.045. ME = 2.045 × 2.74 = 5.60. 95% CI = 120 ± 5.60 = (114.4, 125.6) mmHg. Interpretation: we are 95% confident the true mean blood pressure in this population is between 114.4 and 125.6 mmHg.
2. Using the same data, calculate the 95% PI for a single patient's blood pressure.
PI = x̄ ± t × s × √(1 + 1/n) = 120 ± 2.045 × 15 × √(1 + 1/30) = 120 ± 2.045 × 15 × 1.0165 = 120 ± 31.18. 95% PI = (88.82, 151.18) mmHg. Interpretation: we are 95% confident the next patient's blood pressure will fall between 88.82 and 151.18 mmHg. Much wider than CI.
3. Why does the PI become much wider than the CI?
Because the PI accounts for two sources of uncertainty: (1) uncertainty about where the true mean is (captured by the 1/n term), and (2) variability of individual observations around the true mean (captured by the 1 term in √(1 + 1/n)). The CI only captures uncertainty about the mean. For a single observation, individual variability dominates: the standard deviation s is √n times the standard error s/√n, which is about 5.5× for n = 30.
4. A manufacturer measures 25 widget weights: x̄ = 10.5g, s = 0.2g. They want 95% confidence that 95% of widgets fall within their reported tolerance range. What interval type and what range?
This is a tolerance interval (TI) question. With n = 25 and a 95%/95% two-sided TI, k ≈ 2.631 from tables. TI = 10.5 ± 2.631 × 0.2 = 10.5 ± 0.53 = (9.97, 11.03) g. They can report that with 95% confidence, 95% of widgets fall between 9.97 g and 11.03 g. A CI or PI would be wrong for this manufacturing quality question.
5. A research team wants to publish their findings. They estimate the average height is 170 cm with a 95% CI of (168, 172). Someone asks: "Does this mean 95% of people in the population have heights between 168 and 172?" What's the correct answer?
No. The CI (168, 172) refers to the POPULATION MEAN, not individual people. Individual heights have a much wider range due to person-to-person variability. If researchers want to state where 95% of people fall, they should calculate a tolerance interval (TI), which would be substantially wider. This is a common mistake in scientific communication.

FAQs

Common questions about this topic

Does a 95% confidence interval mean there is a 95% probability that the true parameter lies inside it?
No, not in frequentist statistics. A 95% CI means: if we repeated the sampling many times and computed intervals each time, 95% of those intervals would contain the true parameter. It is NOT a 95% probability that this specific interval contains the parameter. In Bayesian statistics, "credible intervals" do have the probability interpretation, but these are different from frequentist CIs even though they're often visually similar.

Why is a prediction interval so much wider than a confidence interval?
Because the PI accounts for two sources of uncertainty: (1) uncertainty about where the true mean is (this is small if n is large), and (2) natural variability of individual observations around the true mean (this is fundamentally tied to σ and doesn't go away with more data). Individual observations can be far from the mean even when the mean is known perfectly. The PI captures both uncertainties; the CI only captures the first.

When should I use a tolerance interval instead of a prediction interval?
Use TI when your question is about a PROPORTION of the population: "what range contains 95% of widgets" or "what range contains 99% of healthy patients." Use PI when your question is about ONE future observation: "what range contains the next widget's weight" or "what will this patient's cholesterol be?" TI is about population coverage; PI is about single-point prediction.

What if my data are not normally distributed?
The formulas shown assume a normal distribution. For moderately skewed data, you can: (1) transform the data (log, square root) to make it more normal, then apply the formulas to the transformed data; (2) use bootstrap methods for non-parametric intervals; (3) use the central limit theorem for a CI on the mean (works for large samples even if the data aren't normal); or (4) use distribution-specific formulas (e.g., gamma distribution for right-skewed data). For heavily non-normal data, bootstrap approaches are common.

How much wider is a prediction interval than a confidence interval?
At the same confidence level and on the same data, the PI from the formulas above is exactly √(n + 1) times as wide as the CI: about 3.3× for n = 10 and 5.1× for n = 25. As n grows, the CI shrinks while the PI approaches a finite width (about ±1.96σ). This makes the PI insensitive to sample size beyond a certain point, while the CI keeps shrinking, so the ratio PI/CI increases with sample size: roughly 10× for n = 100 and 100× for n = 10,000. Collecting more data helps the CI but has limited impact on the PI.
