📈
Fundamentals · Intermediate · 25-35 minutes

Discrete vs Continuous Distributions: How to Choose

A practical guide to choosing between discrete and continuous probability distributions: the conceptual difference, four common discrete distributions, four common continuous distributions, decision criteria for each, and a flowchart for matching data to distribution.

What You'll Learn

  • ✓ Distinguish discrete from continuous random variables
  • ✓ Identify four common discrete distributions and their uses
  • ✓ Identify four common continuous distributions and their uses
  • ✓ Apply decision criteria for distribution choice
  • ✓ Match data scenarios to appropriate distributions

1. The Conceptual Difference

A discrete random variable takes specific, separable values: counts, integers, categories. Examples: number of customer arrivals per hour (0, 1, 2, ...), number of defective products in a batch, number of heads in 10 coin flips. The variable can only assume specific values; there is no continuum.

A continuous random variable takes any value within a range. Examples: height in cm (170.5, 170.51, 170.515, ...), time to complete a task, temperature, financial returns. Between any two values, infinitely many intermediate values exist.

This distinction drives the form of the probability function. Discrete distributions have a probability mass function (PMF): P(X = k) gives the probability that the variable equals each specific value. Continuous distributions have a probability density function (PDF): f(x) gives density, and probabilities are computed as areas under the curve over intervals: P(a < X < b) = integral of f(x) from a to b. For continuous distributions, P(X = exactly some specific value) = 0, because the area at a single point is zero.

Mixed distributions exist (e.g., a variable that is sometimes zero and sometimes continuous) but are less common in introductory statistics. The discrete/continuous distinction handles the vast majority of practical cases.

Key Points

  • Discrete: separable values (counts, integers, categories)
  • Continuous: any value within a range (heights, times, temperatures)
  • Discrete: PMF gives P(X = k) for each specific value
  • Continuous: PDF gives density; P(a < X < b) = integral of density
  • P(X = exact value) = 0 for continuous distributions
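The PMF/PDF contrast can be made concrete with a small stdlib-only sketch (function names are mine): the binomial PMF returns P(X = k) directly, while a normal probability must be computed as an area, approximated here with a trapezoidal rule.

```python
import math

# Discrete: the binomial PMF gives P(X = k) directly.
# P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)
def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Continuous: the normal PDF gives density, not probability.
def normal_pdf(x, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Probabilities come from integrating the density over an interval
# (simple trapezoidal rule here).
def normal_prob(a, b, mu=0.0, sigma=1.0, steps=10_000):
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    total += sum(normal_pdf(a + i * h, mu, sigma) for i in range(1, steps))
    return total * h

print(binomial_pmf(5, 10, 0.5))  # P(5 heads in 10 fair flips) ~ 0.2461
print(normal_prob(-1, 1))        # P(-1 < Z < 1) ~ 0.6827
print(normal_prob(1.5, 1.5))     # P(X = exactly 1.5) = 0.0
```

The last line illustrates the key point above: for a continuous variable, the "area" over a single point is zero.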

2. Four Common Discrete Distributions

Binomial. Number of successes in n independent trials with constant success probability p. Parameters: n, p. Mean = np. Variance = np(1-p). Use for fixed-sample-size scenarios with binary outcomes: defects in a batch, conversions in a sample, heads in coin flips.

Poisson. Number of events in a fixed interval at a constant rate. Parameter: lambda. Mean = Variance = lambda. Use for arrival, defect, or count data where events occur independently at a constant rate: customer arrivals per hour, defects per page, calls per minute. The Poisson approximates the binomial when n is large and p is small (set lambda = np).

Geometric. Number of trials until the first success. Parameter: p. Mean = 1/p. Use for "time to first success" scenarios with constant probability: number of customer calls before the first sale, number of submissions before acceptance.

Negative Binomial. Number of trials until r successes (or, in an alternative parameterization, number of failures before r successes). Parameters: r, p. Use when extending the geometric to multiple required successes, or as an overdispersion-tolerant alternative to the Poisson when the variance is much larger than the mean.

Key Points

  • Binomial: number of successes in fixed n trials with constant p
  • Poisson: count of events in interval at constant rate
  • Geometric: number of trials until first success
  • Negative Binomial: number of trials until r successes OR overdispersed Poisson alternative
  • All four are countable / integer-valued
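As a quick numeric check, the four PMFs can be written directly from the formulas above (a stdlib-only sketch; function names are mine), including the Poisson approximation to the binomial for large n and small p:

```python
import math

# PMFs for the four discrete families (parameter names follow the text).
def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def geometric_pmf(k, p):        # k = trial number of the first success, k >= 1
    return (1 - p) ** (k - 1) * p

def neg_binomial_pmf(k, r, p):  # k = trial number of the r-th success, k >= r
    return math.comb(k - 1, r - 1) * p**r * (1 - p) ** (k - r)

# Poisson approximates binomial when n is large and p is small (lambda = n*p).
n, p = 1000, 0.003
print(binomial_pmf(2, n, p))    # ~ 0.224
print(poisson_pmf(2, n * p))    # ~ 0.224, nearly identical

# Geometric mean check: sum of k * P(X = k) approaches 1/p.
print(sum(k * geometric_pmf(k, 0.2) for k in range(1, 500)))  # ~ 5.0
```

Note that the negative binomial with r = 1 reduces to the geometric, which is why the text describes it as the multi-success extension.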

3. Four Common Continuous Distributions

Normal (Gaussian). Symmetric bell curve. Parameters: mean (mu), SD (sigma). Use for heights, weights, test scores, measurement errors. Many natural phenomena are approximately normal, and the Central Limit Theorem produces approximately normal sampling distributions for sample means.

Exponential. Time between events in a Poisson process. Parameter: rate (lambda). Mean = 1/lambda. Variance = 1/lambda^2. Use for time between arrivals at a service counter or time between failures of components. Memoryless property: P(X > t + s | X > s) = P(X > t); the remaining wait time does not depend on how long you have already waited.

Log-normal. The log of the variable is normally distributed. Use for variables that are always positive and right-skewed: income, stock prices, file sizes, sound intensity. If Y is log-normal, log(Y) is normal; many transformations exploit this.

Uniform. Equal probability across a range [a, b]. Parameters: a, b. Mean = (a+b)/2. Variance = (b-a)^2 / 12. Use for random number generation and scenarios with no prior preference for any value in a range. Limited real-world use, but it appears in simulations and Bayesian priors.

Key Points

  • Normal: symmetric bell; many natural phenomena and the CLT
  • Exponential: time between events in Poisson process (memoryless)
  • Log-normal: positive right-skewed (income, prices, sizes)
  • Uniform: equal probability across a range
  • All four are real-valued / continuous on their support
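A short stdlib-only sketch (names and the mu/sigma values are illustrative, not from the text) verifying two claims above: the exponential's memoryless property, and the log-normal link between log(Y) and the normal:

```python
import math
import random

# Exponential survival function: P(X > t) = exp(-lambda * t).
def exp_survival(t, lam):
    return math.exp(-lam * t)

# Memoryless property: P(X > t + s | X > s) = P(X > t).
lam, s, t = 0.5, 2.0, 3.0
conditional = exp_survival(t + s, lam) / exp_survival(s, lam)
print(conditional, exp_survival(t, lam))  # the two values match exactly

# Log-normal link: if log(Y) is Normal(mu, sigma), then Y = exp(Z) is
# log-normal. mu and sigma here are hypothetical log-income-scale values.
random.seed(0)
mu, sigma = 10.0, 0.5
incomes = [math.exp(random.gauss(mu, sigma)) for _ in range(100_000)]
sample_mean = sum(incomes) / len(incomes)
print(sample_mean)  # close to exp(mu + sigma^2 / 2), the log-normal mean
```

The simulated incomes are all positive and right-skewed, which is exactly the shape criterion the flowchart below uses for log-normal.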

4. Decision Flowchart

Step 1: Is the variable a count or a measurement? Counts are discrete; measurements are continuous (or treated as continuous at a fine enough scale).

Step 2 (discrete): How are events generated? Fixed trials with binary outcomes: Binomial. Events in a fixed interval at a constant rate: Poisson. Trials until the first success: Geometric. Trials until r successes, or overdispersed counts: Negative Binomial.

Step 3 (continuous): What are the shape and support? Symmetric and unbounded: Normal. Positive and right-skewed: Log-normal. Memoryless time between events: Exponential. Equal probability over a range: Uniform.

Step 4: Run goodness-of-fit checks. Q-Q plots compare observed to theoretical quantiles. Statistical tests (Kolmogorov-Smirnov, Anderson-Darling, chi-square) test fit formally. Visual inspection often reveals deviations that the tests miss.

Step 5: If no standard distribution fits well, consider transformations (log, square root, Box-Cox) or non-parametric methods that do not require distributional assumptions. Permutation tests and bootstrapping are robust alternatives.

Key Points

  • Step 1: count or measurement?
  • Step 2 (discrete): match generation mechanism to distribution
  • Step 3 (continuous): match shape and support to distribution
  • Step 4: goodness-of-fit checks (Q-Q plot, KS test, visual)
  • Step 5: transformations or non-parametric methods as fallback
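Steps 1-3 of the flowchart can be sketched as a simple lookup (a toy illustration; the function name and category strings are mine, and Step 4's goodness-of-fit checks still apply before committing to any choice):

```python
# Minimal sketch of flowchart Steps 1-3: map a data description to a
# candidate distribution. Real data still needs Step 4's fit checks.
def suggest_distribution(is_count: bool, mechanism: str) -> str:
    discrete = {
        "fixed trials, binary outcome": "Binomial",
        "events per interval, constant rate": "Poisson",
        "trials until first success": "Geometric",
        "trials until r successes / overdispersed counts": "Negative Binomial",
    }
    continuous = {
        "symmetric, unbounded": "Normal",
        "positive, right-skewed": "Log-normal",
        "memoryless time between events": "Exponential",
        "equal probability over a range": "Uniform",
    }
    table = discrete if is_count else continuous
    # Step 5 fallback when nothing in the table matches.
    return table.get(mechanism,
                     "no standard match: try a transformation or non-parametric methods")

print(suggest_distribution(True, "events per interval, constant rate"))  # Poisson
print(suggest_distribution(False, "positive, right-skewed"))             # Log-normal
```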

5. How StatsIQ Helps With Distribution Choice

Snap a photo of any data sample and StatsIQ identifies the best-fitting distribution from a library of 15+ common distributions, runs goodness-of-fit tests, produces Q-Q plots, and recommends transformations for poor fits. For exam prep, the app generates problems at all complexity levels including distribution-identification questions. StatsIQ also handles applied problems: given the chosen distribution, compute probabilities, percentiles, and sample-size requirements for various inferential goals. This content is for educational purposes only.

Key Points

  • Identifies best-fitting distribution from 15+ candidates
  • Runs goodness-of-fit tests
  • Produces Q-Q plots and visual diagnostics
  • Recommends transformations for poor fits
  • Computes probabilities under chosen distribution

Key Takeaways

  • ★ Discrete: separable values (counts, integers)
  • ★ Continuous: any value in a range (measurements)
  • ★ Discrete distributions have PMF: P(X = k)
  • ★ Continuous distributions have PDF: f(x), probabilities via integration
  • ★ P(X = exact value) = 0 for continuous
  • ★ Four common discrete: Binomial, Poisson, Geometric, Negative Binomial
  • ★ Four common continuous: Normal, Exponential, Log-normal, Uniform
  • ★ Poisson approximates binomial when n large, p small
  • ★ Exponential is the continuous analog of geometric (memoryless)
  • ★ Log-normal: log(Y) is normal; positive right-skewed variables
  • ★ CLT produces normal sampling distributions for sample means
  • ★ Goodness-of-fit: Q-Q plot, KS test, Anderson-Darling, Chi-square

Practice Questions

1. A company tracks the number of customer service calls per hour. What distribution would you propose?
Poisson distribution with parameter lambda = average calls per hour. Counts of events occurring at a constant rate in fixed intervals are the canonical Poisson scenario. If the observed variance is much larger than the mean, consider the negative binomial.
2. A study measures time between customer arrivals at a service counter. What distribution would you propose?
Exponential distribution with rate parameter lambda. Time between events in a Poisson arrival process is exponential. Memoryless property fits well for unpredictable arrivals.
3. Heights of adult women in a country. What distribution would you propose?
Normal distribution. Heights are continuous, approximately symmetric, and well-modeled by normal in human populations. Use sample mean and SD to specify parameters.
4. Distribution of household income. What distribution would you propose?
Log-normal distribution. Income is positive (never negative), right-skewed (long tail of high earners), and typically well-fit by log-normal. Taking the log of income usually produces an approximately normal distribution suitable for standard inference.
5. Number of trials until a basketball player makes a free throw (with constant make probability). What distribution would you propose?
Geometric distribution with parameter p = make probability. Geometric models the number of trials until the first success in independent Bernoulli trials.
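The answer to question 5 can be checked by simulation (a small sketch with a hypothetical make probability p = 0.7): the average number of trials until the first success should be about 1/p.

```python
import random

# Simulation check for the geometric answer: trials until the first made
# free throw with make probability p should average about 1/p.
random.seed(42)
p = 0.7  # hypothetical free-throw make probability

def trials_until_first_success(p: float) -> int:
    k = 1
    while random.random() >= p:  # each loop iteration is one missed attempt
        k += 1
    return k

samples = [trials_until_first_success(p) for _ in range(50_000)]
avg = sum(samples) / len(samples)
print(avg)  # close to 1/p ~ 1.43
```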


FAQs

Common questions about this topic

Can a variable be both discrete and continuous?

Yes, in some cases. A variable might be mostly zero (a discrete value) and continuously distributed when positive. Example: insurance claim amounts. Most policyholders have zero claims; others have continuous claim amounts. This is modeled with a zero-inflated continuous distribution, a mixture model. Software supports these (zero-inflated negative binomial, zero-inflated log-normal, etc.). For introductory statistics, focus on the pure discrete or pure continuous case.

How much does the normality assumption matter in practice?

Less than students fear, in many cases. The Central Limit Theorem provides robust normality for sample means with reasonable sample sizes. Many statistical methods are robust to moderate violations. Bootstrap methods are distribution-free. However, with small samples or extreme distributions, the assumption matters more. Always plot the data, run normality tests, and consider transformations or non-parametric alternatives when assumptions clearly fail.

What is the difference between a distribution and its parameters?

A distribution is the family or shape (normal, Poisson, etc.). Parameters are the specific values that pin down the distribution within its family. Normal with mean = 0 and SD = 1 is different from Normal with mean = 100 and SD = 15: same family, different parameters. Estimating parameters from data is what most of statistical inference is about: given the family, what are the most likely parameter values?

How do I know whether my data follows a particular distribution?

You can never know with certainty; real data rarely matches any theoretical distribution perfectly. Statistical tests (Kolmogorov-Smirnov, Anderson-Darling, Shapiro-Wilk for normality) assess whether the data is "close enough" to the proposed distribution. Visual checks (Q-Q plots, histograms) often reveal deviations the tests cannot. The pragmatic approach: choose a distribution that fits reasonably well, run sensitivity analyses with alternatives, and use robust methods when the fit is uncertain.

When should I use non-parametric methods instead?

When the distributional assumption is doubtful or untestable due to small sample size. Non-parametric methods (rank tests, permutation tests, bootstrap) make weaker assumptions and produce valid inference under broader conditions. The trade-off: when the parametric assumption is correct, parametric methods are more powerful; when assumptions are violated, non-parametric methods are safer. A common modern workflow is to run both and check for consistency.

How does StatsIQ help with choosing a distribution?

Snap a photo of any data sample and StatsIQ identifies the best-fitting distribution from 15+ candidates, runs goodness-of-fit tests, produces Q-Q plots and visual diagnostics, and recommends transformations for poor fits. The app also computes probabilities under the chosen distribution and produces practice problems for exam prep. This content is for educational purposes only.
