Binomial vs Poisson Distribution: When to Use Each
A focused comparison of two of the most common discrete probability distributions: the binomial (fixed number of trials, two outcomes) and the Poisson (count of events in a fixed interval, rate-based). Covers the assumptions, formulas, when each fits the data, and the Poisson-as-binomial-limit relationship.
What You'll Learn
- ✓ State the formal assumptions of binomial and Poisson distributions
- ✓ Identify which fits a given data scenario
- ✓ Compute probabilities under each
- ✓ Explain the Poisson-as-binomial-limit relationship
- ✓ Apply both to worked examples in QA, website traffic, and biology
1. Binomial Distribution: Fixed Trials, Two Outcomes
The binomial distribution models the number of successes in a fixed number of independent trials, each with the same probability of success.

Three required assumptions: (1) a fixed number of trials, n; (2) each trial has exactly two possible outcomes (success or failure); (3) trials are independent with constant success probability p.

Probability mass function. P(X = k) = C(n,k) × p^k × (1-p)^(n-k), where C(n,k) is the binomial coefficient "n choose k." Mean = np. Variance = np(1-p).

Classic examples. (1) Number of heads in 10 coin flips (n=10, p=0.5). (2) Number of defective items in a sample of 50 (n=50, p = defect rate). (3) Number of customers who purchase in a sample of 100 visitors (n=100, p = conversion rate). All three share a fixed sample size, a binary outcome, and a constant probability per trial.

When the binomial fits poorly. (1) The probability varies from trial to trial (heterogeneous p). (2) Trials are not independent (clustered customer behavior, repeated measurements on the same units). (3) The number of trials is not fixed in advance.
Key Points
- • Binomial: fixed trials n, binary outcome, constant p
- • P(X=k) = C(n,k) × p^k × (1-p)^(n-k)
- • Mean = np, Variance = np(1-p)
- • Classic: coin flips, defects in a batch, conversions in a sample
- • Fails when: heterogeneous p, dependent trials, variable n
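The binomial PMF and its moments can be sketched in plain Python using only the standard library (the function name `binomial_pmf` is illustrative, not from any particular package):

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for Binomial(n, p): C(n, k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Example: probability of exactly 5 heads in 10 fair coin flips
print(round(binomial_pmf(5, 10, 0.5), 4))  # 0.2461

# The mean computed from the PMF matches the formula np
n, p = 10, 0.5
mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
print(round(mean, 4))  # 5.0
```

`math.comb` computes the binomial coefficient exactly as an integer, so this avoids the floating-point issues of computing factorials directly.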
2. Poisson Distribution: Count of Events in an Interval
The Poisson distribution models the number of events occurring in a fixed interval (of time, space, or volume) when events occur independently at a constant average rate.

Three required assumptions: (1) events occur independently; (2) the average rate (lambda) is constant; (3) the probability of two events occurring at exactly the same instant is zero.

Probability mass function. P(X = k) = (lambda^k × e^(-lambda)) / k!, where lambda is the average number of events per interval. Mean = lambda. Variance = lambda (the two are equal, a defining feature of the Poisson distribution).

Classic examples. (1) Number of customer arrivals per hour at a retail store. (2) Number of typos per page in a manuscript. (3) Number of phone calls received per minute at a help desk. (4) Number of radioactive decays per second from a sample. All four count events in a fixed interval, with a constant average rate and independent occurrences.

When the Poisson fits poorly. (1) The rate varies over time. (2) Events cluster (positive correlation). (3) The sample variance is much larger than the mean (overdispersion). Overdispersion is the most common sign of misfit: in real data the variance often exceeds the mean, suggesting a negative binomial distribution (which has a separate variance parameter) may fit better.
Key Points
- • Poisson: count of events in a fixed interval at constant rate
- • P(X=k) = (lambda^k × e^(-lambda)) / k!
- • Mean = Variance = lambda (defining feature)
- • Classic: arrivals, defects, calls, decays
- • Fails when: variance >> mean (use negative binomial instead)
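The Poisson PMF is just as short in standard-library Python (the function name `poisson_pmf` is illustrative):

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for Poisson(lam): lam^k * e^(-lam) / k!."""
    return lam**k * exp(-lam) / factorial(k)

# Example: a help desk averages 12 calls per hour.
# Probability of exactly 8 calls in a given hour:
print(round(poisson_pmf(8, 12), 4))  # 0.0655
```

Note that lambda is the only parameter: the same code serves for arrivals per hour, typos per page, or decays per second, as long as the interval is fixed.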
3. Binomial vs Poisson: Side-by-Side Comparison
The two distributions are related but apply to different problems.

| Feature | Binomial | Poisson |
|---|---|---|
| Models | Number of successes in n trials | Number of events in an interval |
| Required parameter(s) | n (trials), p (success probability) | lambda (rate) |
| Mean | np | lambda |
| Variance | np(1-p) | lambda |
| Mean = Variance? | No (variance < mean for any p > 0) | Yes (defining feature) |
| Domain | 0, 1, 2, ..., n | 0, 1, 2, ... (unbounded) |
| Sum of independent variables | Binomial(n1+n2, p) if same p | Poisson(lambda1+lambda2) |
| Common in | Quality control, conversion testing | Arrivals, defects, counts |

Key conceptual differences. The binomial has a hard cap at n: you cannot have more successes than trials. The Poisson is unbounded; in principle 1,000,000 events could occur, just with vanishingly small probability. The binomial requires knowing both n and p separately; the Poisson collapses these into a single rate parameter.
Key Points
- • Binomial: fixed n, requires both n and p
- • Poisson: unbounded, single parameter lambda
- • Mean and variance equal under Poisson only
- • Binomial sums require same p
- • Poisson sums combine rates linearly
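The mean/variance contrast in the table can be verified numerically by computing both moments directly from each PMF. A minimal standard-library sketch (parameter values chosen for illustration):

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def pois_pmf(k, lam):
    return lam**k * exp(-lam) / factorial(k)

n, p, lam = 20, 0.3, 4.0

# Moments computed directly from the binomial PMF (finite support 0..n)
b_mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))
b_var = sum((k - b_mean) ** 2 * binom_pmf(k, n, p) for k in range(n + 1))

# Poisson support is unbounded; truncate where the tail is negligible
p_mean = sum(k * pois_pmf(k, lam) for k in range(100))
p_var = sum((k - p_mean) ** 2 * pois_pmf(k, lam) for k in range(100))

print(round(b_mean, 4), round(b_var, 4))  # 6.0 4.2  (np and np(1-p))
print(round(p_mean, 4), round(p_var, 4))  # 4.0 4.0  (mean equals variance)
```

The binomial variance 4.2 is strictly below its mean 6.0, while the Poisson moments coincide at lambda, matching the table row above.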
4. The Poisson-as-Binomial-Limit Relationship
When n is large and p is small (with np remaining moderate), the binomial distribution approaches the Poisson distribution. Rule of thumb: when n > 20 and p < 0.05 (or n > 100 and p < 0.1), the Poisson approximation with lambda = np works well.

Proof intuition. As n grows and p shrinks with np = lambda held constant, the binomial PMF converges to the Poisson PMF. This is why the Poisson is sometimes called "the distribution of rare events."

Practical uses. (1) Computing rare-event probabilities: with a defect rate of 0.001 and a sample size of 10,000, the exact binomial computation involves C(10000, k), which is computationally heavy; the Poisson approximation with lambda = 10 gives nearly the same answer with much simpler math. (2) Manufacturing QA. (3) Counting rare adverse events in clinical trials.

The relationship also informs which distribution to use when both could apply. If trials are clearly enumerated (1,000 widgets, each of which can be defective), the binomial is the natural choice. If events are arrivals or counts in a continuum (defects along a continuous strip of fabric), the Poisson is natural. When trials are enumerated AND numerous AND p is small, the two are nearly equivalent numerically: use whichever is easier to compute.
Key Points
- • Poisson approximates binomial when n is large and p is small
- • Rule of thumb: n > 20, p < 0.05 OR n > 100, p < 0.1
- • Set lambda = np for Poisson approximation
- • Poisson is the "distribution of rare events"
- • Computational benefit for large n with small p
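The defect-rate example above can be checked directly: compute the exact binomial probability and its Poisson approximation side by side (standard library only):

```python
from math import comb, exp, factorial

# Rare-event example from the text: defect rate 0.001, sample of 10,000
n, p = 10_000, 0.001
lam = n * p  # lambda = np = 10

k = 8
exact = comb(n, k) * p**k * (1 - p) ** (n - k)  # exact Binomial P(X = 8)
approx = lam**k * exp(-lam) / factorial(k)       # Poisson approximation

print(round(exact, 6), round(approx, 6))  # the two differ by less than 1e-4
```

`math.comb(10000, 8)` is still feasible here, but the Poisson form avoids large combinatorial terms entirely, which is the computational benefit the rule of thumb points to.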
5. How StatsIQ Helps With Distribution Choice
Snap a photo of any count data and StatsIQ identifies which distribution fits best (binomial, Poisson, negative binomial, geometric, others). The app computes probabilities under each, runs goodness-of-fit tests, and flags overdispersion that would suggest a different distribution. For study design, StatsIQ produces simulation-based power analyses for both binomial and Poisson outcomes. For exam prep, the app generates problems at varying difficulty levels covering all common discrete distributions. This content is for educational purposes only.
Key Points
- • Identifies best-fitting distribution from data
- • Computes probabilities under each candidate
- • Runs goodness-of-fit tests
- • Flags overdispersion (Poisson assumption violation)
- • Simulation-based power analyses for design
Key Takeaways
- ★ Binomial: fixed n trials, binary outcome, constant p
- ★ Binomial PMF: P(X=k) = C(n,k) × p^k × (1-p)^(n-k)
- ★ Binomial mean = np, variance = np(1-p)
- ★ Poisson: count of events in fixed interval at constant rate
- ★ Poisson PMF: P(X=k) = (lambda^k × e^(-lambda)) / k!
- ★ Poisson mean = variance = lambda (defining feature)
- ★ Binomial domain: 0, 1, ..., n (bounded)
- ★ Poisson domain: 0, 1, 2, ... (unbounded)
- ★ Poisson approximates binomial when n large, p small
- ★ Rule of thumb: n > 20, p < 0.05, set lambda = np
- ★ Overdispersion (variance >> mean) → consider negative binomial
- ★ Sum of independent Poissons combines rates linearly
Practice Questions
1. A factory produces 1,000 widgets with a 2% defect rate. What distribution models the number of defective widgets in this batch? What is the expected number?
2. A help desk receives 12 calls per hour on average. What is the probability of receiving exactly 8 calls in a given hour?
3. How does the binomial differ when n is large and p is small versus moderate?
4. A manufacturer reports a defect rate of 5 per 1,000 units. Over a production run of 200 units, what distribution models the number of defects, and what is the expected count?
5. Sample data show count variance much higher than count mean. What does this suggest about the distributional choice?
FAQs
Common questions about this topic
Why are the mean and variance of a Poisson distribution equal?
It is a mathematical property of the distribution. The PMF P(X=k) = lambda^k × e^(-lambda) / k! has the property that E[X] = lambda and Var(X) = lambda. This is built into the distribution by design: the Poisson models a memoryless arrival process in which the variance scales linearly with the mean. The equality is so distinctive that it serves as a quick check: if your data have variance much larger than the mean, the Poisson does not fit and a different distribution should be considered.
When should I use the negative binomial distribution instead of the Poisson?
When count data show overdispersion, i.e., variance exceeding the mean. The negative binomial has two parameters (mean and dispersion) and can model variance > mean. Common applications: insurance claim counts, internet traffic packets, microbiome species counts. Negative binomial regression is the standard alternative to Poisson regression for overdispersed counts, and software (R, Stata, Python) supports both.
Can real data violate the assumptions of both distributions?
Absolutely. Real count data often violate both. Heterogeneous rates, time-varying intensities, and clustering can all break the assumptions. Other discrete distributions to consider: negative binomial (overdispersion), geometric (trials until first success), hypergeometric (sampling without replacement), zero-inflated Poisson (excess zeros). For mixed processes, mixture models or hierarchical Bayesian approaches may be needed. The principle: distribution choice is data-driven, not assumed.
How do the binomial and Poisson relate to the normal distribution?
Both can be approximated by the normal distribution when their parameters are large. A Binomial(n, p) is approximately Normal(np, np(1-p)) when n is large and p is not too extreme (rule of thumb: np > 5 AND n(1-p) > 5). A Poisson(lambda) is approximately Normal(lambda, lambda) when lambda > 10. The normal approximation is most useful for continuity-corrected probability calculations and for hypothesis testing. Software handles these conversions automatically.
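The normal approximation described above can be checked with a short standard-library sketch comparing the exact Poisson CDF against its continuity-corrected normal counterpart (function names are illustrative; lambda = 20 chosen as an example large enough for the approximation):

```python
from math import erf, exp, factorial, sqrt

def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for Poisson(lam), by summing the PMF."""
    return sum(lam**i * exp(-lam) / factorial(i) for i in range(k + 1))

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of Normal(mu, sigma^2), expressed via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

lam = 20  # large enough for the normal approximation to be reasonable

# P(X <= 25): exact Poisson vs continuity-corrected Normal(lam, lam)
exact = poisson_cdf(25, lam)
approx = normal_cdf(25 + 0.5, lam, sqrt(lam))  # continuity correction: k + 0.5
print(round(exact, 4), round(approx, 4))
```

The `+ 0.5` is the continuity correction: a discrete count of "at most 25" corresponds to the continuous region below 25.5.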
What is the difference between the binomial coefficient and the binomial distribution?
The binomial coefficient C(n,k), also written as "n choose k" or n!/(k!(n-k)!), counts the number of ways to choose k items from n. It is a combinatorial quantity, not a distribution. The binomial distribution USES the binomial coefficient in its PMF, but the two are different things. Coefficient: how many ways. Distribution: the probability of each count of successes given the success probability.
How does StatsIQ help with discrete distributions?
Snap a photo of any count data and StatsIQ identifies which distribution fits best (binomial, Poisson, negative binomial, geometric, others), computes probabilities under each candidate, runs goodness-of-fit tests, and flags overdispersion. For exam prep, the app generates problems at varying difficulty levels covering all common discrete distributions. This content is for educational purposes only.