Binomial Distribution
The binomial distribution models the number of successes in a fixed number of independent Bernoulli trials, each with the same probability of success. It is one of the most commonly used discrete distributions in statistics and is fundamental to hypothesis testing for proportions. The distribution arises whenever you count the number of times an event occurs in a fixed number of independent attempts.
Formula
P(X = k) = C(n, k) · p^k · (1 - p)^(n - k), where C(n, k) = n! / (k!(n - k)!)
Mean (Expected Value)
np
Variance
np(1 - p)
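The formula, mean, and variance above can be computed directly with the standard library. The sketch below is a minimal pure-Python implementation (function names are illustrative, not from any particular package):

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_mean(n: int, p: float) -> float:
    """Expected value: np."""
    return n * p

def binom_var(n: int, p: float) -> float:
    """Variance: np(1 - p)."""
    return n * p * (1 - p)

# Sanity check: the PMF sums to 1 over k = 0..n, as any distribution must.
total = sum(binom_pmf(k, 10, 0.3) for k in range(11))
```

For production work you would normally reach for `scipy.stats.binom` instead, but the direct formula is useful for checking hand calculations.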
Parameters
n: The fixed number of independent trials performed. Must be a positive integer (n ≥ 1).
p: The probability of success on each individual trial. Must satisfy 0 ≤ p ≤ 1.
Key Properties
- Each trial is independent with exactly two outcomes: success (probability p) or failure (probability 1 - p)
- X can take integer values from 0 to n
- When p = 0.5, the distribution is symmetric; when p < 0.5 it is right-skewed; when p > 0.5 it is left-skewed
- The sum of independent binomial random variables with the same p is also binomial: if X ~ Bin(n₁, p) and Y ~ Bin(n₂, p), then X + Y ~ Bin(n₁ + n₂, p)
- For large n, can be approximated by the normal distribution N(np, np(1 - p)) when np ≥ 10 and n(1 - p) ≥ 10
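The additivity property can be verified numerically: the PMF of X + Y is the convolution of the two individual PMFs, and it should match Bin(n₁ + n₂, p) term by term. A small sketch (parameter values chosen arbitrarily for illustration):

```python
from math import comb, isclose

def pmf(k, n, p):
    # Binomial PMF: C(n, k) * p^k * (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

# If X ~ Bin(3, 0.4) and Y ~ Bin(5, 0.4) are independent, the PMF of
# X + Y is the convolution of their PMFs and equals Bin(8, 0.4).
n1, n2, p = 3, 5, 0.4
for s in range(n1 + n2 + 1):
    conv = sum(pmf(i, n1, p) * pmf(s - i, n2, p)
               for i in range(max(0, s - n2), min(n1, s) + 1))
    assert isclose(conv, pmf(s, n1 + n2, p))
```

Note that the property requires the same p for both variables; sums of binomials with different success probabilities are not binomial in general.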
Example
A multiple-choice exam has 20 questions, each with 5 choices. A student guesses randomly on every question. What is the probability that the student gets exactly 6 questions correct?
Here n = 20, p = 1/5 = 0.2, and k = 6. P(X = 6) = C(20, 6) · (0.2)^6 · (0.8)^14 ≈ 38760 · 0.000064 · 0.04398 = 38760 · 2.8147 × 10⁻⁶ ≈ 0.1091.
Result: P(X = 6) ≈ 0.1091, or about 10.91%
There is approximately a 10.91% chance the student gets exactly 6 out of 20 questions correct by random guessing. The expected number correct is np = 20 × 0.2 = 4, so getting 6 is above average but not exceptionally unlikely.
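The worked example above is easy to reproduce in a couple of lines, which is a good habit for checking hand arithmetic:

```python
from math import comb

# Exam example: n = 20 questions, p = 1/5 chance of guessing correctly,
# k = 6 correct answers.
n, p, k = 20, 0.2, 6
prob = comb(n, k) * p**k * (1 - p)**(n - k)
# comb(20, 6) == 38760, and prob comes out to roughly 0.1091.
```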
When to Use
- ✓ When counting successes in a fixed number of independent trials with constant success probability (e.g., coin flips, defective items in a batch with replacement)
- ✓ When modeling yes/no outcomes across repeated experiments (pass/fail, accept/reject, heads/tails)
- ✓ When performing hypothesis tests for population proportions
- ✓ When modeling the number of correct answers on a multiple-choice test from random guessing
Common Mistakes
- ✗ Using the binomial distribution when trials are not independent. If sampling without replacement from a finite population, use the hypergeometric distribution instead.
- ✗ Forgetting that the binomial counts the total number of successes, not the probability of success on a specific trial.
- ✗ Confusing P(X = k) with P(X ≤ k). Finding cumulative probabilities requires summing: P(X ≤ k) = ∑ P(X = i) for i = 0 to k.
- ✗ Applying the normal approximation when np or n(1 - p) is less than 10, which leads to poor accuracy especially in the tails.
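The last mistake is easy to demonstrate numerically. The sketch below (a pure-Python comparison, with the continuity-corrected normal CDF built from `math.erf`) shows that the approximation error is small when np and n(1 - p) are well above 10, and noticeably larger when np = 2:

```python
from math import comb, erf, sqrt

def binom_cdf(k, n, p):
    # Exact P(X <= k) by summing the PMF.
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x, mu, sigma):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def normal_approx_cdf(k, n, p):
    # Normal approximation with continuity correction: P(X <= k) ~ Phi(k + 0.5).
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    return normal_cdf(k + 0.5, mu, sigma)

# n = 100, p = 0.5: np = n(1 - p) = 50, comfortably inside the rule of thumb.
good = abs(binom_cdf(45, 100, 0.5) - normal_approx_cdf(45, 100, 0.5))
# n = 20, p = 0.1: np = 2, well outside it; the tail error is much larger.
bad = abs(binom_cdf(0, 20, 0.1) - normal_approx_cdf(0, 20, 0.1))
```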
FAQs
Common questions about Binomial Distribution
What conditions must be met to use the binomial distribution?
The four conditions (often remembered as BINS) are: (1) Binary -- each trial has exactly two outcomes (success or failure). (2) Independent -- the outcome of one trial does not affect another. (3) Number -- there is a fixed number of trials, n. (4) Success probability -- the probability of success, p, is the same for every trial. If any condition is violated, the binomial model is not appropriate.
How do I find cumulative binomial probabilities?
To find P(X ≤ k), sum the individual probabilities: P(X ≤ k) = P(X = 0) + P(X = 1) + ... + P(X = k). For P(X ≥ k), use the complement: P(X ≥ k) = 1 - P(X ≤ k - 1). Most calculators and statistical software have a cumulative binomial function (e.g., binomcdf on TI calculators, pbinom in R, or binom.cdf in Python). For large n, you can use the normal approximation with continuity correction.
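Both formulas translate directly into code. A minimal stdlib-only sketch (the function names mirror the common pmf/cdf/sf naming convention but are not from any specific library):

```python
from math import comb

def binom_pmf(k, n, p):
    # P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    # P(X <= k): sum the individual probabilities from 0 to k.
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def binom_sf(k, n, p):
    # P(X >= k): complement rule, 1 - P(X <= k - 1).
    return 1 - binom_cdf(k - 1, n, p)
```

For example, with the exam setup from earlier (n = 20, p = 0.2), `binom_cdf(4, 20, 0.2)` gives the probability of at most 4 correct guesses.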
What is the difference between the Bernoulli and binomial distributions?
A Bernoulli distribution is a special case of the binomial distribution with n = 1. It models a single trial with probability p of success and 1 - p of failure. The binomial distribution with parameters n and p is equivalent to the sum of n independent Bernoulli(p) random variables. So Bernoulli is for a single yes/no outcome, while binomial counts how many yes outcomes occur across multiple trials.