Geometric Distribution
The geometric distribution models the number of independent Bernoulli trials needed to achieve the first success. It is the discrete analog of the exponential distribution and the only discrete distribution with the memoryless property. The geometric distribution is commonly used in quality control to model the number of inspections until the first defective item is found, and in reliability to model the number of uses until first failure.
Formula
P(X = k) = (1 - p)^(k - 1) · p, for k = 1, 2, 3, ... (counting the trial of the first success)
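As a minimal sketch, the formula can be evaluated directly in Python, assuming the trials-until-first-success convention used here. The name `geometric_pmf` is illustrative and not taken from any library.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(X = k) when X counts trials up to and including the first success."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    if k < 1:
        return 0.0  # the support starts at k = 1
    return (1 - p) ** (k - 1) * p


# Sanity check: probabilities over a long prefix of the support sum to nearly 1.
total = sum(geometric_pmf(k, 0.3) for k in range(1, 200))
print(round(total, 6))
```

The sum is truncated at k = 199 only because the support is infinite; the omitted tail is negligibly small for p = 0.3.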
Mean (Expected Value)
1/p
Variance
(1 - p) / p²
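The mean 1/p and variance (1 - p)/p² can be checked numerically by summing over a long, truncated prefix of the support. This is only a sketch: the value p = 0.25 and the cutoff of 2000 terms are arbitrary choices for illustration.

```python
p = 0.25
ks = range(1, 2000)  # truncated support; the remaining tail is negligible here
pmf = [(1 - p) ** (k - 1) * p for k in ks]

# Mean and variance computed from their definitions as weighted sums.
mean = sum(k * q for k, q in zip(ks, pmf))
variance = sum((k - mean) ** 2 * q for k, q in zip(ks, pmf))

print(round(mean, 6), 1 / p)                  # numerical mean vs. 1/p
print(round(variance, 6), (1 - p) / p ** 2)   # numerical variance vs. (1 - p)/p^2
```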
Parameters
p: the probability of success on each independent trial. Must satisfy 0 < p ≤ 1. A smaller p means more trials are expected before the first success.
Key Properties
- Models the number of trials until (and including) the first success in a sequence of independent Bernoulli trials
- Memoryless property: P(X > s + t | X > s) = P(X > t). Past failures do not change the probability of future success.
- The only discrete distribution with the memoryless property
- X can take any positive integer value with no upper bound: the support is {1, 2, 3, ...}
- The CDF has a closed form: P(X ≤ k) = 1 - (1 - p)^k
- Some textbooks define the geometric as the number of failures before the first success, with support {0, 1, 2, ...} and mean (1 - p)/p
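Two of the properties above, the closed-form CDF and the memoryless property, can be checked numerically. The sketch below uses illustrative helper names (`pmf`, `cdf`, `survival`) and arbitrary values p = 0.3, s = 3, t = 2.

```python
import math


def pmf(k, p):
    return (1 - p) ** (k - 1) * p


def cdf(k, p):
    return 1 - (1 - p) ** k      # P(X <= k)


def survival(k, p):
    return (1 - p) ** k          # P(X > k): the first k trials all fail


p, s, t = 0.3, 3, 2

# Closed-form CDF agrees with summing the PMF over k = 1..5.
cdf_by_summation = sum(pmf(j, p) for j in range(1, 6))
cdf_matches = math.isclose(cdf_by_summation, cdf(5, p))

# Memoryless property: P(X > s + t | X > s) equals P(X > t).
conditional = survival(s + t, p) / survival(s, p)
memoryless_holds = math.isclose(conditional, survival(t, p))

print(cdf_matches, memoryless_holds)
```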
Example
A basketball player has a free-throw success rate of 80%. What is the probability that she makes her first free throw on the 3rd attempt?
Here p = 0.80 and k = 3. P(X = 3) = (1 - 0.80)^(3-1) · 0.80 = (0.20)^2 · 0.80 = 0.04 · 0.80 = 0.032.
Result: P(X = 3) = 0.032, or 3.2%
There is only a 3.2% chance that the player misses the first two free throws and then makes the third. This is quite unlikely for an 80% shooter. The expected number of attempts until the first success is 1/0.80 = 1.25 attempts, so she usually makes it on the first try.
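The same arithmetic can be reproduced in a few lines of Python. This sketch tabulates the first four probabilities for p = 0.80 to show how quickly they fall off; the variable names are illustrative.

```python
p = 0.80

# P(X = k) for the first few attempts.
probs = {k: (1 - p) ** (k - 1) * p for k in range(1, 5)}
for k, prob in probs.items():
    print(f"P(X = {k}) = {prob:.4f}")

expected_attempts = 1 / p
print(f"Expected attempts until first success: {expected_attempts}")
```

The entry for k = 3 corresponds to the 3.2% result in the example, and the entry for k = 1 is simply p itself.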
When to Use
- When counting the number of trials until the first success in a series of independent Bernoulli trials (number of attempts until passing an exam, number of sales calls until a sale)
- When modeling 'how long until an event happens for the first time' in discrete time
- In quality control to model the number of items inspected until the first defective item
- When the memoryless property is appropriate in a discrete setting (each trial is identical and independent of past outcomes)
Common Mistakes
- Confusing the two parameterizations: some books define X as the number of trials until success (support starts at 1, mean = 1/p), while others define it as the number of failures before success (support starts at 0, mean = (1-p)/p). Always check which convention your course uses.
- Applying the geometric distribution when trials are not independent. If the probability of success changes from trial to trial, the geometric model is not appropriate.
- Confusing the geometric distribution with the binomial. Geometric has a random number of trials and exactly one success; binomial has a fixed number of trials and counts total successes.
- Forgetting the memoryless property when solving conditional problems. P(X > 5 | X > 3) = P(X > 2), not a more complex calculation.
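The first mistake, mixing up the two parameterizations, is easy to make when switching between textbooks or software. The sketch below shows that the conventions differ only by a shift of one: if X counts trials and Y counts failures, then Y = X - 1. The function names and the value p = 0.2 are assumptions chosen for illustration.

```python
import math


def pmf_trials(k, p):
    """X = number of trials until the first success; k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p


def pmf_failures(j, p):
    """Y = number of failures before the first success; j = 0, 1, 2, ..."""
    return (1 - p) ** j * p


p = 0.2

# Same probabilities, shifted by one: P(Y = j) = P(X = j + 1).
shift_holds = all(
    math.isclose(pmf_failures(j, p), pmf_trials(j + 1, p)) for j in range(50)
)

mean_trials = 1 / p            # mean of X
mean_failures = (1 - p) / p    # mean of Y, which equals mean of X minus 1

print(shift_holds, mean_trials, mean_failures)
```

The variance is the same under both conventions, since shifting a random variable by a constant does not change its spread.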
FAQs
Common questions about Geometric Distribution
The key difference is what is fixed and what is random. In the binomial distribution, the number of trials n is fixed, and you count the random number of successes. In the geometric distribution, the number of successes is fixed at 1, and you count the random number of trials needed. Binomial answers 'how many successes in n trials?' while geometric answers 'how many trials until the first success?' They share the assumption of independent trials with constant success probability p.
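To make the contrast concrete, the sketch below compares two questions for the same p = 0.8 and three trials: the binomial probability of exactly one success in three trials, and the geometric probability that the first success lands on the third trial. The numbers are chosen only for illustration.

```python
from math import comb, isclose

p, n = 0.8, 3

# Binomial: exactly one success somewhere among n fixed trials.
binomial_one_success = comb(n, 1) * p * (1 - p) ** (n - 1)

# Geometric: failures on trials 1..n-1, then the first success on trial n.
geometric_first_on_nth = (1 - p) ** (n - 1) * p

print(round(binomial_one_success, 4), round(geometric_first_on_nth, 4))
print(isclose(binomial_one_success, comb(n, 1) * geometric_first_on_nth))
```

The binomial value is larger by a factor of comb(n, 1) = 3 because it allows the single success to fall on any of the three trials, while the geometric event fixes it on the last one.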
The geometric distribution is the discrete counterpart of the exponential distribution. Both are memoryless: the geometric is the only memoryless discrete distribution, and the exponential is the only memoryless continuous distribution. If events occur at discrete time steps with probability p per step, the waiting time follows a geometric distribution. If events occur continuously at rate λ, the waiting time follows an exponential distribution. As the time step shrinks and p → 0 proportionally, the geometric approaches the exponential.
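The limiting relationship can be illustrated numerically. The sketch below assumes one unit of time is divided into n equal steps with success probability p = λ/n per step, so that waiting longer than time t means failing on the first n·t steps. The rate λ = 2.0 and time t = 1.5 are arbitrary illustrative values.

```python
import math

lam, t = 2.0, 1.5
exponential_tail = math.exp(-lam * t)   # P(T > t) for an exponential with rate lam

geometric_tails = {}
for n in (10, 100, 1000, 10000):
    p = lam / n                  # per-step success probability shrinks with the step
    steps = round(n * t)         # number of steps that fit in time t
    geometric_tails[n] = (1 - p) ** steps   # P(no success in those steps)
    print(n, round(geometric_tails[n], 6))

print("exponential:", round(exponential_tail, 6))
```

As n grows, the geometric tail probability moves toward the exponential value, which is the convergence described above.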
Using the complement: P(X > k) = 1 - P(X ≤ k) = 1 - [1 - (1-p)^k] = (1-p)^k. This has an intuitive interpretation: P(X > k) is the probability of k consecutive failures, which is (1-p)^k. For example, if p = 0.3, P(X > 4) = (0.7)^4 = 0.2401. This means there is a 24.01% chance that the first four trials all fail, so the first success has not occurred yet.
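A quick simulation offers a rough check on this result. The sketch below draws geometric samples by repeating Bernoulli trials until the first success, then estimates P(X > 4) for p = 0.3. The helper `sample_geometric`, the seed, and the sample size are all illustrative choices, and the estimate will vary slightly with them.

```python
import random


def sample_geometric(p, rng):
    """Count Bernoulli(p) trials up to and including the first success."""
    trials = 1
    while rng.random() >= p:   # rng.random() < p counts as a success
        trials += 1
    return trials


rng = random.Random(12345)     # fixed seed so the run is repeatable
p, k, n_samples = 0.3, 4, 100_000

samples = [sample_geometric(p, rng) for _ in range(n_samples)]
estimate = sum(x > k for x in samples) / n_samples
exact = (1 - p) ** k           # closed form: k consecutive failures

print(round(estimate, 4), round(exact, 4))
```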