Student's t-Distribution

continuous

The Student's t-distribution is a continuous probability distribution that arises when estimating the mean of a normally distributed population from a small sample when the population standard deviation is unknown. It was developed by William Sealy Gosset, who published under the pseudonym 'Student.' The t-distribution has heavier tails than the normal distribution, reflecting the extra uncertainty from estimating the standard deviation. As the degrees of freedom increase, it converges to the standard normal distribution.

Formula

f(t) = [Γ((ν + 1)/2) / (√(νπ) · Γ(ν/2))] · (1 + t²/ν)^(-(ν + 1)/2), where Γ is the gamma function
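To make the formula concrete, here is a minimal Python sketch that evaluates the density directly. The function name t_pdf is hypothetical (it is not from any library), and the log-gamma form is simply one way to avoid overflow when ν is large.

```python
import math

def t_pdf(t: float, nu: float) -> float:
    """Density f(t) of Student's t-distribution with nu degrees of freedom."""
    if nu <= 0:
        raise ValueError("degrees of freedom nu must be positive")
    # log of Γ((ν + 1)/2) / (√(νπ) · Γ(ν/2)); lgamma avoids overflow for large nu
    log_coeff = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    # log of (1 + t²/ν)^(-(ν + 1)/2)
    log_kernel = -(nu + 1) / 2 * math.log1p(t * t / nu)
    return math.exp(log_coeff + log_kernel)

# With ν = 1 the density at 0 reduces to 1/π (the Cauchy case).
print(t_pdf(0.0, 1), 1 / math.pi)
```

The density is symmetric, so t_pdf(t, nu) and t_pdf(-t, nu) agree for any t.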

Mean (Expected Value)

0 (for ν > 1; undefined for ν ≤ 1)

Variance

ν / (ν - 2) (for ν > 2; infinite for 1 < ν ≤ 2; undefined for ν ≤ 1)
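The case distinctions for the mean and variance can be sketched as two small Python helpers. The names t_mean and t_variance are hypothetical, and returning NaN for "undefined" is an assumption made for illustration.

```python
import math

def t_mean(nu: float) -> float:
    """Mean of the t-distribution: 0 for nu > 1, otherwise undefined (NaN)."""
    if nu <= 0:
        raise ValueError("degrees of freedom nu must be positive")
    return 0.0 if nu > 1 else math.nan

def t_variance(nu: float) -> float:
    """Variance: nu / (nu - 2) for nu > 2, infinite for 1 < nu <= 2, else NaN."""
    if nu <= 0:
        raise ValueError("degrees of freedom nu must be positive")
    if nu > 2:
        return nu / (nu - 2)
    if nu > 1:
        return math.inf
    return math.nan  # undefined for nu <= 1

for nu in (1, 2, 3, 14, 100):
    print(nu, t_mean(nu), t_variance(nu))
```

For ν = 14 the variance is 14/12, already fairly close to the standard normal variance of 1.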

Parameters

ν
Degrees of Freedom

Controls the shape of the distribution. Must be a positive number (ν > 0). Typically ν = n - 1 for a one-sample t-test with sample size n. Larger ν makes the distribution closer to normal.

Key Properties

  • Symmetric about 0, resembling the standard normal distribution but with heavier tails
  • As ν → ∞, the t-distribution converges to the standard normal distribution N(0, 1)
  • The variance ν/(ν - 2) is always greater than 1 (for ν > 2), reflecting extra uncertainty
  • With ν = 1, it becomes the Cauchy distribution, which has no defined mean or variance
  • Critical values |t*| are always larger than corresponding |z*| values, producing wider confidence intervals for small samples
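The last property can be checked numerically without a t-table. The sketch below is one possible approach, not a standard routine: it integrates the density with Simpson's rule and finds the critical value by bisection, using only the Python standard library. The names t_pdf, t_cdf, and t_critical are hypothetical; in practice a statistics library would normally supply these values.

```python
import math
from statistics import NormalDist

def t_pdf(t, nu):
    """Density of the t-distribution with nu degrees of freedom."""
    log_coeff = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_coeff - (nu + 1) / 2 * math.log1p(t * t / nu))

def t_cdf(t, nu, steps=1000):
    """P(T <= t) via composite Simpson's rule on [0, |t|] plus symmetry."""
    a = abs(t)
    if a == 0.0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area

def t_critical(confidence, nu):
    """Two-sided critical value t* with P(-t* <= T <= t*) = confidence."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, nu) < target:  # widen the bracket until it contains t*
        hi *= 2
    for _ in range(60):  # bisection on the monotone CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z_star = NormalDist().inv_cdf(0.975)
for nu in (1, 5, 14, 30, 120):
    print(f"nu = {nu:>3}: t* = {t_critical(0.95, nu):.3f}   z* = {z_star:.3f}")
```

For ν = 14 this reproduces the table value of about 2.145, and every t* printed exceeds z* ≈ 1.960, with the gap shrinking as ν grows.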

Example

A sample of 15 light bulbs has a mean lifetime of 1200 hours with a sample standard deviation of 100 hours. Assuming lifetimes are normally distributed, find the 95% confidence interval for the population mean.

Step 1: Degrees of freedom: ν = n - 1 = 15 - 1 = 14.
Step 2: For a 95% CI, look up t* = t(0.025, 14) = 2.145 in the t-table.
Step 3: Margin of error = t* · (s / √n) = 2.145 · (100 / √15) = 2.145 · 25.82 = 55.38.
Step 4: CI = x̄ ± ME = 1200 ± 55.38 = (1144.62, 1255.38).

Result: 95% CI: (1144.62, 1255.38) hours

We are 95% confident that the true mean lifetime of this type of light bulb is between 1144.62 and 1255.38 hours. Note that we used the t-distribution instead of the normal because the population standard deviation is unknown and we estimated it from a small sample of 15.
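The arithmetic in this example can be reproduced with a few lines of Python. This is a sketch only: the helper name t_confidence_interval is hypothetical, and the critical value 2.145 is taken from the t-table as in Step 2 rather than computed.

```python
import math

def t_confidence_interval(mean, s, n, t_star):
    """Return (lower, upper) for mean ± t* · s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)
    return mean - margin, mean + margin

# Light-bulb example: n = 15, mean = 1200 h, s = 100 h, t* = 2.145 (nu = 14)
low, high = t_confidence_interval(mean=1200, s=100, n=15, t_star=2.145)
print(f"95% CI: ({low:.2f}, {high:.2f}) hours")  # 95% CI: (1144.62, 1255.38) hours
```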

When to Use

  • When constructing confidence intervals for a population mean with unknown population standard deviation, especially with small samples (n < 30)
  • When performing one-sample, two-sample, or paired t-tests for comparing means
  • When the underlying population is approximately normal but σ is estimated by the sample standard deviation s
  • When computing confidence intervals for regression coefficients in linear regression

Common Mistakes

  • Using a z-table instead of a t-table when the population standard deviation is unknown. Because z* is smaller than t*, a z-based confidence interval is too narrow for small samples.
  • Using the wrong degrees of freedom. For a one-sample t-test, ν = n - 1. For a two-sample t-test with equal variances, ν = n₁ + n₂ - 2.
  • Applying the t-test to heavily skewed or outlier-prone data with small samples. The t-test assumes approximate normality of the population (or large sample size via CLT).
  • Forgetting that the t-distribution approaches the normal distribution as ν increases. For ν > 30, the difference between t* and z* is often negligible.
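The first two mistakes are easy to quantify. The sketch below reuses the light-bulb numbers from the example (n = 15, s = 100, table value t* = 2.145) to show how much narrower a z-based interval would be, and wraps the two degrees-of-freedom rules in small helpers. The helper names are hypothetical.

```python
import math
from statistics import NormalDist

def one_sample_df(n: int) -> int:
    """Degrees of freedom for a one-sample t-test."""
    return n - 1

def pooled_two_sample_df(n1: int, n2: int) -> int:
    """Degrees of freedom for a two-sample t-test with equal variances."""
    return n1 + n2 - 2

n, s = 15, 100.0
standard_error = s / math.sqrt(n)

t_star = 2.145                        # t-table value for nu = 14, 95% CI
z_star = NormalDist().inv_cdf(0.975)  # about 1.960

t_margin = t_star * standard_error
z_margin = z_star * standard_error
shortfall = 1 - z_margin / t_margin   # fraction by which the z interval is too narrow

print(f"df = {one_sample_df(n)}")
print(f"t margin = {t_margin:.2f}, z margin = {z_margin:.2f}")
print(f"z-based interval is about {shortfall:.1%} too narrow")
```

With these inputs the z-based margin comes out between 8% and 9% smaller than the t-based one, which is the understatement of uncertainty the first bullet warns about.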

FAQs

Common questions about Student's t-Distribution

When should I use the t-distribution instead of the z-distribution?

Use the t-distribution whenever you are estimating a population mean and the population standard deviation (σ) is unknown, which is almost always the case in practice. The t-distribution accounts for the additional uncertainty introduced by estimating σ with the sample standard deviation s. This is especially important for small samples (n < 30). For large samples, the t and z distributions give nearly identical results, but it is still technically more correct to use the t-distribution.

Why does the t-distribution have heavier tails than the normal distribution?

The heavier tails reflect the additional uncertainty from estimating the population standard deviation. When you use s instead of σ, there is extra variability because s itself is a random variable. The t-distribution compensates by assigning more probability to extreme values. This means t-based confidence intervals are wider and t-based hypothesis tests are more conservative (harder to reject H₀), which is appropriate given the additional uncertainty.

What happens to the t-distribution as the sample size increases?

As the sample size n increases, the degrees of freedom ν = n - 1 also increase, and the t-distribution approaches the standard normal distribution. This happens because larger samples provide more precise estimates of σ, reducing the extra uncertainty. By around ν = 30, the t-distribution is very close to N(0, 1), and by ν = 120, they are nearly indistinguishable. This convergence is why z-tests and t-tests give similar results for large samples.
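One way to see this convergence is to measure the largest gap between the t density and the standard normal density over a grid of points. The sketch below does that for a few values of ν; the function names, the grid range of ±6, and the grid resolution are all illustrative assumptions.

```python
import math
from statistics import NormalDist

def t_pdf(t, nu):
    """Density of the t-distribution with nu degrees of freedom."""
    log_coeff = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_coeff - (nu + 1) / 2 * math.log1p(t * t / nu))

def max_density_gap(nu, half_width=6.0, points=1201):
    """Largest |f_t(x) - phi(x)| over an evenly spaced grid on [-half_width, half_width]."""
    normal = NormalDist()  # standard normal N(0, 1)
    step = 2 * half_width / (points - 1)
    grid = (-half_width + i * step for i in range(points))
    return max(abs(t_pdf(x, nu) - normal.pdf(x)) for x in grid)

gaps = {nu: max_density_gap(nu) for nu in (5, 30, 120)}
for nu, gap in gaps.items():
    print(f"nu = {nu:>3}: largest density gap = {gap:.4f}")
```

The gap shrinks steadily as ν grows, consistent with the rule of thumb that the two distributions are very close by ν = 30 and nearly indistinguishable by ν = 120.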
