🧮 Statistics Formulas

15 essential statistics formulas with detailed explanations, examples, and when to use them.

📊 descriptive

Sample Mean

x̄ = Σxᵢ / n

The sample mean is the arithmetic average of a set of observations. It serves as the most common measure of central tendency and is used as a point estimator for the population mean. Among all constants c, the sum of squared deviations Σ(xᵢ - c)² is minimized when c = x̄.
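As a quick check of the formula, here is a minimal Python sketch (the data values are invented for illustration):

```python
def sample_mean(xs):
    """Arithmetic average: sum of observations divided by their count."""
    return sum(xs) / len(xs)

data = [4, 8, 6, 5, 7]  # made-up sample
print(sample_mean(data))  # 6.0
```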

📏 descriptive

Sample Standard Deviation

s = √[Σ(xᵢ - x̄)² / (n - 1)]

The sample standard deviation measures the average amount of variability or spread in a data set. It quantifies how far individual observations tend to fall from the sample mean. The denominator uses n - 1 (Bessel's correction) to provide an unbiased estimate of the population variance.
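The same calculation, including Bessel's correction, can be sketched in Python (sample values are made up):

```python
import math

def sample_std(xs):
    """Sample standard deviation with Bessel's correction (n - 1 denominator)."""
    n = len(xs)
    xbar = sum(xs) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample
print(round(sample_std(data), 4))  # ≈ 2.1381
```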

🎯 descriptive

Z-Score

z = (x - μ) / σ

The z-score measures how many standard deviations a data point is above or below the population mean. It standardizes values from any normal distribution to the standard normal distribution (mean 0, standard deviation 1), enabling direct comparison across different scales and units.
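A one-line Python sketch makes the standardization concrete (the score, mean, and standard deviation are invented):

```python
def z_score(x, mu, sigma):
    """Standardize x against a population mean and standard deviation."""
    return (x - mu) / sigma

# e.g. a test score of 85 in a population with mean 70, sd 10
print(z_score(85, 70, 10))  # 1.5
```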

🔒 inference

Confidence Interval for the Mean

x̄ ± z*(σ/√n) or x̄ ± t*(s/√n)

A confidence interval provides a range of plausible values for the population mean based on sample data. When the population standard deviation is known, use the z-interval; when it is unknown and estimated by s, use the t-interval. The confidence level (e.g., 95%) reflects how often the procedure produces intervals that contain the true mean.
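A minimal sketch of the z-interval case in Python, using the familiar critical value z* ≈ 1.96 for 95% confidence (the sample numbers are invented):

```python
import math

def z_interval(xbar, sigma, n, z_star=1.96):
    """z-interval for the mean; z_star = 1.96 corresponds to 95% confidence."""
    me = z_star * sigma / math.sqrt(n)
    return (xbar - me, xbar + me)

# invented example: sample mean 50, known sigma 10, n = 100
print(z_interval(50, 10, 100))  # ≈ (48.04, 51.96)
```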

🧪 inference

One-Sample T-Test Statistic

t = (x̄ - μ₀) / (s / √n)

The one-sample t-test statistic measures how far the sample mean is from a hypothesized population mean, in units of the estimated standard error. It follows a t-distribution with n - 1 degrees of freedom and is used when the population standard deviation is unknown.
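Putting the pieces together in Python, with invented measurements tested against a hypothesized mean of 5.0:

```python
import math

def t_statistic(xs, mu0):
    """One-sample t: (x̄ - μ₀) / (s / √n), with df = n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    return (xbar - mu0) / (s / math.sqrt(n))

print(round(t_statistic([5.1, 4.9, 5.3, 5.2, 4.8, 5.0], 5.0), 3))  # ≈ 0.655
```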

🔢 inference

Chi-Square Statistic

χ² = Σ(O - E)² / E

The chi-square statistic measures the discrepancy between observed and expected frequencies. It is used in goodness-of-fit tests (does data follow a hypothesized distribution?) and tests of independence (are two categorical variables related?). Larger values of χ² indicate greater deviation from what was expected.
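A goodness-of-fit sketch in Python, with invented die-roll counts (60 throws, so each face is expected 10 times):

```python
def chi_square(observed, expected):
    """Σ (O - E)² / E over matching categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [8, 9, 12, 11, 10, 10]
expected = [10] * 6
print(chi_square(observed, expected))  # 1.0
```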

🔗 regression

Pearson Correlation Coefficient

r = Σ(xᵢ - x̄)(yᵢ - ȳ) / √[Σ(xᵢ - x̄)² Σ(yᵢ - ȳ)²]

The Pearson correlation coefficient measures the strength and direction of the linear relationship between two quantitative variables. It ranges from -1 (perfect negative linear relationship) through 0 (no linear relationship) to +1 (perfect positive linear relationship).
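The formula translated directly into Python; in this invented example the ys double the xs exactly, so r comes out at its maximum of +1:

```python
import math

def pearson_r(xs, ys):
    """Linear correlation between paired samples; returns a value in [-1, 1]."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    num = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - xbar) ** 2 for x in xs)
                    * sum((y - ybar) ** 2 for y in ys))
    return num / den

print(pearson_r([1, 2, 3], [2, 4, 6]))  # 1.0
```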

📈 regression

Linear Regression Slope

b₁ = Σ(xᵢ - x̄)(yᵢ - ȳ) / Σ(xᵢ - x̄)²

The least-squares regression slope quantifies the average change in the response variable (y) for a one-unit increase in the explanatory variable (x). It is the slope of the best-fitting straight line that minimizes the sum of squared residuals.

🎲 probability

Binomial Probability

P(X = k) = C(n,k) × pᵏ × (1 - p)ⁿ⁻ᵏ

The binomial probability formula calculates the probability of getting exactly k successes in n independent Bernoulli trials, each with the same probability of success p. It is one of the most important discrete probability distributions in statistics.
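In Python, `math.comb` supplies C(n, k) directly; the coin-flip numbers below are just an illustration:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) = C(n,k) · pᵏ · (1-p)ⁿ⁻ᵏ."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# probability of exactly 3 heads in 5 fair coin flips
print(binomial_pmf(3, 5, 0.5))  # 0.3125
```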

🔔 distribution

Normal Distribution PDF

f(x) = (1 / (σ√(2π))) × e^(-(x - μ)² / (2σ²))

The probability density function of the normal (Gaussian) distribution describes the bell-shaped curve that arises throughout statistics. It is fully characterized by its mean (μ) and standard deviation (σ). The area under the curve between any two values gives the probability of an observation falling in that range.
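The density is straightforward to evaluate in Python; at x = μ the standard normal curve peaks at 1/√(2π):

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian density: (1 / (σ√(2π))) · exp(-(x - μ)² / (2σ²))."""
    coeff = 1 / (sigma * math.sqrt(2 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

print(round(normal_pdf(0, 0, 1), 4))  # ≈ 0.3989
```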

🔄 probability

Bayes' Theorem

P(A|B) = P(B|A) × P(A) / P(B)

Bayes' theorem provides a way to update the probability of a hypothesis (A) after observing evidence (B). It connects the prior probability P(A) with the likelihood P(B|A) to produce the posterior probability P(A|B). It is foundational in Bayesian statistics, medical diagnosis, and machine learning.
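A classic screening sketch in Python; every number here (prevalence, sensitivity, false-positive rate) is invented for illustration:

```python
def bayes(p_b_given_a, p_a, p_b):
    """Posterior P(A|B) from likelihood, prior, and marginal evidence."""
    return p_b_given_a * p_a / p_b

# invented: 1% prevalence, 90% sensitivity, 5% false-positive rate
prior = 0.01
sensitivity = 0.90
false_pos = 0.05
p_positive = sensitivity * prior + false_pos * (1 - prior)  # total probability
print(round(bayes(sensitivity, prior, p_positive), 3))  # ≈ 0.154
```

Despite a positive result, the posterior stays low because the condition is rare, which is exactly the prior-versus-likelihood trade-off the theorem captures.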

📐 inference

Margin of Error

ME = z* × (σ / √n)

The margin of error quantifies the maximum expected difference between the sample estimate and the true population parameter at a given confidence level. It is the half-width of a confidence interval and determines the precision of the estimate. Smaller margins of error require larger sample sizes.
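A minimal Python sketch, again using z* ≈ 1.96 for 95% confidence (sigma and n are invented):

```python
import math

def margin_of_error(sigma, n, z_star=1.96):
    """Half-width of a z-interval; z_star = 1.96 for 95% confidence."""
    return z_star * sigma / math.sqrt(n)

print(round(margin_of_error(15, 100), 2))  # ≈ 2.94
```

Note the √n in the denominator: halving the margin of error requires quadrupling the sample size.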

📋 inference

F-Statistic (ANOVA)

F = MSB / MSW

The F-statistic in one-way ANOVA (Analysis of Variance) is the ratio of the mean square between groups (MSB) to the mean square within groups (MSW). It tests whether the means of two or more groups are all equal (with exactly two groups it is equivalent to the pooled two-sample t-test, since F = t²). A large F indicates that at least one group mean differs significantly from the others.
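Both mean squares can be computed from scratch in Python; the three groups below are invented, with the third clearly sitting higher than the first two:

```python
def f_statistic(groups):
    """One-way ANOVA F = MSB / MSW for a list of sample groups."""
    k = len(groups)                        # number of groups
    n = sum(len(g) for g in groups)        # total observations
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ssw = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    return (ssb / (k - 1)) / (ssw / (n - k))

print(round(f_statistic([[1, 2, 3], [2, 3, 4], [4, 5, 6]]), 2))  # ≈ 7.0
```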

🎯 regression

Coefficient of Determination

R² = 1 - SS_res / SS_tot

The coefficient of determination (R-squared) measures the proportion of variance in the dependent variable that is explained by the independent variable(s) in a regression model. It ranges from 0 to 1, where 1 means the model perfectly explains all variability and 0 means it explains none.
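A short sketch comparing invented observations against close-but-imperfect predictions:

```python
def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ybar = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - ybar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

print(round(r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8]), 2))  # ≈ 0.98
```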

🎰 distribution

Poisson Probability

P(X = k) = (λᵏ × e^(-λ)) / k!

The Poisson distribution models the probability of a given number of events occurring in a fixed interval of time or space, when events happen independently at a constant average rate. It is commonly used for rare events and count data.
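The PMF needs only `exp` and `factorial` from the standard library; the call-center rate below is an invented example:

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) = λᵏ · e^(-λ) / k! for average event rate λ."""
    return (lam ** k * math.exp(-lam)) / math.factorial(k)

# invented example: on average 2 calls per hour; chance of exactly 0 calls
print(round(poisson_pmf(0, 2), 4))  # ≈ 0.1353
```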
