fundamentals · beginner · 20 min

What Is Standard Deviation and How Do You Calculate It?

Learn what standard deviation measures, how to calculate it by hand, how to interpret it, and why it is the most widely used measure of spread in statistics.

What You'll Learn

  • ✓ Define standard deviation and explain what it measures
  • ✓ Calculate standard deviation by hand using the formula
  • ✓ Interpret standard deviation in context
  • ✓ Understand the relationship between standard deviation, variance, and the normal distribution

1. What Standard Deviation Measures

Standard deviation measures how spread out data values are from the mean. A small standard deviation means the data points cluster tightly around the mean, so the values are consistent. A large standard deviation means the data points are widely scattered, so the values vary a lot. It is the most commonly used measure of variability in statistics because it is expressed in the same units as the original data, making it directly interpretable. If the mean exam score is 75 with a standard deviation of 5, most scores fall within about 10 points of 75 (two standard deviations). If the standard deviation is 15, scores are much more spread out.
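To make the contrast concrete, here is a minimal sketch using Python's standard `statistics` module. The two score lists are hypothetical, invented only for illustration: both average 75, but one is tightly clustered and the other is widely scattered.

```python
import statistics

# Hypothetical exam scores: both classes average 75, but the spread differs.
consistent_class = [70, 73, 75, 77, 80]
varied_class = [55, 65, 75, 85, 95]

for name, scores in [("consistent", consistent_class), ("varied", varied_class)]:
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (divides by n-1)
    print(f"{name}: mean = {mean}, SD = {sd:.2f}")
```

The means are identical, so only the standard deviation reveals that the second class is far less consistent.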

Key Points

  • Standard deviation measures how far data points typically fall from the mean
  • Small SD = data clustered near the mean; large SD = data spread widely
  • SD is in the same units as the data, unlike variance, which is in squared units

2. The Formula and Step-by-Step Calculation

For a population, standard deviation (σ) = √[Σ(xᵢ - μ)² / N]. For a sample, standard deviation (s) = √[Σ(xᵢ - x̄)² / (n-1)]. The sample formula divides by n-1 instead of N to correct for bias when estimating a population parameter from a sample.

Step-by-step: (1) Calculate the mean. (2) Subtract the mean from each data point to get deviations. (3) Square each deviation. (4) Sum the squared deviations. (5) Divide by N (population) or n-1 (sample). (6) Take the square root.

Example with data {2, 4, 6, 8, 10}: Mean = 6. Deviations: -4, -2, 0, 2, 4. Squared: 16, 4, 0, 4, 16. Sum = 40. For a sample: 40/4 = 10. Square root: s ≈ 3.16.
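The six steps above can be sketched in Python as follows. The function name `standard_deviation` and its `sample` flag are illustrative choices, not part of any library; the standard library's `statistics.stdev` and `statistics.pstdev` compute the same quantities and can be used as a cross-check.

```python
import math
import statistics

def standard_deviation(data, sample=True):
    """Follow the six steps; sample=True divides by n-1, otherwise by N."""
    n = len(data)
    mean = sum(data) / n                          # step 1: mean
    deviations = [x - mean for x in data]         # step 2: deviations
    squared = [d ** 2 for d in deviations]        # step 3: square each deviation
    total = sum(squared)                          # step 4: sum of squares
    variance = total / (n - 1 if sample else n)   # step 5: divide
    return math.sqrt(variance)                    # step 6: square root

data = [2, 4, 6, 8, 10]
s = standard_deviation(data, sample=True)
sigma = standard_deviation(data, sample=False)

print(round(s, 2))      # 3.16 (sample SD, matches the worked example)
print(round(sigma, 2))  # 2.83 (population SD, 40/5 = 8 before the root)
```

Treating the same five values as a whole population gives a smaller result because the sum of squares is divided by 5 rather than 4.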

Key Points

  • Population SD divides by N; sample SD divides by n-1 (Bessel's correction)
  • Steps: mean → deviations → square → sum → divide → square root
  • Variance is the square of standard deviation (or SD is the square root of variance)

3. The Empirical Rule (68-95-99.7)

For data that follows a normal (bell-shaped) distribution, standard deviation has a powerful interpretation through the empirical rule. Approximately 68% of data falls within 1 standard deviation of the mean. Approximately 95% falls within 2 standard deviations. Approximately 99.7% falls within 3 standard deviations. This means if exam scores have a mean of 75 and SD of 10, about 68% of students scored between 65 and 85, about 95% scored between 55 and 95, and nearly all scored between 45 and 105. Any value more than 2 to 3 standard deviations from the mean is unusual; this is the basis for identifying outliers and setting control limits.
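The percentages in the empirical rule are rounded values of normal-distribution probabilities. As a rough check, the sketch below uses `statistics.NormalDist` (Python 3.8+) with the exam example's mean of 75 and SD of 10; the helper name `share_within` is an illustrative assumption.

```python
from statistics import NormalDist

# Exam scores modeled as exactly normal with mean 75 and SD 10.
scores = NormalDist(mu=75, sigma=10)

def share_within(dist, k):
    """Probability that a value lies within k standard deviations of the mean."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(scores, k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

Real exam scores are only approximately normal, so observed shares will differ somewhat from these theoretical values.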

Key Points

  • 68% of data within ±1 SD, 95% within ±2 SD, 99.7% within ±3 SD (for normal distributions)
  • Values beyond 2 SD from the mean are unusual; beyond 3 SD is rare
  • This rule only applies to approximately normal (bell-shaped) distributions

4. Why Standard Deviation Matters

Standard deviation is the foundation for most of inferential statistics. Confidence intervals are built using the standard error, which is the standard deviation divided by the square root of the sample size. Hypothesis tests use standard deviation to determine how many standard errors a sample statistic is from the null hypothesis value. Z-scores express individual values as the number of standard deviations from the mean. In data analysis, standard deviation tells you how much variability to expect and whether specific values are unusual. StatsIQ can walk you through standard deviation calculations from your homework, showing each step and explaining how the result connects to broader statistical concepts like z-scores and confidence intervals.
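The standard error and z-score described above take only a few lines to compute. This is a minimal sketch: the `sample` values are hypothetical, and `z_score` is an illustrative helper rather than a library function.

```python
import math
import statistics

# Hypothetical sample of exam scores, used only for illustration.
sample = [72, 75, 78, 80, 85, 70, 74, 76]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)        # sample SD (divides by n-1)
standard_error = sd / math.sqrt(n)   # SD divided by the square root of n

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

print(f"mean = {mean}, SD = {sd:.2f}, SE = {standard_error:.2f}")
print(z_score(95, 80, 5))  # 3.0
```

The standard error shrinks as n grows even when the SD stays the same, which is why larger samples give narrower confidence intervals.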

Key Points

  • Standard error = SD / √n, the foundation of confidence intervals and hypothesis tests
  • Z-scores express values as the number of SDs from the mean: z = (x - mean) / SD
  • Understanding SD is a prerequisite for inference, regression, and virtually every advanced topic

Key Takeaways

  • ★ Standard deviation is always non-negative; it equals zero only when all data values are identical
  • ★ Adding a constant to all data values does not change the standard deviation (it only shifts the mean)
  • ★ Multiplying all data values by a constant multiplies the standard deviation by the absolute value of that constant
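  • ★ The sample variance (dividing by n-1) is an unbiased estimator of the population variance; its square root, the sample standard deviation, still slightly underestimates the population standard deviation on average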
  • ★ Standard deviation is more sensitive to outliers than other measures of spread like IQR because it uses squared deviations
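The first three takeaways can be checked numerically. The sketch below reuses the data set {2, 4, 6, 8, 10} from the worked example; the shift of 100 and the factor of -3 are arbitrary values chosen for illustration.

```python
import math
import statistics

data = [2, 4, 6, 8, 10]
sd = statistics.stdev(data)

shifted = [x + 100 for x in data]   # add a constant to every value
scaled = [-3 * x for x in data]     # multiply every value by a constant
constant = [7, 7, 7, 7]             # all values identical

# Shifting moves the mean but leaves the spread unchanged.
print(math.isclose(statistics.stdev(shifted), sd))       # True

# Scaling multiplies the SD by the absolute value of the factor.
print(math.isclose(statistics.stdev(scaled), 3 * sd))    # True

# Identical values have no spread at all.
print(statistics.stdev(constant) == 0)                   # True
```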

Practice Questions

1. Data set: {10, 12, 14, 16, 18}. Calculate the sample standard deviation.
Mean = 14. Deviations: -4, -2, 0, 2, 4. Squared deviations: 16, 4, 0, 4, 16. Sum = 40. Divide by n-1 = 4: variance = 10. Standard deviation = √10 ≈ 3.16.
2. Exam scores are normally distributed with mean 80 and SD 6. What percentage of students scored between 68 and 92?
68 is 2 SDs below the mean (80 - 12), and 92 is 2 SDs above (80 + 12). By the empirical rule, approximately 95% of students scored in this range.
3. A student scored 95 on an exam with mean 80 and SD 5. What is their z-score and is it unusual?
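z = (95 - 80) / 5 = 3.0. A z-score of 3 means the student scored 3 standard deviations above the mean. This is unusual: by the empirical rule, 99.7% of scores fall within 3 SDs, and the remaining 0.3% is split between the two tails, so only about 0.15% of students in a normal distribution score this high.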

FAQs

Common questions about this topic

Dividing by n-1 (called Bessel's correction) compensates for the fact that deviations are measured from the sample mean, which is calculated from the same data. The data sit closer to their own mean than to the true population mean, so dividing by n would underestimate the true spread. The n-1 adjustment makes the sample variance an unbiased estimator of the population variance. For large samples, the difference is negligible.
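One way to see the effect of the n-1 divisor without any randomness is to enumerate every possible sample. The sketch below is built on stated assumptions: it treats {2, 4, 6, 8, 10} as a complete population and lists all 25 equally likely samples of size 2 drawn with replacement. All names are illustrative.

```python
from itertools import product
import math

population = [2, 4, 6, 8, 10]
N = len(population)
mu = sum(population) / N
pop_variance = sum((x - mu) ** 2 for x in population) / N   # 8.0

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample's own mean."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample)

n = 2
samples = list(product(population, repeat=n))   # every ordered sample of size 2

# Average each estimator over all possible samples.
avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
avg_sd = sum(math.sqrt(sum_sq_dev(s) / (n - 1)) for s in samples) / len(samples)

print(pop_variance)        # 8.0
print(avg_div_n)           # 4.0 (dividing by n underestimates)
print(avg_div_n_minus_1)   # 8.0 (dividing by n-1 matches on average)
print(avg_sd < math.sqrt(pop_variance))   # True
```

The last line shows that taking the square root reintroduces a small downward bias: the variance estimate is right on average, but the average sample standard deviation falls below the population standard deviation.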

Use standard deviation when the data is approximately symmetric and without extreme outliers. Use IQR (interquartile range) when the data is skewed or has outliers, because IQR is resistant to extreme values while standard deviation is heavily influenced by them.
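A small experiment illustrates the difference in resistance. In this sketch the data are hypothetical, `iqr` is an illustrative helper built on `statistics.quantiles` (Python 3.8+, default "exclusive" method), and a single value is replaced by an extreme one.

```python
import statistics

def iqr(data):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return q3 - q1

clean = [10, 12, 14, 16, 18, 20, 22]
with_outlier = clean[:-1] + [220]   # replace the largest value with an outlier

print(f"SD:  {statistics.stdev(clean):.2f} -> {statistics.stdev(with_outlier):.2f}")
print(f"IQR: {iqr(clean):.2f} -> {iqr(with_outlier):.2f}")
```

The standard deviation grows many times over while the IQR does not move, because the quartiles depend only on the middle of the sorted data.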
