Fundamentals · Beginner · 20 min

What Is Standard Deviation and How Do You Calculate It?

Learn what standard deviation measures, how to calculate it by hand and interpret it, and why it is the most widely used measure of spread in statistics.

What You'll Learn

  • Define standard deviation and explain what it measures
  • Calculate standard deviation by hand using the formula
  • Interpret standard deviation in context
  • Understand the relationship between standard deviation, variance, and the normal distribution

1. What Standard Deviation Measures

Standard deviation measures how spread out data values are from the mean. A small standard deviation means the data points cluster tightly around the mean, so the values are consistent. A large standard deviation means the data points are widely scattered, so the values vary a lot. It is the most commonly used measure of variability in statistics, in part because it is expressed in the same units as the original data, which makes it directly interpretable. If the mean exam score is 75 with a standard deviation of 5, a typical score is only about 5 points away from 75. If the standard deviation is 15, scores are much more spread out: a typical score is about 15 points away from the mean.

Key Points

  • Standard deviation measures how far data points typically fall from the mean
  • Small SD = data clustered near the mean; large SD = data spread widely
  • SD is in the same units as the data, unlike variance, which is in squared units
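
To make the contrast concrete, here is a minimal sketch using Python's standard `statistics` module. The two score lists are hypothetical, made up so that both have a mean of 75 but very different spread.

```python
import statistics

# Hypothetical exam scores (invented for illustration); both lists have mean 75.
consistent = [70, 73, 75, 77, 80]
scattered = [55, 65, 75, 85, 95]

for name, scores in [("consistent", consistent), ("scattered", scattered)]:
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (divides by n-1)
    print(f"{name}: mean = {mean}, SD = {sd:.2f}")
```

The means match, but the second list's standard deviation is several times larger, which is exactly the difference the mean alone cannot show.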

2. The Formula and Step-by-Step Calculation

For a population, standard deviation σ = √[Σ(xᵢ - μ)² / N]. For a sample, standard deviation s = √[Σ(xᵢ - x̄)² / (n-1)]. Here μ is the population mean, x̄ is the sample mean, and N and n are the number of values. The sample formula divides by n-1 instead of N to correct for bias when estimating a population parameter from a sample.

Step-by-step: (1) Calculate the mean. (2) Subtract the mean from each data point to get deviations. (3) Square each deviation. (4) Sum the squared deviations. (5) Divide by N (population) or n-1 (sample). (6) Take the square root.

Example with data {2, 4, 6, 8, 10}: Mean = 6. Deviations: -4, -2, 0, 2, 4. Squared: 16, 4, 0, 4, 16. Sum = 40. For a sample: 40/4 = 10. Square root: s ≈ 3.16. If the same five values were the entire population, you would divide by N instead: 40/5 = 8, so σ ≈ 2.83.

Key Points

  • Population SD divides by N; sample SD divides by n-1 (Bessel's correction)
  • Steps: mean → deviations → square → sum → divide → square root
  • Variance is the square of standard deviation (or SD is the square root of variance)
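
The six steps can be sketched directly in Python. The function name `standard_deviation` and its `sample` flag are illustrative choices, not part of any library; the last lines cross-check the result against the standard library's `statistics.stdev` and `statistics.pstdev`.

```python
import math
import statistics

def standard_deviation(data, sample=True):
    """Compute SD by hand; sample=True divides by n-1, sample=False divides by N."""
    n = len(data)
    mean = sum(data) / n                          # step 1: mean
    deviations = [x - mean for x in data]         # step 2: deviations from the mean
    squared = [d ** 2 for d in deviations]        # step 3: square each deviation
    total = sum(squared)                          # step 4: sum of squared deviations
    variance = total / (n - 1 if sample else n)   # step 5: divide by n-1 or N
    return math.sqrt(variance)                    # step 6: square root

data = [2, 4, 6, 8, 10]
s = standard_deviation(data, sample=True)
sigma = standard_deviation(data, sample=False)
print(round(s, 2), round(sigma, 2))

# Cross-check against the standard library implementations.
print(math.isclose(s, statistics.stdev(data)), math.isclose(sigma, statistics.pstdev(data)))
```

Note that the sample version needs at least two data points, since a single value would mean dividing by zero.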

3. The Empirical Rule (68-95-99.7)

For data that follows a normal (bell-shaped) distribution, standard deviation has a powerful interpretation through the empirical rule. Approximately 68% of data falls within 1 standard deviation of the mean. Approximately 95% falls within 2 standard deviations. Approximately 99.7% falls within 3 standard deviations. This means if exam scores have a mean of 75 and SD of 10, about 68% of students scored between 65 and 85, about 95% scored between 55 and 95, and nearly all scored between 45 and 105. Any value more than 2 to 3 standard deviations from the mean is unusual, which is the basis for identifying outliers and setting control limits.

Key Points

  • 68% of data within ±1 SD, 95% within ±2 SD, 99.7% within ±3 SD (for normal distributions)
  • Values beyond 2 SD from the mean are unusual; beyond 3 SD is rare
  • This rule only applies to approximately normal (bell-shaped) distributions
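
The 68-95-99.7 figures are rounded values of the normal distribution's cumulative probabilities. As a quick check, this sketch uses `statistics.NormalDist` (Python 3.8+) with the exam-score example from this section (mean 75, SD 10); the helper name `share_within` is an illustrative choice.

```python
from statistics import NormalDist

scores = NormalDist(mu=75, sigma=10)  # exam-score example: mean 75, SD 10

def share_within(k):
    """Fraction of a normal distribution lying within k SDs of its mean."""
    low = scores.mean - k * scores.stdev
    high = scores.mean + k * scores.stdev
    return scores.cdf(high) - scores.cdf(low)

for k in (1, 2, 3):
    low = scores.mean - k * scores.stdev
    high = scores.mean + k * scores.stdev
    print(f"within {k} SD ({low:.0f} to {high:.0f}): {share_within(k):.1%}")
```

The shares come out at roughly 68.3%, 95.4%, and 99.7%, and they depend only on the number of standard deviations, not on the particular mean or SD.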

4. Why Standard Deviation Matters

Standard deviation is the foundation for most of inferential statistics. Confidence intervals are built using the standard error, which is the standard deviation divided by the square root of the sample size. Hypothesis tests use standard deviation to determine how many standard errors a sample statistic is from the null hypothesis value. Z-scores express individual values as the number of standard deviations from the mean. In data analysis, standard deviation tells you how much variability to expect and whether specific values are unusual.

Key Points

  • Standard error = SD / √n, the foundation of confidence intervals and hypothesis tests
  • Z-scores express values as the number of SDs from the mean: z = (x - mean) / SD
  • Understanding SD is a prerequisite for inference, regression, and virtually every advanced topic
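
Both formulas are one-liners in code. In this sketch the function names `standard_error` and `z_score` are illustrative; the sample reuses the data set {2, 4, 6, 8, 10} from the calculation section, and the z-score numbers (score 95, mean 80, SD 5) are those of practice question 3 in this guide.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: sample SD divided by the square root of n."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

sample = [2, 4, 6, 8, 10]
print(round(standard_error(sample), 3))  # sqrt(10) / sqrt(5) = sqrt(2)
print(z_score(95, mean=80, sd=5))        # (95 - 80) / 5
```

Because the standard error divides by √n, quadrupling the sample size only halves it.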

Key Takeaways

  • Standard deviation is always non-negative; it equals zero only when all data values are identical
  • Adding a constant to all data values does not change the standard deviation (it only shifts the mean)
  • Multiplying all data values by a constant multiplies the standard deviation by the absolute value of that constant
  • Dividing by n-1 makes the sample variance an unbiased estimator of the population variance; the sample standard deviation (its square root) still slightly underestimates the population standard deviation on average, though the bias shrinks as the sample grows
  • Standard deviation is more sensitive to outliers than other measures of spread like IQR because it uses squared deviations
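
The shift, scale, and zero-spread properties are easy to confirm numerically. This is a small sketch with arbitrarily chosen constants (+100 and -3), using `statistics.pstdev`; the same relationships hold for the sample version `statistics.stdev`.

```python
import math
import statistics

data = [2, 4, 6, 8, 10]
sd = statistics.pstdev(data)

shifted = [x + 100 for x in data]  # add a constant to every value
scaled = [-3 * x for x in data]    # multiply every value by a (negative) constant
constant = [7, 7, 7, 7]            # all values identical

print(math.isclose(statistics.pstdev(shifted), sd))           # shifting leaves SD unchanged
print(math.isclose(statistics.pstdev(scaled), abs(-3) * sd))  # SD is multiplied by |-3| = 3
print(statistics.pstdev(constant) == 0)                       # no spread at all
```

Each comparison evaluates to True.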

Practice Questions

1. Data set: {10, 12, 14, 16, 18}. Calculate the sample standard deviation.
Mean = 14. Deviations: -4, -2, 0, 2, 4. Squared deviations: 16, 4, 0, 4, 16. Sum = 40. Divide by n-1 = 4: variance = 10. Standard deviation = √10 ≈ 3.16.
2. Exam scores are normally distributed with mean 80 and SD 6. What percentage of students scored between 68 and 92?
68 is 2 SDs below the mean (80 - 12), and 92 is 2 SDs above (80 + 12). By the empirical rule, approximately 95% of students scored in this range.
3. A student scored 95 on an exam with mean 80 and SD 5. What is their z-score and is it unusual?
z = (95 - 80) / 5 = 3.0. A z-score of 3 means the student scored 3 standard deviations above the mean. This is unusual: by the empirical rule, only about 0.15% of students in a normal distribution score this high (the exact normal tail probability is about 0.13%).


FAQs

Common questions about this topic

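Why does the sample standard deviation divide by n-1 instead of n?

Dividing by n-1 (called Bessel's correction) compensates for the fact that deviations are measured from the sample mean, which is calculated from the same data. The data sit closer to their own mean than to the true population mean, so dividing by n would slightly underestimate the true spread. The n-1 adjustment makes the sample variance an unbiased estimator of the population variance. For large samples, the difference is negligible.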
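
One way to see the unbiasedness claim without any randomness is to enumerate every possible sample. The sketch below assumes a tiny population, {2, 4, 6, 8, 10}, and samples of size 2 drawn with replacement, so that all 25 ordered samples are equally likely; averaging over them gives the exact expected value of each estimator.

```python
from itertools import product
import statistics

population = [2, 4, 6, 8, 10]
pop_var = statistics.pvariance(population)  # true population variance

# Every ordered sample of size 2 drawn with replacement (25 equally likely samples).
samples = list(product(population, repeat=2))

# Average the two competing variance estimates over all possible samples.
mean_div_n_minus_1 = sum(statistics.variance(s) for s in samples) / len(samples)
mean_div_n = sum(statistics.pvariance(s) for s in samples) / len(samples)

print(pop_var, mean_div_n_minus_1, mean_div_n)
```

Dividing by n-1 averages out to the true variance of 8, while dividing by n averages out to 4, which is too small by the factor (n-1)/n = 1/2.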

When should you use standard deviation, and when should you use IQR?

Use standard deviation when the data is approximately symmetric and without extreme outliers. Use IQR (interquartile range) when the data is skewed or has outliers, because IQR is resistant to extreme values while standard deviation is heavily influenced by them.
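
The difference in outlier sensitivity shows up clearly on a small example. This sketch uses hypothetical data and an illustrative helper `iqr` built on `statistics.quantiles` (Python 3.8+, default "exclusive" method); other quartile conventions can give slightly different IQR values.

```python
import statistics

def iqr(data):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return q3 - q1

# Hypothetical data: the second list swaps the largest value for an extreme one.
clean = [10, 12, 14, 16, 18, 20, 22]
with_outlier = clean[:-1] + [220]

for name, data in [("clean", clean), ("with outlier", with_outlier)]:
    print(f"{name}: SD = {statistics.stdev(data):.2f}, IQR = {iqr(data):.2f}")
```

A single extreme value inflates the standard deviation many times over, while the IQR is unchanged because the middle half of the data did not move.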
