What Is Standard Deviation and How Do You Calculate It?
Learn what standard deviation measures, how to calculate it by hand and interpret it, and why it is the most widely used measure of spread in statistics.
What You'll Learn
- Define standard deviation and explain what it measures
- Calculate standard deviation by hand using the formula
- Interpret standard deviation in context
- Understand the relationship between standard deviation, variance, and the normal distribution
1. What Standard Deviation Measures
Standard deviation measures how spread out data values are around the mean. A small standard deviation means the data points cluster tightly around the mean, so the values are consistent. A large standard deviation means the data points are widely scattered, so the values vary a lot. It is the most commonly used measure of variability in statistics because it is expressed in the same units as the original data, making it directly interpretable. If the mean exam score is 75 with a standard deviation of 5, a typical score lies within about 5 points of 75. If the standard deviation is 15, scores are much more spread out.
Key Points
- Standard deviation measures how far data points typically fall from the mean
- Small SD = data clustered near the mean; large SD = data spread widely
- SD is in the same units as the data, unlike variance which is in squared units
2. The Formula and Step-by-Step Calculation
For a population, standard deviation (σ) = √[Σ(xi - μ)² / N]. For a sample, standard deviation (s) = √[Σ(xi - x̄)² / (n-1)]. The sample formula divides by n-1 instead of N to correct for bias when estimating a population parameter from a sample. Step-by-step: (1) Calculate the mean. (2) Subtract the mean from each data point to get deviations. (3) Square each deviation. (4) Sum the squared deviations. (5) Divide by N (population) or n-1 (sample). (6) Take the square root. Example with data {2, 4, 6, 8, 10}: Mean = 6. Deviations: -4, -2, 0, 2, 4. Squared: 16, 4, 0, 4, 16. Sum = 40. For a sample: 40/4 = 10. Square root: s ≈ 3.16.
Key Points
- Population SD divides by N; sample SD divides by n-1 (Bessel's correction)
- Steps: mean → deviations → square → sum → divide → square root
- Variance is the square of standard deviation (or SD is the square root of variance)
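The six steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `standard_deviation` and its `sample` flag are chosen here for the example, and the result is cross-checked against the standard library's `statistics.stdev` and `statistics.pstdev`.

```python
import math
import statistics


def standard_deviation(values, sample=True):
    """Compute SD by the six steps: mean, deviations, square, sum, divide, root."""
    n = len(values)
    if n == 0 or (sample and n < 2):
        raise ValueError("not enough data points")
    mean = sum(values) / n                        # (1) calculate the mean
    deviations = [x - mean for x in values]       # (2) subtract the mean
    squared = [d ** 2 for d in deviations]        # (3) square each deviation
    total = sum(squared)                          # (4) sum the squared deviations
    variance = total / (n - 1 if sample else n)   # (5) divide by n-1 or N
    return math.sqrt(variance)                    # (6) take the square root


data = [2, 4, 6, 8, 10]
s = standard_deviation(data, sample=True)        # sample SD, divides by n-1
sigma = standard_deviation(data, sample=False)   # population SD, divides by N

print(round(s, 2), round(sigma, 2))  # 3.16 2.83
```

For the same five values, treating them as a whole population gives 40/5 = 8 and a square root of about 2.83, slightly smaller than the sample result.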
3. The Empirical Rule (68-95-99.7)
For data that follows a normal (bell-shaped) distribution, standard deviation has a powerful interpretation through the empirical rule. Approximately 68% of data falls within 1 standard deviation of the mean. Approximately 95% falls within 2 standard deviations. Approximately 99.7% falls within 3 standard deviations. This means if exam scores have a mean of 75 and SD of 10, about 68% of students scored between 65 and 85, about 95% scored between 55 and 95, and nearly all scored between 45 and 105. Any value more than 2-3 standard deviations from the mean is unusual; this is the basis for identifying outliers and setting control limits.
Key Points
- 68% of data within ±1 SD, 95% within ±2 SD, 99.7% within ±3 SD (for normal distributions)
- Values beyond 2 SD from the mean are unusual; beyond 3 SD is rare
- This rule only applies to approximately normal (bell-shaped) distributions
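The 68-95-99.7 figures are rounded values of areas under the normal curve. As a rough check, the sketch below uses `statistics.NormalDist` from the Python standard library with the exam example (mean 75, SD 10). The helper name `share_within` is made up for this illustration.

```python
from statistics import NormalDist

# Exam scores from the example: mean 75, standard deviation 10.
scores = NormalDist(mu=75, sigma=10)


def share_within(dist, k):
    """Return the interval mean ± k·SD and the share of the distribution inside it."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return lower, upper, dist.cdf(upper) - dist.cdf(lower)


for k in (1, 2, 3):
    lower, upper, share = share_within(scores, k)
    print(f"within {k} SD: {lower:.0f} to {upper:.0f} covers {share:.1%}")
```

The printed shares come out close to 68%, 95%, and 99.7%, which is where the rule gets its name. The check assumes an exactly normal model; real exam scores only approximate it.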
4. Why Standard Deviation Matters
Standard deviation is the foundation for most of inferential statistics. Confidence intervals for a mean are built using the standard error, which is the standard deviation divided by the square root of the sample size. Hypothesis tests use standard deviation to determine how many standard errors a sample statistic is from the null hypothesis value. Z-scores express individual values as the number of standard deviations from the mean. In data analysis, standard deviation tells you how much variability to expect and whether specific values are unusual. StatsIQ can walk you through standard deviation calculations from your homework, showing each step and explaining how the result connects to broader statistical concepts like z-scores and confidence intervals.
Key Points
- Standard error = SD / √n, the foundation of confidence intervals and hypothesis tests
- Z-scores express values as the number of SDs from the mean: z = (x - mean) / SD
- Understanding SD is a prerequisite for inference, regression, and virtually every advanced topic
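The two formulas in the key points translate directly into code. The sketch below is illustrative only: the function names are chosen for this example, and the sample size of 25 is a hypothetical value picked to keep the arithmetic clean.

```python
import math


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of the sample size."""
    return sd / math.sqrt(n)


# A score of 85 on an exam with mean 75 and SD 10 sits one SD above the mean.
z = z_score(85, mean=75, sd=10)

# With a hypothetical sample of 25 students, the standard error is 10 / 5.
se = standard_error(sd=10, n=25)

print(z, se)  # 1.0 2.0
```

Note how the standard error shrinks as n grows: quadrupling the sample size halves it.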
Key Takeaways
- Standard deviation is always non-negative; it equals zero only when all data values are identical
- Adding a constant to all data values does not change the standard deviation (it only shifts the mean)
- Multiplying all data values by a constant multiplies the standard deviation by the absolute value of that constant
- The sample variance (dividing by n-1) is an unbiased estimator of the population variance; its square root, the sample standard deviation, still slightly underestimates the population standard deviation on average, though the bias shrinks as n grows
- Standard deviation is more sensitive to outliers than other measures of spread like IQR because it uses squared deviations
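The shift and scale properties in the list above are easy to confirm numerically. The sketch below reuses the data set {2, 4, 6, 8, 10} from the worked example; the constants 100 and -3 are arbitrary choices for illustration.

```python
import math
import statistics

data = [2, 4, 6, 8, 10]
base = statistics.stdev(data)

# Adding a constant moves every value and the mean together, so deviations are unchanged.
shifted = statistics.stdev([x + 100 for x in data])

# Multiplying by -3 stretches every deviation by a factor of 3; the sign is lost on squaring.
scaled = statistics.stdev([-3 * x for x in data])

# A data set with identical values has no spread at all.
constant = statistics.stdev([7, 7, 7, 7])

print(math.isclose(shifted, base))       # True
print(math.isclose(scaled, 3 * base))    # True
print(constant == 0)                     # True
```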
Practice Questions
1. Data set: {10, 12, 14, 16, 18}. Calculate the sample standard deviation.
2. Exam scores are normally distributed with mean 80 and SD 6. What percentage of students scored between 68 and 92?
3. A student scored 95 on an exam with mean 80 and SD 5. What is their z-score and is it unusual?
FAQs
Common questions about this topic
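Why does the sample standard deviation divide by n-1 instead of n?
Dividing by n-1 (called Bessel's correction) compensates for the fact that deviations are measured from the sample mean, which is calculated from the same data. The data sit closer to their own mean than to the true population mean, so dividing by n slightly underestimates the true spread. The n-1 adjustment makes the sample variance an unbiased estimator of the population variance. For large samples, the difference is negligible.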
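A small simulation can make the bias visible. The sketch below is an illustration under stated assumptions, not a proof. It draws many samples of size 5 from a normal population with a hypothetical SD of 10 (variance 100) and averages the variance estimates obtained by dividing by n and by n-1. The seed, sample size, and trial count are arbitrary.

```python
import random


def variance(values, correction):
    """Sum of squared deviations divided by (n - correction)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - correction)


rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_SD = 10             # hypothetical population SD, so the true variance is 100
SAMPLE_SIZE = 5
TRIALS = 20_000

total_n = 0.0
total_n_minus_1 = 0.0
for _ in range(TRIALS):
    sample = [rng.gauss(50, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    total_n += variance(sample, correction=0)          # divide by n
    total_n_minus_1 += variance(sample, correction=1)  # divide by n-1

avg_n = total_n / TRIALS
avg_n_minus_1 = total_n_minus_1 / TRIALS

print(f"divide by n:   average variance estimate {avg_n:.1f}")
print(f"divide by n-1: average variance estimate {avg_n_minus_1:.1f}")
```

With these settings the divide-by-n average lands near 80, which is (n-1)/n times the true variance of 100, while the divide-by-n-1 average lands near 100.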
When should I use standard deviation instead of IQR?
Use standard deviation when the data are approximately symmetric and free of extreme outliers. Use IQR (interquartile range) when the data are skewed or contain outliers, because IQR is resistant to extreme values while standard deviation is heavily influenced by them.
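To see the difference in sensitivity, the sketch below adds a single extreme value to a small made-up data set and compares how much each measure changes. The data values and the `iqr` helper are invented for this illustration. Quartiles are computed with `statistics.quantiles` using its default method, and other quartile conventions give slightly different IQR values.

```python
import statistics


def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


clean = [21, 23, 24, 25, 26, 27, 29]   # made-up, roughly symmetric data
with_outlier = clean + [95]            # the same data plus one extreme value

sd_clean = statistics.stdev(clean)
sd_outlier = statistics.stdev(with_outlier)
iqr_clean = iqr(clean)
iqr_outlier = iqr(with_outlier)

print(f"SD:  {sd_clean:.2f} -> {sd_outlier:.2f}")
print(f"IQR: {iqr_clean:.2f} -> {iqr_outlier:.2f}")
```

One extreme value inflates the standard deviation several times over, while the IQR moves only a little because it depends on the middle half of the data.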