Advanced · Intermediate · 18-22 min

Spearman vs Pearson Correlation: When to Use Each

Pearson measures linear association on interval data; Spearman measures monotonic association using ranks. Here is exactly how each works, when to switch, and two fully worked examples — including the cases where r and rho disagree dramatically.

What You'll Learn

  • Choose between Pearson and Spearman based on scale, linearity, and outliers.
  • Compute Spearman’s rho from ranks by hand, including the tie correction.
  • Recognize cases where the two coefficients diverge and what that means.

1. Direct Answer: Pearson Linear, Spearman Monotonic

Use Pearson’s r when both variables are interval or ratio scaled, their relationship is approximately linear, and the joint distribution is roughly bivariate normal without extreme outliers; r then quantifies the strength and direction of that linear association. Use Spearman’s rho (ρ) when either variable is ordinal, the relationship is monotonic but not linear, or outliers and skew distort Pearson. Rho is simply Pearson’s r applied to the RANKS of the data, so it measures how well a monotonic function (one that never reverses direction) describes the relationship. The trade-off: when the assumptions hold, Pearson is more powerful and has a clean interpretation as the slope of the regression line between standardized scores; when they fail, Pearson can give wildly misleading values (a single extreme point can flip the sign), while Spearman is affected far less because each point’s influence is limited by its rank. Always plot the data first; neither coefficient should be reported in isolation.

Key Points

  • Pearson r: linear association on interval data, assumes bivariate normality.
  • Spearman rho: monotonic association via ranks, robust to outliers and non-linearity.
  • Kendall’s tau is a third rank-based option, often preferred for small samples or many ties.
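
The outlier claim above can be checked with a minimal Python sketch. The data are made up for illustration (nine points on a perfect line plus one extreme point), and the helper names `pearson_r`, `average_ranks`, and `spearman_rho` are hypothetical, not from any library.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r from sums of deviation products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Ranks 1..n, smallest first; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho as Pearson's r on the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y = x for nine points, then one extreme point in the
# opposite corner (very large x, very small y).
x = list(range(1, 10)) + [100]
y = list(range(1, 10)) + [-100]

r_clean = pearson_r(x[:-1], y[:-1])   # the nine clean points only
r_with_outlier = pearson_r(x, y)      # sign flips to strongly negative
rho_with_outlier = spearman_rho(x, y) # weakened, but still positive

print(r_clean, r_with_outlier, rho_with_outlier)
```

In this constructed case the single point drags Pearson from +1 to a strongly negative value, while Spearman drops but keeps its positive sign. An outlier in the opposite corner is close to the worst case for ranks, so milder outliers move rho less.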

2. How Pearson’s r Is Computed

r = Σ[(x − x̄)(y − ȳ)] / √[Σ(x − x̄)² × Σ(y − ȳ)²]. The numerator is (n − 1) times the sample covariance and the denominator is (n − 1) times the product of the sample standard deviations, so the (n − 1) factors cancel and r = covariance / [SD(x) × SD(y)]. The result lies in [−1, 1]: −1 is a perfect negative linear relationship, 0 is no linear relationship, and +1 is a perfect positive linear relationship. Significance is tested with t = r × √[(n − 2) / (1 − r²)] on n − 2 degrees of freedom. Note that r² (the coefficient of determination) equals the proportion of variance in y explained by a linear regression on x, which is why people say “r² = 0.36 means 36% of the variance is explained”; that reading is only meaningful if the relationship really is linear.

Key Points

  • r = covariance(x,y) / [SD(x) × SD(y)], with range [−1, 1].
  • t = r × √[(n − 2)/(1 − r²)] tests H₀: ρ = 0.
  • r² = proportion of variance explained by a linear fit — interpretation depends on linearity.
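
As a quick check that the sum-of-deviations form and the covariance/SD form of r are the same quantity, here is a small Python sketch. The five data pairs are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical data, not from the worked examples in this guide.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
mx, my = mean(x), mean(y)

# Form 1: sums of deviation products.
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)
r_sums = sxy / sqrt(sxx * syy)

# Form 2: sample covariance over the product of sample SDs.
# Both numerator and denominator carry an (n - 1) factor, which cancels.
cov_xy = sxy / (n - 1)
r_cov = cov_xy / (stdev(x) * stdev(y))

# t statistic for H0: rho = 0, on n - 2 degrees of freedom.
t_stat = r_sums * sqrt((n - 2) / (1 - r_sums ** 2))

print(r_sums, r_cov, r_sums ** 2, t_stat)
```

Here the sums are 6, 10, and 6, so r² works out to 0.6; the p-value would then be read from a t table with 3 degrees of freedom.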

3. How Spearman’s Rho Is Computed

Convert each variable to its ranks (1 = smallest, n = largest; tied values share the average of the ranks they occupy), then compute Pearson’s r on the ranks. With no ties, a simpler formula works: rho = 1 − [6 × Σd²] / [n(n² − 1)], where d is the difference between the x rank and the y rank of each pair. With ties, use the rank-based Pearson formula directly or apply a tie correction to the simplified version. Significance: for small n consult a Spearman rho table; for n ≥ ~10 use t = rho × √[(n − 2)/(1 − rho²)]. The key conceptual move is that Spearman ignores actual magnitudes: only the ORDER of x and the ORDER of y matter, so any perfectly monotonic relationship (linear, log, or a quadratic on one side of its vertex) gets the same score of ±1.

Key Points

  • Rank both variables, then run Pearson on the ranks.
  • Simplified (no ties): rho = 1 − 6Σd² / [n(n² − 1)].
  • Significance test: t = rho × √[(n − 2)/(1 − rho²)] with df = n − 2.
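
The ranking procedure and the no-ties shortcut can be sketched in Python as follows. The two small datasets are hypothetical, and the function names are illustrative; the point is that the shortcut matches Pearson-on-ranks only when there are no ties.

```python
from math import sqrt

def average_ranks(values):
    """Ranks 1..n, smallest first; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_rho(x, y):
    """General route: Pearson's r on average ranks (valid with or without ties)."""
    return pearson_r(average_ranks(x), average_ranks(y))

def shortcut_rho(x, y):
    """Simplified formula 1 - 6*sum(d^2) / (n*(n^2 - 1)); exact only without ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(average_ranks(x), average_ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical data without ties: the two routes agree.
x1, y1 = [10, 20, 30, 40, 50], [1, 3, 2, 5, 4]
rho_no_ties = spearman_rho(x1, y1)
shortcut_no_ties = shortcut_rho(x1, y1)

# Hypothetical data with ties in both variables: the shortcut drifts.
x2, y2 = [1, 2, 2, 3, 4], [1, 2, 3, 3, 5]
rho_ties = spearman_rho(x2, y2)
shortcut_ties = shortcut_rho(x2, y2)

print(rho_no_ties, shortcut_no_ties, rho_ties, shortcut_ties)
```

The gap in the tied case is small here, but it grows with the number of ties, which is why the rank-based Pearson route is the safer default.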

4. Worked Example 1: Height and Weight (Pearson Appropriate)

Ten adults: heights 160, 165, 168, 170, 172, 175, 178, 180, 183, 188 cm; weights 55, 60, 65, 68, 70, 75, 78, 82, 85, 92 kg. The scatterplot is very nearly linear with no outliers. Mean height = 173.9, mean weight = 73.0. Σ(x − x̄)(y − ȳ) = 893.0; Σ(x − x̄)² = 662.9; Σ(y − ȳ)² = 1206.0. r = 893.0 / √(662.9 × 1206.0) = 893.0 / 894.1 = 0.999. t = r × √(8 / (1 − r²)) = 0.9987 × √(8 / 0.00251) ≈ 56.4 with df = 8 → p < 0.001. r² ≈ 0.997, so a linear fit explains over 99% of weight variance. Spearman’s rho on the same data is exactly 1, because heights and weights increase together with no ties, so every person has the same rank on both variables. The two coefficients are virtually identical because the data are clean and the relationship is essentially linear.

Key Points

  • Clean, linear, no outliers → Pearson is appropriate and informative.
  • r ≈ 0.999, p < 0.001, r² ≈ 0.997.
  • Spearman (rho = 1) essentially matches when assumptions hold.
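
To check the arithmetic of this example, the following Python sketch recomputes the three sums, r, t, and rho directly from the ten listed height and weight pairs. Variable names are illustrative; because the data contain no ties, rho is computed with the simplified Σd² formula.

```python
from math import sqrt

heights = [160, 165, 168, 170, 172, 175, 178, 180, 183, 188]  # cm
weights = [55, 60, 65, 68, 70, 75, 78, 82, 85, 92]            # kg
n = len(heights)

mean_h = sum(heights) / n
mean_w = sum(weights) / n

# The three sums that feed Pearson's formula.
sxy = sum((h - mean_h) * (w - mean_w) for h, w in zip(heights, weights))
sxx = sum((h - mean_h) ** 2 for h in heights)
syy = sum((w - mean_w) ** 2 for w in weights)

r = sxy / sqrt(sxx * syy)
t_stat = r * sqrt((n - 2) / (1 - r ** 2))  # df = n - 2

def simple_ranks(values):
    """Ranks 1..n for data with no ties (position in the sorted list)."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

# No ties here, so the simplified Spearman formula applies.
d2 = sum((a - b) ** 2
         for a, b in zip(simple_ranks(heights), simple_ranks(weights)))
rho = 1 - 6 * d2 / (n * (n * n - 1))

print(mean_h, mean_w, sxy, sxx, syy, r, t_stat, rho)
```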

5. Worked Example 2: Customer Satisfaction Ratings (Spearman Wins)

Twelve customers rate price (1–5 ordinal) and overall satisfaction (1–10 ordinal). Two outliers — one customer rates price 1 and satisfaction 10 (huge bargain hunter), another rates price 5 and satisfaction 1 (sticker shock). Pearson r computed on the raw ordinal numbers is −0.32 with p = 0.31 — weak and not significant. But the relationship is clearly monotonic: as price goes up, satisfaction goes down across most customers. Rank both variables (averaging ties), then compute Spearman: rho = −0.71 with p = 0.010. Pearson missed it because (a) the data are ordinal, not interval — the difference between “2” and “3” is not the same as between “4” and “5” — and (b) the two extreme customers anchored Pearson’s numerator. Ranks neutralize both problems.

Key Points

  • Ordinal scales and outliers torpedo Pearson but barely move Spearman.
  • Pearson r = −0.32 (n.s.) → Spearman rho = −0.71 (p = 0.010).
  • Ranks neutralize extreme values by capping their influence at min/max rank.
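
The last point, that ranks cap the influence of an extreme value, can be illustrated with a short Python sketch. The data below are hypothetical and are not the twelve customer ratings from this example, which are not listed in full; the sketch only shows that making the largest value far more extreme changes Pearson but leaves Spearman untouched.

```python
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def simple_ranks(values):
    """Ranks 1..n for data with no ties."""
    ordered = sorted(values)
    return [float(ordered.index(v) + 1) for v in values]

def spearman_rho(x, y):
    return pearson_r(simple_ranks(x), simple_ranks(y))

# Hypothetical data with no ties.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y_mild = [2, 1, 4, 3, 6, 5, 7, 8]
# Same ordering, but the largest y value is now wildly extreme.
y_extreme = [2, 1, 4, 3, 6, 5, 7, 800]

r_mild, r_extreme = pearson_r(x, y_mild), pearson_r(x, y_extreme)
rho_mild, rho_extreme = spearman_rho(x, y_mild), spearman_rho(x, y_extreme)

print(r_mild, r_extreme)      # Pearson changes with the magnitude
print(rho_mild, rho_extreme)  # Spearman is identical: the ranks did not change
```

However large the top value becomes, it can never rank higher than n, so its pull on rho is bounded.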

6. When r and Rho Disagree, What Does It Mean

A larger Spearman rho than Pearson r usually means the relationship is monotonic but not linear (think log-shaped or saturation curves). A larger Pearson than Spearman is rare and usually points to a near-linear pattern with a few influential points whose effect ranks dilute. When both r and rho are near zero, do not conclude that there is no relationship; go to the scatterplot, because you may have a non-monotonic pattern such as an inverted U (for example, running speed against age over a wide age range), which neither coefficient handles. In that case fit a non-linear model or split the range. Kendall’s tau is a third option that measures the proportion of concordant pairs minus the proportion of discordant pairs; it usually runs smaller in magnitude than rho on the same data but has cleaner small-sample behavior and a more direct probabilistic interpretation.

Key Points

  • rho > r: monotonic but curved relationship.
  • r > rho: linear with a few influential points.
  • Both near zero: check the scatterplot for non-monotonic patterns first.
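
Two of these patterns are easy to reproduce with synthetic data. The Python sketch below uses an exponential curve (monotonic but curved) and a symmetric inverted U (non-monotonic); both datasets and all helper names are hypothetical illustrations.

```python
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Ranks 1..n, smallest first; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    return pearson_r(average_ranks(x), average_ranks(y))

# Monotonic but curved: y doubles with each step in x.
x_curve = list(range(1, 9))
y_curve = [2 ** v for v in x_curve]
r_curve = pearson_r(x_curve, y_curve)       # noticeably below 1
rho_curve = spearman_rho(x_curve, y_curve)  # exactly 1: the order is preserved

# Non-monotonic: a symmetric inverted U centred on x = 0.
x_u = list(range(-3, 4))
y_u = [-(v * v) for v in x_u]
r_u = pearson_r(x_u, y_u)       # zero: rise and fall cancel
rho_u = spearman_rho(x_u, y_u)  # also zero, despite a perfect functional relationship

print(r_curve, rho_curve, r_u, rho_u)
```

The inverted U is the cautionary case: y is completely determined by x, yet both coefficients report nothing, which is why the scatterplot has to come first.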

7. Running the Comparison in StatsIQ

Snap a photo of a paired-variable scatterplot or dataset and StatsIQ computes Pearson, Spearman, and Kendall coefficients side by side, runs the appropriate significance test for each, and flags the case where the three disagree enough that you should look at the scatterplot before reporting. It also handles the tie-corrected formulas automatically. This content is for educational purposes only.

Key Points

  • All three coefficients computed side by side with appropriate p-values.
  • Disagreement among the three is flagged so you re-examine the relationship.
  • Tie correction applied automatically when ranks are tied.

Key Takeaways

  • Pearson r assumes interval scales, bivariate normality, and linearity; sensitive to outliers.
  • Spearman rho = Pearson r applied to ranks; tests monotonic association.
  • Simplified Spearman (no ties): rho = 1 − 6Σd² / [n(n² − 1)].
  • r² is the proportion of variance explained ONLY if the relationship is linear.
  • Kendall’s tau is preferred for small n or many ties; usually lower than rho on same data.
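
For the Kendall’s tau takeaway, here is a minimal Python sketch of the concordant-minus-discordant definition. It implements the simplest variant (often called tau-a), which assumes no ties; tied data would need the tie-adjusted tau-b instead. The five data pairs are hypothetical and are already ranks, so rho can be computed with the simplified formula for comparison.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs; assumes no ties in x or y."""
    concordant = discordant = 0
    for (xa, ya), (xb, yb) in combinations(zip(x, y), 2):
        sign = (xa - xb) * (ya - yb)
        if sign > 0:
            concordant += 1   # the pair is ordered the same way on x and y
        elif sign < 0:
            discordant += 1   # the pair is ordered in opposite ways
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical data; the values are already ranks 1..5 with no ties.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]

tau = kendall_tau_a(x, y)

# Spearman's rho on the same data via the no-ties shortcut.
n = len(x)
d2 = sum((a - b) ** 2 for a, b in zip(x, y))
rho = 1 - 6 * d2 / (n * (n * n - 1))

print(tau, rho)
```

With 8 concordant and 2 discordant pairs out of 10, tau comes out below rho on the same data, in line with the takeaway above.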

Practice Questions

1. You compute r = 0.10 (p = 0.45) but rho = 0.60 (p = 0.002) on the same n = 30 pairs. What is the most likely explanation?
A monotonic but strongly non-linear relationship, possibly with curvature or saturation. Pearson missed it because it tests linearity; rho found it because it only requires monotonicity. Plot the data and fit a non-linear model (log, power, exponential) before reporting.
2. Your data are bivariate normal interval scales, n = 200. Which coefficient gives you the most power?
Pearson r. With clean assumptions and large n, Pearson is fully efficient and supplies a directly interpretable r² for variance explained. Reporting Spearman additionally is fine, but Pearson is the primary statistic.
3. Why might Kendall’s tau be preferred over Spearman in a paper with n = 12 ordinal observations?
Tau has better small-sample properties — its sampling distribution is closer to normal at small n, and it is more conservative with many ties. Tau also has a cleaner probabilistic interpretation (P(concordant) − P(discordant)) that some reviewers prefer.

FAQs

Common questions about this topic

Running Pearson on Likert data is widespread but technically wrong, because Likert scales are ordinal, not interval: the psychological distance from 1 to 2 is not the same as from 4 to 5. Spearman or Kendall is the principled choice. In practice with n ≥ 30 and a 5- or 7-point scale, Pearson on Likert data usually gives similar conclusions, but reviewers in psychology and education journals often demand the rank-based version.

Spearman handles MONOTONIC relationships, meaning those that consistently increase or consistently decrease. Linear, log, exponential, and saturation curves are all monotonic, so Spearman will pick them up. An inverted U or any relationship that goes up then down is NOT monotonic, and Spearman will be near zero. For non-monotonic patterns use a scatterplot followed by a non-linear regression or piecewise model.

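Report rho (or ρ_s) with the sample size, the exact p-value, and a 95% confidence interval. The CI is typically obtained via Fisher’s z transformation: z = 0.5 × ln[(1+ρ)/(1−ρ)], with SE ≈ 1/√(n−3); compute z ± 1.96 × SE and back-transform the two bounds. This SE is borrowed from Pearson’s r and gives an interval that is slightly too narrow for rho, so treat it as approximate. R’s cor.test and Python’s scipy.stats.spearmanr return the coefficient and p-value but not this interval, so compute it separately or use a bootstrap. Always include a scatterplot alongside.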
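
The Fisher z interval described above takes only a few lines of Python. This is a sketch under the stated approximation SE = 1/√(n−3); the function name `fisher_ci` is hypothetical, and the inputs reuse the rho = −0.71, n = 12 figures quoted in Worked Example 2.

```python
from math import atanh, sqrt, tanh

def fisher_ci(rho, n, z_crit=1.96):
    """Approximate CI for a correlation via Fisher's z (default is 95%).

    atanh(rho) equals 0.5 * ln((1 + rho) / (1 - rho)); tanh is its inverse.
    """
    z = atanh(rho)
    se = 1 / sqrt(n - 3)  # approximation borrowed from Pearson's r
    return tanh(z - z_crit * se), tanh(z + z_crit * se)

lower, upper = fisher_ci(-0.71, 12)
print(round(lower, 2), round(upper, 2))
```

With only 12 observations the interval is wide, which is a useful reminder that a significant rho at small n still leaves a lot of uncertainty about its size.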

Both Pearson and Spearman measure ASSOCIATION between two variables in a sample. Neither can rule out confounding (a third variable driving both), reverse causation (Y → X rather than X → Y), or selection bias. To support a causal claim you need an experimental design (randomization), a well-identified natural experiment, or a causal inference method like instrumental variables. The choice between r and rho does not change this — both are descriptive.

Snap a photo of the paired data; StatsIQ runs all three coefficients (Pearson, Spearman, Kendall), checks linearity with a residual plot, screens for outliers and ordinal scales, and recommends which to report — usually Spearman or Kendall when assumptions fail. Effect sizes and CIs are computed in each case. This content is for educational purposes only.
