
Coefficient of Determination

R² = 1 - SS_res / SS_tot

The coefficient of determination (R-squared) measures the proportion of variance in the dependent variable that is explained by the independent variable(s) in a regression model. It ranges from 0 to 1, where 1 means the model perfectly explains all variability and 0 means it explains none.

Variables

R²=Coefficient of Determination

The proportion of variance explained by the model, between 0 and 1

SS_res=Residual Sum of Squares

The sum of squared differences between observed and predicted values: Σ(yᵢ - ŷᵢ)²

SS_tot=Total Sum of Squares

The sum of squared differences between observed values and the mean: Σ(yᵢ - ȳ)²
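The three quantities above can be computed directly from observed and predicted values. A minimal sketch in plain Python (the function name is illustrative):

```python
def r_squared(y_obs, y_pred):
    """Coefficient of determination from observed and predicted values."""
    y_mean = sum(y_obs) / len(y_obs)
    # SS_res = Σ(yᵢ - ŷᵢ)²: squared differences between observed and predicted
    ss_res = sum((y - yh) ** 2 for y, yh in zip(y_obs, y_pred))
    # SS_tot = Σ(yᵢ - ȳ)²: squared differences between observed values and the mean
    ss_tot = sum((y - y_mean) ** 2 for y in y_obs)
    return 1 - ss_res / ss_tot
```

A perfect fit yields `r_squared(y, y) == 1.0`, since SS_res is zero.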

Example Calculation

Scenario

A regression model predicting house prices has SS_tot = 500,000 and SS_res = 75,000. What proportion of variance does the model explain?

Given Data

SS_tot:500,000
SS_res:75,000

Calculation

R² = 1 - SS_res/SS_tot = 1 - 75,000/500,000 = 1 - 0.15 = 0.85

Result

R² = 0.85

Interpretation

The model explains 85% of the variability in house prices. The remaining 15% is due to factors not included in the model or random variation. This indicates a strong model fit.
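The arithmetic in the worked example can be checked with a few lines of Python:

```python
# Given data from the house-price scenario
ss_tot = 500_000  # total sum of squares
ss_res = 75_000   # residual sum of squares

r2 = 1 - ss_res / ss_tot
print(r2)  # 0.85 → the model explains 85% of the variance
```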

When to Use This Formula

  • Evaluating how well a regression model fits the data
  • Comparing the explanatory power of different regression models
  • Communicating model performance in a simple, interpretable metric

Common Mistakes

  • Assuming a high R² means the model is correct or that causation has been established
  • Not using adjusted R² when comparing models with different numbers of predictors
  • Interpreting R² as the correlation coefficient (r is the square root of R² in simple regression)
  • Ignoring that R² always increases when more predictors are added, even if they are irrelevant
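The third mistake above (confusing R² with the correlation coefficient) can be made concrete: in simple linear regression, R² from the fitted line equals the square of Pearson's r. A sketch with illustrative data, using plain Python and no libraries:

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def r_squared_simple(x, y):
    """R² of the least-squares line y = slope*x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    pred = [slope * a + intercept for a in x]
    ss_res = sum((b - p) ** 2 for b, p in zip(y, pred))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
# For simple regression, R² and r² agree (up to floating-point error)
```

This equality holds only for simple (one-predictor) linear regression; with multiple predictors, R² is no longer the square of any single pairwise correlation.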


FAQs

Common questions about this formula

Why use adjusted R² instead of R²?

R² always increases when more predictors are added, regardless of whether they truly improve the model. Adjusted R² penalizes for the number of predictors: it only increases if a new predictor improves the model more than would be expected by chance. Always use adjusted R² when comparing models with different numbers of predictors.
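The adjusted R² described above has a standard closed form, R²_adj = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. A minimal sketch (function name and example values are illustrative):

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R²: penalizes R² for model complexity.

    r2 -- ordinary coefficient of determination
    n  -- number of observations
    p  -- number of predictors
    """
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Example: R² = 0.85 from 50 observations with 3 predictors.
# The adjusted value is slightly lower, reflecting the complexity penalty.
adj = adjusted_r_squared(0.85, 50, 3)
```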

What counts as a good R² value?

There is no universal threshold. In physical sciences, R² above 0.9 is common. In social sciences, R² of 0.3 to 0.5 may be considered good. What matters is the context and whether the model serves its practical purpose. A low R² can still indicate a statistically significant and meaningful relationship.
