Coefficient of Determination
R² = 1 - SS_res / SS_tot
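The coefficient of determination (R-squared) measures the proportion of variance in the dependent variable that is explained by the independent variable(s) in a regression model. For a least-squares model with an intercept, evaluated on the data it was fitted to, it ranges from 0 to 1, where 1 means the model perfectly explains all variability and 0 means it explains none. In other settings, such as a model without an intercept or predictions scored on new data, the formula can produce negative values.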
Variables
R²: The proportion of variance explained by the model, between 0 and 1
SS_res: The sum of squared differences between observed and predicted values, Σ(yᵢ - ŷᵢ)²
SS_tot: The sum of squared differences between observed values and the mean, Σ(yᵢ - ȳ)²
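The definitions above translate directly into code. The following is a minimal sketch in plain Python, not a reference implementation; the function name `r_squared` and the sample numbers are illustrative assumptions.

```python
def r_squared(observed, predicted):
    """Return R² = 1 - SS_res / SS_tot for two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    mean_y = sum(observed) / len(observed)

    # SS_res: squared differences between observed and predicted values
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    # SS_tot: squared differences between observed values and their mean
    ss_tot = sum((y - mean_y) ** 2 for y in observed)

    if ss_tot == 0:
        raise ValueError("R² is undefined when all observed values are equal")

    return 1 - ss_res / ss_tot


# Hypothetical data, chosen only to illustrate the calculation
observed = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.5, 8.5]
print(r_squared(observed, predicted))  # approximately 0.95
```

Here SS_res = 1 and SS_tot = 20, so R² = 1 - 1/20.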
Example Calculation
Scenario
A regression model predicting house prices has SS_tot = 500,000 and SS_res = 75,000. What proportion of variance does the model explain?
Given Data
SS_tot = 500,000
SS_res = 75,000
Calculation
R² = 1 - SS_res/SS_tot = 1 - 75,000/500,000 = 1 - 0.15
Result
R² = 0.85
Interpretation
The model explains 85% of the variability in house prices. The remaining 15% is attributable to factors not included in the model or to random variation. This generally indicates a strong fit, although what counts as strong depends on the field.
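The arithmetic in this example can be checked with a few lines of Python. This is only a sketch using the two sums of squares given in the scenario; the variable names are illustrative.

```python
# Values taken from the house-price scenario
ss_tot = 500_000
ss_res = 75_000

unexplained = ss_res / ss_tot  # share of variance left unexplained
r2 = 1 - unexplained           # share of variance explained

print(f"R² = {r2:.2f}")                      # R² = 0.85
print(f"Unexplained share = {unexplained:.0%}")  # Unexplained share = 15%
```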
When to Use This Formula
- ✓ Evaluating how well a regression model fits the data
- ✓ Comparing the explanatory power of different regression models
- ✓ Communicating model performance in a simple, interpretable metric
Common Mistakes
- ✗ Assuming a high R² means the model is correct or that causation has been established
- ✗ Not using adjusted R² when comparing models with different numbers of predictors
- ✗ Interpreting R² as the correlation coefficient (in simple regression, R² is the square of r, so r = ±√R² with the same sign as the slope)
- ✗ Ignoring that R² never decreases when more predictors are added, even if they are irrelevant
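The distinction between r and R² in simple regression can be illustrated with a small sketch. The helper `simple_fit` and the data are hypothetical. It fits a one-predictor least-squares line by hand and shows that R² equals r squared, while r itself carries the sign of the slope.

```python
from math import sqrt


def simple_fit(x, y):
    """Fit y = intercept + slope * x by least squares; return (slope, r, R²)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)  # this is SS_tot

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient

    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    r2 = 1 - ss_res / syy
    return slope, r, r2


# Hypothetical data with a decreasing trend
x = [1, 2, 3, 4, 5]
y = [9.8, 8.1, 6.2, 3.9, 2.0]

slope, r, r2 = simple_fit(x, y)
print(slope < 0, r < 0)        # both negative: r follows the slope's sign
print(abs(r ** 2 - r2) < 1e-9)  # R² matches r squared
```

Reporting R² alone therefore hides the direction of the relationship, which is one reason the two quantities should not be used interchangeably.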
FAQs
Common questions about this formula
What is the difference between R² and adjusted R²?
R² never decreases when more predictors are added, regardless of whether they truly improve the model. Adjusted R² penalizes for the number of predictors: it only increases if a new predictor improves the model more than would be expected by chance. Use adjusted R² when comparing models with different numbers of predictors.
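A common form of the adjustment is 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. The sketch below applies it under assumed values: the function name, n = 50, and the predictor counts are hypothetical and not part of the house-price example.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Return adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1)."""
    dof = n_obs - n_predictors - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n_obs - 1) / dof


r2 = 0.85   # same raw R² for both hypothetical models
n_obs = 50  # assumed sample size

for p in (3, 10):
    print(p, round(adjusted_r_squared(r2, n_obs, p), 4))
# 3 predictors  -> about 0.8402
# 10 predictors -> about 0.8115
```

With the raw R² held fixed, the model with more predictors receives the lower adjusted value, which is the penalty at work.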
What is a good R² value?
There is no universal threshold. In the physical sciences, R² above 0.9 is common. In the social sciences, R² of 0.3 to 0.5 may be considered good. What matters is the context and whether the model serves its practical purpose. A low R² can still accompany a statistically significant and meaningful relationship.