R-Squared (Coefficient of Determination)

R² = 1 - SS_res / SS_tot

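The coefficient of determination (R-squared) measures the proportion of variance in the dependent variable that is explained by the independent variable(s) in a regression model. For a least-squares fit that includes an intercept, evaluated on the data used to fit it, R² ranges from 0 to 1, where 1 means the model explains all of the variability and 0 means it explains none. For a model without an intercept, or when evaluated on new data, R² can be negative.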

Variables

R² = Coefficient of Determination

The proportion of variance explained by the model; between 0 and 1 for a least-squares fit with an intercept

SS_res = Residual Sum of Squares

The sum of squared differences between observed and predicted values: Σ(yᵢ - ŷᵢ)²

SS_tot = Total Sum of Squares

The sum of squared differences between observed values and the mean: Σ(yᵢ - ȳ)²
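The three definitions above translate directly into code. The following is a minimal sketch in plain Python; the function name `r_squared` and the sample data are hypothetical and chosen only for illustration.

```python
def r_squared(observed, predicted):
    """Return R² = 1 - SS_res / SS_tot for paired observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_y = sum(observed) / len(observed)
    # SS_res: squared differences between observed and predicted values
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    # SS_tot: squared differences between observed values and their mean
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    if ss_tot == 0:
        raise ValueError("SS_tot is zero: observed values have no variance")
    return 1 - ss_res / ss_tot


# Hypothetical data, for illustration only
observed = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.5, 9.5]
print(round(r_squared(observed, predicted), 4))  # 0.95 (SS_res = 1, SS_tot = 20)
```

The guard on `ss_tot == 0` is a design choice in this sketch: when every observed value is identical, the ratio is undefined, so raising an error is safer than returning a misleading number.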

Example Calculation

Scenario

A regression model predicting house prices has SS_tot = 500,000 and SS_res = 75,000. What proportion of variance does the model explain?

Given Data

SS_tot: 500,000

SS_res: 75,000

Calculation

R² = 1 - SS_res/SS_tot = 1 - 75,000/500,000 = 1 - 0.15

Result

R² = 0.85

Interpretation

The model explains 85% of the variability in house prices. The remaining 15% reflects factors not included in the model or random variation. This indicates a strong fit to the data, although a high R² alone does not show that the model is correctly specified.
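The arithmetic in this example can be checked with a few lines of Python. This is only a sketch that plugs in the two sums of squares given in the scenario; the variable names are illustrative.

```python
# Values taken from the scenario above
ss_tot = 500_000
ss_res = 75_000

unexplained = ss_res / ss_tot   # share of variance the model leaves unexplained
r2 = 1 - unexplained            # share of variance the model explains

print(f"Unexplained share = {unexplained:.2f}")  # Unexplained share = 0.15
print(f"R² = {r2:.2f}")                          # R² = 0.85
```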

When to Use This Formula

✓ Evaluating how well a regression model fits the data
✓ Comparing the explanatory power of different regression models
✓ Communicating model performance in a simple, interpretable metric

Common Mistakes

✗ Assuming a high R² means the model is correct or that causation has been established
✗ Not using adjusted R² when comparing models with different numbers of predictors
✗ Interpreting R² as the correlation coefficient (in simple regression R² = r², so the square root of R² gives only |r| and loses the sign of r)
✗ Ignoring that R² never decreases when more predictors are added, even if they are irrelevant
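The distinction between R² and the correlation coefficient is easiest to see with a negative relationship. The sketch below fits a simple least-squares line in plain Python and compares Pearson's r with R². The data and the helper name `fit_and_compare` are hypothetical, for illustration only.

```python
import math


def fit_and_compare(x, y):
    """Fit y = a + b*x by least squares; return (slope, pearson_r, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)  # this is SS_tot
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / math.sqrt(sxx * syy)

    # R² from its definition: 1 - SS_res / SS_tot
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_res / syy
    return slope, pearson_r, r_squared


# Hypothetical data with a decreasing trend
x = [1, 2, 3, 4, 5]
y = [10, 8, 7, 4, 3]
slope, r, r2 = fit_and_compare(x, y)

print(f"slope = {slope:.2f}, r = {r:.3f}, R² = {r2:.3f}")
print(f"sqrt(R²) = {math.sqrt(r2):.3f}")  # positive, although r is negative
```

Here r is negative because y falls as x rises, while R² and its square root are positive. The sign has to be read from r or from the slope, not from R².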


FAQs

Common questions about this formula

R² never decreases when more predictors are added, regardless of whether they truly improve the model. Adjusted R² penalizes the number of predictors: it increases only if a new predictor improves the model more than would be expected by chance. Always use adjusted R² when comparing models with different numbers of predictors.
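Adjusted R² is commonly defined as 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p is the number of predictors. The page does not state this formula, so treat the sketch below as an illustration under that standard definition. The function name and the sample values are hypothetical.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R² for n observations and p predictors (intercept not counted)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical comparison: three extra predictors raise R² only slightly
n = 50
small = adjusted_r_squared(0.850, n, p=3)
large = adjusted_r_squared(0.852, n, p=6)

print(f"3 predictors: R² = 0.850, adjusted R² = {small:.3f}")  # 0.840
print(f"6 predictors: R² = 0.852, adjusted R² = {large:.3f}")  # 0.831
```

In this made-up comparison the larger model has the higher R² but the lower adjusted R², which is the situation the penalty is meant to expose.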

There is no universal threshold. In physical sciences, R² above 0.9 is common. In social sciences, R² of 0.3 to 0.5 may be considered good. What matters is the context and whether the model serves its practical purpose. A low R² can still indicate a statistically significant and meaningful relationship.
