What is considered an acceptable R-squared value in multivariate regression to determine the absence of multicollinearity among predictors?
In multivariate regression, the overall R-squared value does not directly determine the absence of multicollinearity among predictors; multicollinearity is typically assessed with other statistical measures. That said, R-squared does appear in one common diagnostic, so it is worth clarifying the distinction.
Understanding Multicollinearity
Multicollinearity occurs when two or more predictors in a regression model are highly correlated, meaning they provide redundant information about the variance in the dependent variable. This can lead to inflated standard errors and unstable estimates of coefficients.
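As an illustrative sketch (assuming NumPy and statsmodels are available; all variable names are hypothetical), the following simulation shows how coefficient standard errors inflate when two predictors are nearly collinear:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
noise = rng.normal(size=n)

designs = {
    "independent": rng.normal(size=n),
    "collinear": x1 + rng.normal(scale=0.05, size=n),
}
for label, x2 in designs.items():
    y = x1 + x2 + noise
    X = sm.add_constant(np.column_stack([x1, x2]))
    fit = sm.OLS(y, X).fit()
    # fit.bse holds the coefficient standard errors; they blow up
    # when x1 and x2 are nearly collinear.
    print(label, fit.bse[1:])
```

The coefficients in the collinear design remain unbiased, but their standard errors are far larger, which is exactly the instability described above.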
Measures to Detect Multicollinearity
- Variance Inflation Factor (VIF): This is one of the most common measures. A VIF value greater than 10 (some use 5 as a threshold) suggests significant multicollinearity.
- Tolerance: This is the reciprocal of VIF. A tolerance value less than 0.1 (or sometimes 0.2) indicates multicollinearity.
- Condition Index: Values above 15 suggest moderate multicollinearity, and above 30 suggest severe multicollinearity.
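As a minimal sketch, the VIF check can be done with statsmodels' variance_inflation_factor (the DataFrame and its columns are placeholders for your own predictors):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

def vif_table(df: pd.DataFrame) -> pd.Series:
    """Return the VIF of each predictor column in df."""
    # Add an intercept so the VIFs match their usual definition.
    X = sm.add_constant(df)
    vifs = {col: variance_inflation_factor(X.values, i)
            for i, col in enumerate(X.columns) if col != "const"}
    return pd.Series(vifs, name="VIF")
```

Predictors whose VIF exceeds your chosen threshold (5 or 10) are candidates for dropping, combining, or regularizing.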
R-squared and Multicollinearity
R-squared values can indirectly provide insight into multicollinearity in certain contexts:
- High R-squared in predictor regressions: If you regress one predictor on all other predictors and find a high R-squared value (close to 1), it indicates that the predictor is highly collinear with the others.
Interpreting Acceptable R-squared Values
There isn't a specific "acceptable" R-squared value that universally determines the absence of multicollinearity, but in practice, you can consider the following approach:
- Auxiliary Regression Analysis: Perform an auxiliary regression where each predictor is regressed on the remaining predictors. If the R-squared value of any of these regressions is very high (close to 1), it suggests multicollinearity.
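A minimal sketch of this auxiliary-regression loop, assuming the predictors sit in a pandas DataFrame:

```python
import pandas as pd
import statsmodels.api as sm

def auxiliary_r2(df: pd.DataFrame) -> pd.Series:
    """Regress each predictor on all the others and report R-squared."""
    scores = {}
    for col in df.columns:
        X = sm.add_constant(df.drop(columns=col))
        scores[col] = sm.OLS(df[col], X).fit().rsquared
    return pd.Series(scores, name="auxiliary_R2")
```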
Practical Guidelines
- VIF < 5-10: Values below this range are generally taken to indicate the absence of serious multicollinearity.
- R-squared in auxiliary regressions < 0.8-0.9: Values above this range indicate potential multicollinearity.
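These two guidelines are the same rule in different units: for predictor j, VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared from the auxiliary regression of predictor j on the others. An auxiliary R-squared of 0.8 therefore corresponds to a VIF of 5, and 0.9 to a VIF of 10.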
Example
If you have predictors X1, X2, and X3, and regress X1 on X2 and X3 and find an R-squared of 0.95, this suggests multicollinearity. Similarly, high R-squared values for regressions of X2 on X1 and X3, or of X3 on X1 and X2, would also suggest multicollinearity.
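To reproduce this scenario with simulated data (the names and coefficients here are hypothetical, chosen so the auxiliary R-squared lands near 0.95):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
# x1 is almost a linear combination of x2 and x3.
x1 = 0.6 * x2 + 0.4 * x3 + rng.normal(scale=0.15, size=n)

# Auxiliary regression of x1 on the other two predictors.
X = sm.add_constant(np.column_stack([x2, x3]))
r2 = sm.OLS(x1, X).fit().rsquared
print(f"auxiliary R-squared for x1: {r2:.3f}")  # close to 1 -> collinear
```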
Conclusion
While the R-squared values of auxiliary regressions can provide clues about multicollinearity, R-squared is not a standalone measure. Instead, rely on VIF, tolerance, and the condition index for a more direct and reliable assessment.