Understanding Multicollinearity in Regression and How to Deal With It

In this article we will learn about what is multicollinearity in regression? and how to handle it.

Introduction – What is Multicollinearity?

Multicollinearity is a condition in regression analysis, where two or more independent variables in the regression model are having higher mutual correlation. In other words, multicollinearity occurs when there exist a linear relationship between two more variables of the regression model.

“Multicollinearity is a condition that arises when two or more independent variables in a multiple regression model are highly correlated with each other, such that it becomes difficult to disentangle the effects of the different independent variables on the dependent variable.” (Kutner, Nachtsheim, Neter, & Li, 2005, p. 198)

“Multicollinearity refers to the situation in which two or more predictor variables in a multiple regression model are highly correlated, meaning that they are measuring similar aspects of the same underlying phenomenon.” (Tabachnick & Fidell, 2013, p. 75)

The multicollinearity makes it difficult to interpret as well quantify the relationship between the independant and dependant variables.

Types of Multicollinearity

Examples of Multicollinearity.

1. In an employee dataset, where the objective is to predict the salary. Age of employee, qualification, and the experience can act as independant variables. In general with age, expereience also increases therefore generally possess higher mutual correlation which can be considered as multicollinearity condition.

2. Another example is, in climate studies one may be interested in global temperature changes. For this purpose, different factors considered can be deforestation rate, carbon emissions, population, automobile population, etc.  In these factors, human population, automobile population, and carbon emissions can have higher correlation leading to multicollinearity.

Why Multicollinearity Needs to be Addressed?

How to Detect Mlticollinearity?

How to Handle Multicollinearity?

Summary

Further Reading

Leave a Comment