Multiple Linear Regression

Almost all the real-world problems that we encounter involve more than two variables. Linear regression with more than one independent variable is called "multiple linear regression" (MLR). The steps to perform multiple linear regression are almost the same as those for simple linear regression; the difference lies in the evaluation. You can use it to find out which factor has the highest impact on the predicted output and how the different variables relate to each other.
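To make the "which factor has the highest impact" idea concrete, here is a minimal sketch using NumPy's least-squares solver. Everything in it is an assumption for illustration: the data is synthetic, the names x1, x2 and y are made up, and comparing standardized coefficients is just one common way to rank predictors, not the only one.

```python
import numpy as np

# Synthetic data (made up for illustration): y depends strongly on x1
# and only weakly on x2, plus an intercept and a little noise.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 3.0 * x1 + 0.5 * x2 + 2.0 + rng.normal(scale=0.1, size=n)

X = np.column_stack([x1, x2])

# Standardize each predictor so the fitted coefficients are on a
# comparable scale (otherwise units would distort the comparison).
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Add a column of ones for the intercept and solve the least-squares problem.
A = np.column_stack([np.ones(n), X_std])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, slopes = coef[0], coef[1:]

# The predictor with the largest absolute standardized coefficient
# has the biggest impact on the prediction in this model.
most_influential = int(np.argmax(np.abs(slopes)))

# Correlation matrix: how the predictors relate to each other.
corr = np.corrcoef(X, rowvar=False)

print("intercept:", intercept)
print("standardized slopes:", slopes)
print("most influential predictor index:", most_influential)
print("predictor correlation matrix:\n", corr)
```

In this toy setup the first predictor comes out as the most influential, which matches how the data was generated. On real data the ranking is only as trustworthy as the model, and strongly correlated predictors can make individual coefficients hard to interpret.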

MLR was easy to catch on to because it builds upon the straight-line concept, only this time it is better suited to real-world datasets. I wanted to use a fresh dataset for it, so I started looking for an unclean dataset, hoping to go through all the steps I had learned so far. I stumbled upon the World Bank data and, after some trial and error, downloaded the dataset I wanted to work with.

The first thing I noticed was that the dataset has a lot of variables. With pre-defined datasets, it is easy to identify the dependent and independent variables. With a fresh dataset, you pretty much don't know where to start, and I found it very overwhelming. I started by importing the data into a pandas DataFrame and running the head() command. I was again overwhelmed by the results. I got flustered and went back to the CSV to start cleaning it manually, thinking that doing it by hand would give me a better idea of how to do it in Python.
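For reference, a first look at an unfamiliar CSV in pandas might go something like the sketch below. The tiny inline table, its column names and the ".." marker for missing values are all assumptions made up for illustration; this is not the dataset I actually downloaded.

```python
import io
import pandas as pd

# A made-up stand-in for a downloaded CSV (hypothetical countries and columns).
# ".." is assumed here to be the placeholder used for missing values.
raw_csv = """country,year,gdp_per_capita,life_expectancy,population
Aland,2019,41000,81.2,5100000
Borduria,2019,..,74.5,9800000
Cascadia,2019,52000,..,16200000
"""

# Treat ".." as missing so numeric columns are parsed as numbers, not strings.
df = pd.read_csv(io.StringIO(raw_csv), na_values=[".."])

print(df.shape)    # number of rows and columns
print(df.head())   # first few rows
print(df.dtypes)   # which columns were parsed as numeric

# Count the missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Numeric columns are the candidates for dependent/independent variables.
numeric_columns = df.select_dtypes(include="number").columns.tolist()
print(numeric_columns)
```

Checking the shape, the dtypes and the missing-value counts before anything else gives a rough map of the data, which can make a wide table feel less overwhelming than head() alone.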

Cleaning the CSV by hand proved very tedious because of the sheer amount of data. So I went back to the website and filtered the data further to get a smaller and less complicated dataset. The smaller dataset, which had only half a year's data, was a little less daunting. Getting there took me a couple of tries spread over multiple days.

But this experimentation was taking longer than expected. I wanted to move forward in the module, so I decided to table the idea for now and work with the provided dataset so that I would at least get to practice.

