Linear Regression Theory

The term "linearity" in algebra refers to a linear relationship between two or more variables. If we draw this relationship in a two-dimensional space (between two variables, in this case), we get a straight line.

Linear regression assumes a linear or straight line relationship between the input variables (X) and the single output variable (y).

More specifically, the output (y) can be calculated from a linear combination of the input variables (X). When there is a single input variable, the method is referred to as simple linear regression. In simple linear regression, we can use statistics on the training data to estimate the coefficients the model needs to make predictions on new data.

The line for a simple linear regression model can be written as:

y = b0 + b1 * x

where b0 and b1 are the coefficients we must estimate from the training data.
This reminds me of my high school geometry lessons. It's funny how this points out that when we write the equation of a line we are essentially predicting all the points that can lie on that line. 
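
To make that idea concrete, here is a minimal sketch of the line equation as a function. The name `line_y` and the coefficient values are made up for illustration; they are not estimated from any data.

```python
def line_y(x, b0, b1):
    """Return the y value that the line y = b0 + b1 * x gives for x."""
    return b0 + b1 * x


# Hypothetical coefficients, picked only to show the idea.
b0, b1 = 1.0, 2.0

# Every x we plug in yields a point that lies on the line.
points = [(x, line_y(x, b0, b1)) for x in range(5)]
print(points)
```

Once b0 and b1 are fixed, every point on the line is determined, which is why estimating those two numbers is the whole job in simple linear regression.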


There are a few steps to implement a linear regression model:

  1. Calculate Mean and Variance
  2. Calculate Covariance
  3. Estimate Coefficients
  4. Make Predictions
  5. Predict Value
Calculating the mean and variance helps determine the spread of the data. This helps in particular to assess whether our predictions are in sync with the training data. The predictions usually lie within one or two standard deviations of the mean. The mean, variance, and covariance are also exactly the quantities the coefficient estimates are built from. I found myself falling back on my previous statistics course multiple times during this module. That helped me understand why we do all these pre-flight steps before starting with the model: why we cannot just plug in some guessed coefficients, and how their values are actually determined.
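
The steps listed above can be sketched in plain Python roughly as follows. This is my own minimal sketch, not the course's code: the function names and the toy data are assumptions, and I use sums of squared deviations instead of the averaged variance and covariance, since the division by n cancels out when computing b1.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def variance(values, mean_value):
    """Sum of squared deviations from the mean (not divided by n)."""
    return sum((v - mean_value) ** 2 for v in values)


def covariance(x, mean_x, y, mean_y):
    """Sum of products of deviations of x and y (not divided by n)."""
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))


def coefficients(x, y):
    """Estimate b0 and b1 for the line y = b0 + b1 * x."""
    mean_x, mean_y = mean(x), mean(y)
    b1 = covariance(x, mean_x, y, mean_y) / variance(x, mean_x)
    b0 = mean_y - b1 * mean_x
    return b0, b1


def predict(x_new, b0, b1):
    """Predict y for each value in x_new."""
    return [b0 + b1 * xi for xi in x_new]


# Made-up toy data, only for illustration.
x = [1, 2, 4, 3, 5]
y = [1, 3, 3, 2, 5]

b0, b1 = coefficients(x, y)
predictions = predict(x, b0, b1)
print(round(b0, 3), round(b1, 3))
```

For this toy data the estimates come out at roughly b0 = 0.4 and b1 = 0.8. The point is that the coefficients fall straight out of the means, the variance of x, and the covariance of x and y, which is why those are computed first.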


The key thing I realized in this module was that just knowing the pre-flight steps isn't enough, and that a directed online course may have its own caveats.
