Deriving OLS Estimates for a Simple Regression Model

Naman Agrawal
4 min read · Aug 4, 2020

The Simple Regression:

In econometrics, a simple regression is a tool used to establish a relationship between two variables. One of the variables (Y) is called the dependent variable or the regressand, while the other variable (X) is called the independent variable or the regressor. Mathematically, a simple regression model is expressed as:

$$Y_i = \alpha + \beta X_i + \varepsilon_i$$

Here α and β are the regression coefficients, i.e. the parameters that need to be estimated to understand the relation between Y and X. The subscript i on X and Y indicates that we are referring to a particular observation, a particular pair of values of X and Y. εᵢ is the error term associated with each observation i.
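
To make the setup concrete, here is a minimal Python sketch that simulates observations from this model. The specific values (α = 2, β = 0.5, normal errors) are assumptions chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumed "true" parameters, chosen only for this illustration
alpha_true, beta_true = 2.0, 0.5
n = 100

X = rng.uniform(0, 10, size=n)        # regressor: n observed values of X
eps = rng.normal(0, 1, size=n)        # error term with mean zero
Y = alpha_true + beta_true * X + eps  # Y_i = alpha + beta * X_i + eps_i
```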

Estimating α and β:

So, how do we estimate α and β? One of the most common approaches used by statisticians is the OLS approach. OLS stands for Ordinary Least Squares. Under this method, we try to find a linear function that minimizes the sum of the squares of the differences between the true values of Y and the predicted values of Y. Let the true value of Y associated with each observation be Yᵢ and the value predicted by our model be α + βXᵢ. So, we are essentially trying to minimize:

$$\sum_{i=1}^{n}\left(Y_i - \alpha - \beta X_i\right)^2$$

where the index i runs from 1 to n, indicating that we have n observations of X and Y in our dataset.
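
As a sketch, this objective is easy to express in code. The function below (an illustrative helper, not from any library) evaluates the sum of squared residuals for candidate values of α and β, using the simulated X and Y from the earlier snippet:

```python
import numpy as np

def sum_squared_residuals(alpha: float, beta: float,
                          X: np.ndarray, Y: np.ndarray) -> float:
    """Sum over i of (Y_i - alpha - beta*X_i)^2, the quantity OLS minimizes."""
    residuals = Y - (alpha + beta * X)
    return float(np.sum(residuals ** 2))
```

OLS picks the pair (α, β) for which this function is smallest.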

Graphs can make this more intuitive: a scatter plot of the observations with the fitted line shows that OLS chooses the line minimizing the sum of the squared vertical distances (residuals) from each point to the line.

Derivation:

So, now that we know what OLS is and what it attempts to do, we can begin our derivation for estimates of α and β.

Step 1: Defining the OLS function

OLS, as described earlier, is a function of α and β. So our function can be expressed as:

$$f(\alpha, \beta) = \sum_{i=1}^{n}\left(Y_i - \alpha - \beta X_i\right)^2$$

Step 2: Minimizing our function by taking partial derivatives and equating them to zero.

First, we take the partial derivative of f(α, β) with respect to α, and equate the derivative to zero to minimize the function over α:

$$\frac{\partial f}{\partial \alpha} = -2\sum_{i=1}^{n}\left(Y_i - \hat{\alpha} - \hat{\beta}X_i\right) = 0$$

Equation 1

Note: We have replaced α and β with α-hat and β-hat to indicate that we are finding estimates of the regression coefficients.

Similarly, we take the partial derivative of f(α, β) with respect to β, and equate the derivative to zero to minimize the function over β:

$$\frac{\partial f}{\partial \beta} = -2\sum_{i=1}^{n} X_i\left(Y_i - \hat{\alpha} - \hat{\beta}X_i\right) = 0$$

Equation 2

Since f(α, β) is a convex function of α and β, these first-order conditions identify the minimum.
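
Both first-order conditions can be checked mechanically. Here is a small symbolic sketch using SymPy, assuming a tiny dataset of three observations so the output stays readable; it differentiates the objective and solves the resulting normal equations:

```python
import sympy as sp

alpha, beta = sp.symbols('alpha beta')
n = 3                            # three observations keep the output readable
X = sp.symbols(f'X1:{n + 1}')    # (X1, X2, X3)
Y = sp.symbols(f'Y1:{n + 1}')    # (Y1, Y2, Y3)

# OLS objective: f(alpha, beta) = sum of squared residuals
f = sum((Y[i] - alpha - beta * X[i]) ** 2 for i in range(n))

# First-order conditions: Equation 1 and Equation 2
eq1 = sp.Eq(sp.diff(f, alpha), 0)
eq2 = sp.Eq(sp.diff(f, beta), 0)

sol = sp.solve([eq1, eq2], [alpha, beta], dict=True)[0]
print(sp.simplify(sol[beta]))   # should match the beta-hat formula derived below
print(sp.simplify(sol[alpha]))  # should match alpha-hat = Y-bar - beta-hat * X-bar
```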

Step 3: We solve Equation 1 in order to obtain a relation between α-hat and β-hat.

From Equation 1 (dropping the factor of −2):

$$\sum_{i=1}^{n}\left(Y_i - \hat{\alpha} - \hat{\beta}X_i\right) = 0$$

Dividing by n (the number of observations) on both sides of the equation:

$$\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{\alpha} - \hat{\beta}X_i\right) = 0$$

On splitting up the expression and solving further:

$$\frac{1}{n}\sum_{i=1}^{n}Y_i - \hat{\alpha} - \hat{\beta}\,\frac{1}{n}\sum_{i=1}^{n}X_i = 0$$

Equation 3

Now, we define X̅ and Y̅ as the means of the observations under X and Y from our dataset. Mathematically:

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n}X_i, \qquad \bar{Y} = \frac{1}{n}\sum_{i=1}^{n}Y_i$$

Using these in Equation 3:

$$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$$

Equation 4

Step 4: We solve Equation 2, using the results from Equations 1 and 4, to get an estimate for β-hat.

First, we multiply Equation 1 by X̅:

$$\sum_{i=1}^{n}\bar{X}\left(Y_i - \hat{\alpha} - \hat{\beta}X_i\right) = 0$$

Subtracting this from Equation 2:

$$\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \hat{\alpha} - \hat{\beta}X_i\right) = 0$$

Using Equation 4, α̂ = Y̅ − β̂X̅, and substituting this value of α-hat in the previous equation:

$$\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left[\left(Y_i - \bar{Y}\right) - \hat{\beta}\left(X_i - \bar{X}\right)\right] = 0$$

$$\hat{\beta} = \frac{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)}{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}$$

This is the required expression for estimating β-hat: the sample covariance between X and Y divided by the sample variance of X.

To obtain the expression for calculating α-hat, we substitute the expression for β-hat in Equation 4:

$$\hat{\alpha} = \bar{Y} - \frac{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)}{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}\,\bar{X}$$

Thus, we have derived the OLS estimators.
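
The closed-form expressions translate directly into NumPy. Below is a minimal sketch, with simulated data as an assumption and a cross-check against np.polyfit, which fits the same least-squares line:

```python
import numpy as np

def ols_simple(X: np.ndarray, Y: np.ndarray) -> tuple[float, float]:
    """Closed-form OLS estimates for the model Y = alpha + beta*X + eps."""
    x_bar, y_bar = X.mean(), Y.mean()
    beta_hat = np.sum((X - x_bar) * (Y - y_bar)) / np.sum((X - x_bar) ** 2)
    alpha_hat = y_bar - beta_hat * x_bar      # Equation 4
    return alpha_hat, beta_hat

# Simulated data (assumed true values: alpha = 2, beta = 0.5)
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=100)
Y = 2.0 + 0.5 * X + rng.normal(0, 1, size=100)

alpha_hat, beta_hat = ols_simple(X, Y)
beta_ref, alpha_ref = np.polyfit(X, Y, deg=1)  # polyfit returns (slope, intercept)
assert np.isclose(alpha_hat, alpha_ref) and np.isclose(beta_hat, beta_ref)
print(f"alpha-hat = {alpha_hat:.3f}, beta-hat = {beta_hat:.3f}")
```

Both routes agree because np.polyfit with degree 1 solves exactly the same least-squares problem.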

Why OLS?

We have seen that the OLS estimators are obtained by minimizing the sum of the squares of the differences between the true and estimated values of the regressand. We could instead add up the absolute values of the differences, rather than the squares. The estimator so formed is called the least absolute deviations (LAD) estimator. There are other estimators as well, such as the reverse least squares estimator, which regresses X on Y instead of Y on X. So, why do we choose OLS?

There are multiple reasons for this choice. The foremost is that, under the classical linear regression assumptions, OLS estimators are unbiased. Moreover, by the Gauss–Markov theorem, they are efficient (they have the lowest variance among all linear unbiased estimators), and they are consistent: the estimates converge to the true parameter values as the sample size grows.
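
The unbiasedness claim can be illustrated with a quick Monte Carlo sketch (the true values α = 2, β = 0.5 and the normal error distribution are assumptions chosen for the simulation): averaged over many simulated samples, β-hat lands very close to the true β.

```python
import numpy as np

rng = np.random.default_rng(42)
alpha_true, beta_true = 2.0, 0.5   # assumed true parameters
n, n_sims = 50, 5000               # sample size and number of simulations

beta_hats = np.empty(n_sims)
for s in range(n_sims):
    X = rng.uniform(0, 10, size=n)
    Y = alpha_true + beta_true * X + rng.normal(0, 1, size=n)
    x_bar = X.mean()
    # Closed-form beta-hat from the derivation above
    beta_hats[s] = np.sum((X - x_bar) * (Y - Y.mean())) / np.sum((X - x_bar) ** 2)

print(f"mean of beta-hat over {n_sims} samples: {beta_hats.mean():.4f}")
print(f"true beta: {beta_true}")   # the two should be very close
```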
