# Ordinary Least Squares

## What is OLS?
Ordinary Least Squares (OLS) is the classical method for fitting a linear regression model. It finds the parameter values $\theta_0$ (intercept) and $\theta_1$ (slope) that minimize the sum of squared residuals — exactly the MSE cost function from the previous lesson — by solving a system of equations analytically (with a formula) rather than iteratively.
Because it gives you the exact minimum in one step, OLS is often called a closed-form solution.
## Deriving the OLS formulas

To minimize $J(\theta_0, \theta_1)$, take the partial derivatives with respect to each parameter, set them to zero, and solve.

Starting from:

$$J(\theta_0, \theta_1) = \frac{1}{m} \sum_{i=1}^{m} \left( \theta_0 + \theta_1 x^{(i)} - y^{(i)} \right)^2$$

Setting $\frac{\partial J}{\partial \theta_0} = 0$ yields:

$$\hat{\theta}_0 = \bar{y} - \hat{\theta}_1 \bar{x}$$

where $\bar{x}$ and $\bar{y}$ are the sample means.

Setting $\frac{\partial J}{\partial \theta_1} = 0$ and substituting the expression for $\hat{\theta}_0$ yields:

$$\hat{\theta}_1 = \frac{\sum_{i=1}^{m} (x^{(i)} - \bar{x})(y^{(i)} - \bar{y})}{\sum_{i=1}^{m} (x^{(i)} - \bar{x})^2}$$
These two formulas together are the OLS estimators. Plug in your data, compute the means, and you immediately have the best-fit slope and intercept.
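As a sketch, the two estimators translate directly into a few lines of NumPy (the function and variable names here are illustrative, not from any particular library):

```python
import numpy as np

def ols_fit(x, y):
    """Closed-form OLS for simple linear regression: y ~ theta0 + theta1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()  # sample means
    # Slope: covariance-style sum over variance-style sum
    theta1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    # Intercept: forces the fitted line through the point of means
    theta0 = y_bar - theta1 * x_bar
    return theta0, theta1

# Points lying exactly on y = 1 + 2x recover the line exactly:
theta0, theta1 = ols_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(theta0, theta1)  # 1.0 2.0
```

Because the data here fall exactly on a line, the residuals are zero and the formulas recover the line with no approximation.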
## Worked example
Suppose you have four training examples:
| $x$ (size, sq ft) | $y$ (price, $000) |
|---|---|
| 1000 | 200 |
| 1500 | 260 |
| 2000 | 330 |
| 2500 | 380 |
Step 1 — compute means:

$$\bar{x} = \frac{1000 + 1500 + 2000 + 2500}{4} = 1750 \qquad \bar{y} = \frac{200 + 260 + 330 + 380}{4} = 292.5$$

Step 2 — compute $\hat{\theta}_1$:

Numerator:

$$\sum_{i} (x^{(i)} - \bar{x})(y^{(i)} - \bar{y}) = (-750)(-92.5) + (-250)(-32.5) + (250)(37.5) + (750)(87.5) = 152{,}500$$

Denominator:

$$\sum_{i} (x^{(i)} - \bar{x})^2 = (-750)^2 + (-250)^2 + (250)^2 + (750)^2 = 1{,}250{,}000$$

$$\hat{\theta}_1 = \frac{152{,}500}{1{,}250{,}000} = 0.122$$

Step 3 — compute $\hat{\theta}_0$:

$$\hat{\theta}_0 = \bar{y} - \hat{\theta}_1 \bar{x} = 292.5 - 0.122 \times 1750 = 292.5 - 213.5 = 79$$

Result:

$$\hat{y} = 79 + 0.122\,x$$

Since prices are in thousands of dollars, this is a base of 79 (i.e., $79,000) plus about $122 per additional square foot.
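The arithmetic above is easy to cross-check in NumPy; `np.polyfit` with degree 1 solves the same least-squares problem:

```python
import numpy as np

# Data from the table: size (sq ft) and price ($000)
x = np.array([1000.0, 1500.0, 2000.0, 2500.0])
y = np.array([200.0, 260.0, 330.0, 380.0])

x_bar, y_bar = x.mean(), y.mean()  # 1750.0 and 292.5
theta1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
theta0 = y_bar - theta1 * x_bar

# Independent cross-check with NumPy's least-squares polynomial fit
slope, intercept = np.polyfit(x, y, 1)
print(theta1, theta0)   # slope 0.122, intercept 79 (up to rounding)
print(slope, intercept)
```

Both computations agree: the hand-derived slope and intercept match what NumPy's least-squares solver produces.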
## Properties of OLS estimators

Under standard assumptions (covered in the Assumptions lesson), OLS estimators have important statistical guarantees, summarized by the Gauss–Markov theorem: they are the Best Linear Unbiased Estimators (BLUE). This means:

- Unbiased: on average, $\hat{\theta}_1$ equals the true slope (and $\hat{\theta}_0$ the true intercept).
- Minimum variance: no other linear unbiased estimator has lower variance.
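Unbiasedness can be illustrated with a quick simulation sketch (the "true" parameters and Gaussian noise level below are assumptions made up for the demo): averaged over many noisy datasets, the estimated slope centers on the true slope.

```python
import numpy as np

rng = np.random.default_rng(42)
true_theta0, true_theta1 = 79.0, 0.122  # assumed "true" model for the demo
x = np.array([1000.0, 1500.0, 2000.0, 2500.0])

slopes = []
for _ in range(20_000):
    # Same x each time, fresh Gaussian noise on y
    y = true_theta0 + true_theta1 * x + rng.normal(0.0, 20.0, size=x.shape)
    x_bar, y_bar = x.mean(), y.mean()
    slopes.append(np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2))

print(np.mean(slopes))  # close to 0.122: individual fits vary, the average does not drift
```

Any single fit over- or under-estimates the slope because of the noise, but the mean across many fits sits on the true value, which is what "unbiased" means.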
## Limitations of the simple formulas
The formulas above work perfectly for simple linear regression (one feature). When you have multiple features, OLS generalizes to the Normal Equation in matrix form — covered in the Multiple Features lesson.
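As a preview of that matrix form, here is a minimal sketch: stack a column of ones (for the intercept) next to the feature column, then solve the Normal Equation $X^\top X \, \theta = X^\top y$ (using `np.linalg.solve` rather than an explicit matrix inverse, which is better numerically):

```python
import numpy as np

x = np.array([1000.0, 1500.0, 2000.0, 2500.0])
y = np.array([200.0, 260.0, 330.0, 380.0])

# Design matrix: the column of ones absorbs the intercept.
# With more features, you would simply append more columns.
X = np.column_stack([np.ones_like(x), x])

# Normal Equation: (X^T X) theta = X^T y
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # [intercept, slope], matching the simple formulas
```

For one feature this reproduces the scalar formulas exactly; the payoff of the matrix form is that the same two lines handle any number of features.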
## Key takeaway
OLS gives you the exact best-fit parameters in one calculation. There is no approximation, no iteration, and no learning rate to tune. For small to medium datasets with a modest number of features, it is the preferred approach.