The Line of Best Fit
The equation of a line
From school mathematics you may recall the equation of a straight line:
$$y = mx + b$$
where $m$ is the slope and $b$ is the y-intercept. In machine learning notation, the same equation is written:
$$\hat{y} = \theta_0 + \theta_1 x$$
- $\hat{y}$ (pronounced "y-hat") is the predicted value.
- $x$ is the input feature.
- $\theta_0$ (theta-zero) is the intercept: the value of $\hat{y}$ when $x = 0$.
- $\theta_1$ (theta-one) is the slope: how much $\hat{y}$ changes for a one-unit increase in $x$.
The two values $\theta_0$ and $\theta_1$ are called the model parameters or weights. Training a linear regression model means finding the values of these parameters that make the line fit the data as well as possible.
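As a minimal sketch, the straight-line model can be written as a one-line Python function. The function name `predict` and the parameter values below are hypothetical, chosen only to show the roles of the intercept and the slope.

```python
def predict(x: float, theta0: float, theta1: float) -> float:
    """Return the predicted value y-hat = theta0 + theta1 * x."""
    return theta0 + theta1 * x


# Hypothetical parameters, used only for illustration.
theta0, theta1 = 2.0, 0.5

# At x = 0 the prediction equals the intercept.
at_zero = predict(0.0, theta0, theta1)

# A one-unit increase in x changes the prediction by the slope.
step = predict(4.0, theta0, theta1) - predict(3.0, theta0, theta1)

print(at_zero, step)
```

Checking the prediction at zero and the change over one unit is a quick way to confirm which parameter is which.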
What the slope and intercept mean
Consider a model that predicts house price (in thousands of dollars) from size (in square feet):
$$\hat{y} = 50 + 0.15x$$
- Intercept $\theta_0 = 50$: a house with zero square feet would be predicted to cost \$50,000. (This may not be physically meaningful, but it anchors the line.)
- Slope $\theta_1 = 0.15$: each additional square foot adds \$150 to the predicted price.
Predictions
Given a trained model, making a prediction is just arithmetic. For a 1,200 sq ft house:
$$\hat{y} = 50 + 0.15 \times 1200 = 50 + 180 = 230$$
Predicted price: $230,000.
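The same arithmetic can be checked in a few lines of Python. This is a sketch that uses the intercept of 50 (thousand dollars) and the slope of 0.15 (thousand dollars per square foot, that is, $150) from the house-price example; the names `THETA_0`, `THETA_1`, and `predict_price` are hypothetical.

```python
THETA_0 = 50.0   # intercept, in thousands of dollars
THETA_1 = 0.15   # slope, in thousands of dollars per square foot


def predict_price(size_sqft: float) -> float:
    """Return the predicted price in thousands of dollars."""
    return THETA_0 + THETA_1 * size_sqft


price = predict_price(1200)
print(f"Predicted price: ${price * 1000:,.0f}")  # prints "Predicted price: $230,000"
```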
Residuals
No line fits noisy real-world data perfectly. The difference between the actual value $y^{(i)}$ and the predicted value $\hat{y}^{(i)}$ for training example $i$ is called the residual:
$$\text{residual}^{(i)} = y^{(i)} - \hat{y}^{(i)}$$
A positive residual means the model under-predicted; a negative residual means it over-predicted. The goal of training is to find parameters that make these residuals collectively as small as possible.
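To make the sign convention concrete, the sketch below computes residuals as actual minus predicted for the house-price line (intercept 50, slope 0.15). The three data points are made up for illustration and do not come from a real dataset.

```python
# Hypothetical training examples: (size in sq ft, actual price in thousands of dollars).
houses = [(1000, 210.0), (1200, 225.0), (1500, 280.0)]

theta0, theta1 = 50.0, 0.15  # parameters of the house-price example


def residual(x: float, y: float) -> float:
    """Return actual minus predicted for one training example."""
    return y - (theta0 + theta1 * x)


residuals = [residual(x, y) for x, y in houses]

for (size, actual), r in zip(houses, residuals):
    label = "under-predicted" if r > 0 else "over-predicted" if r < 0 else "exact"
    print(f"{size} sq ft: actual {actual:.1f}, residual {r:+.1f} ({label})")
```

With these made-up points, the first and third houses have positive residuals (the model under-predicts) and the second has a negative residual (the model over-predicts).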
Many lines are possible
For any dataset there are infinitely many lines you could draw. The question is: which line is best? Different answers to that question lead to different learning algorithms. The most common answer — minimize the sum of squared residuals — is called Ordinary Least Squares, covered in lesson 4.
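One way to see how the sum of squared residuals ranks candidate lines is to score two of them on the same data. The sketch below does only that comparison; it does not find the best line. The data points and both candidate lines are hypothetical.

```python
# Hypothetical training examples: (size in sq ft, actual price in thousands of dollars).
data = [(1000, 210.0), (1200, 225.0), (1500, 280.0)]


def sum_squared_residuals(points, theta0: float, theta1: float) -> float:
    """Return the sum over all examples of (actual - predicted) squared."""
    return sum((y - (theta0 + theta1 * x)) ** 2 for x, y in points)


# Two candidate lines, each given as (intercept, slope).
line_a = (50.0, 0.15)
line_b = (100.0, 0.10)

ssr_a = sum_squared_residuals(data, *line_a)
ssr_b = sum_squared_residuals(data, *line_b)

better = "A" if ssr_a < ssr_b else "B"
print(f"Line A: {ssr_a:.1f}, line B: {ssr_b:.1f}, lower score: line {better}")
```

On these points, line A has the lower score, so under the least-squares criterion it is the better fit of the two.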
Notation summary
| Symbol | Meaning |
|---|---|
| $x^{(i)}$ | Feature value of the $i$-th training example |
| $y^{(i)}$ | True target value of the $i$-th training example |
| $\hat{y}^{(i)}$ | Predicted value for the $i$-th example |
| $n$ | Number of training examples |
| $\theta_0$ | Intercept parameter |
| $\theta_1$ | Slope parameter |
This notation will be used throughout the rest of the course.