What Is Logistic Regression?
From regression to classification
In the linear regression course, the goal was to predict a continuous number — a house price, a temperature, a revenue figure. Many real-world problems instead ask for a category: is this email spam or not? Does this patient have the disease? Will this customer churn?
These are classification problems, and logistic regression is one of the most widely used algorithms for solving them.
Despite its name, logistic regression is a classification algorithm, not a regression algorithm. The name is historical: it builds on the same linear combination of features as linear regression, but passes the result through a function that squashes it into a probability.
Binary classification
The simplest case is binary classification: the target variable takes exactly two values, coded as 0 and 1.
- $y = 1$: the positive class (e.g. spam, disease present, will churn).
- $y = 0$: the negative class (e.g. not spam, disease absent, will not churn).
The goal of logistic regression is to estimate the probability that a given input $\mathbf{x}$ belongs to the positive class:

$$\hat{p} = P(y = 1 \mid \mathbf{x})$$
Once you have a probability, a prediction is made by applying a threshold (usually 0.5):

$$\hat{y} = \begin{cases} 1 & \text{if } \hat{p} \ge 0.5 \\ 0 & \text{otherwise} \end{cases}$$
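As a minimal sketch of the thresholding step (the probability values here are hypothetical model outputs, not from a trained model):

```python
# Turn predicted probabilities into hard class labels with a 0.5 threshold.
probs = [0.12, 0.48, 0.50, 0.91]               # hypothetical model outputs p_hat
preds = [1 if p >= 0.5 else 0 for p in probs]  # threshold at exactly 0.5
print(preds)  # [0, 0, 1, 1]
```

Note that the threshold is a free parameter: lowering it trades precision for recall, which matters when the two kinds of error have different costs.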
Why not use linear regression for classification?
It is tempting to apply linear regression directly: fit a line to 0/1 targets and threshold at 0.5. This fails for several reasons:
- Unbounded outputs. Linear regression can predict values far outside $[0, 1]$; probabilities below 0 or above 1 are meaningless.
- Poor fit. The relationship between features and a binary outcome is inherently non-linear; a straight line is a poor model.
- Violated assumptions. The residuals from fitting a line to 0/1 data are not normally distributed and exhibit severe heteroscedasticity.
Logistic regression solves this by replacing the linear output with a function that always produces a value between 0 and 1.
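The unbounded-output problem is easy to demonstrate. This sketch (synthetic data, arbitrary query point) fits a straight line to 0/1 targets by least squares and evaluates it away from the training range:

```python
import numpy as np

# Fit a straight line to binary 0/1 targets via least squares.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0, 0, 0, 1, 1, 1])          # binary outcome
slope, intercept = np.polyfit(x, y, deg=1)

# Query a point beyond the training range: the prediction escapes [0, 1],
# so it cannot be interpreted as a probability.
print(slope * 10.0 + intercept)
```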
The logistic regression model
Logistic regression computes a linear score $z$ (called the log-odds or logit) and passes it through the sigmoid function $\sigma$:

$$z = \mathbf{w}^\top \mathbf{x} + b, \qquad \hat{p} = \sigma(z) = \frac{1}{1 + e^{-z}}$$
The sigmoid function and its properties are the subject of the next lesson.
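The forward pass of the model is only a few lines. This sketch uses hypothetical weights and a single two-feature input (the values are illustrative, not trained):

```python
import math

def sigmoid(z: float) -> float:
    """Map any real-valued score to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical trained parameters and one input with two features.
w = [1.5, -0.8]
b = 0.2
x = [2.0, 1.0]

z = sum(wi * xi for wi, xi in zip(w, x)) + b  # linear score (the log-odds)
p_hat = sigmoid(z)                            # probability of the positive class
print(p_hat)
```

Whatever the score $z$, the output always lands strictly between 0 and 1, which is exactly the property linear regression lacks.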
Applications
Logistic regression is used across many domains:
| Domain | Example task |
|---|---|
| Email | Spam vs. not spam |
| Medicine | Disease present vs. absent |
| Finance | Loan default vs. repayment |
| Marketing | Customer churns vs. stays |
| NLP | Sentiment positive vs. negative |
Key advantages
- Interpretable: each coefficient has a clear meaning in terms of log-odds (covered in the Decision Boundary lesson).
- Probabilistic output: gives a calibrated probability, not just a hard label.
- Efficient: fast to train even on large datasets.
- Strong baseline: often hard to beat with more complex models on tabular data.
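To make the "efficient" and "strong baseline" claims concrete, the entire model can be trained from scratch with plain batch gradient descent. This is a sketch, not a production implementation: the dataset is synthetic, and the learning rate and iteration count are arbitrary choices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Tiny synthetic dataset: one feature, positive class when x is large.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (3.0, 1), (3.5, 1), (4.0, 1)]

w, b = 0.0, 0.0
lr = 0.5                                  # hypothetical learning rate
for _ in range(2000):                     # plain batch gradient descent
    grad_w = grad_b = 0.0
    for x, y in data:
        err = sigmoid(w * x + b) - y      # gradient of the log-loss per example
        grad_w += err * x
        grad_b += err
    w -= lr * grad_w / len(data)
    b -= lr * grad_b / len(data)

# Probabilities at a clearly negative and a clearly positive point.
print(sigmoid(w * 0.5 + b), sigmoid(w * 4.0 + b))
```

In practice you would use a library implementation (e.g. scikit-learn's `LogisticRegression`), which adds regularization and faster solvers, but the underlying model is no more than this.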
Key terms
- Binary classification: predicting one of two classes (0 or 1).
- Positive class ($y = 1$): the class the model is trained to detect.
- Predicted probability ($\hat{p}$): the model's estimate of $P(y = 1 \mid \mathbf{x})$.
- Threshold: the cutoff applied to $\hat{p}$ to produce a hard class prediction.
- Log-odds / logit: the linear score $z = \mathbf{w}^\top \mathbf{x} + b$ before the sigmoid is applied.