Vectors

What a vector is

A vector is an ordered list of numbers. In machine learning, vectors are the fundamental way to represent data points, features, weights, and predictions.

$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$

A vector in $\mathbb{R}^n$ has $n$ components and lives in $n$-dimensional space. A training example with 5 features is a vector in $\mathbb{R}^5$.
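
In code, a vector is just a one-dimensional array. A minimal sketch with NumPy (the feature values here are made up for illustration):

```python
import numpy as np

# A training example with 5 features (values are illustrative)
x = np.array([5.1, 3.5, 1.4, 0.2, 1.0])

print(x.shape)  # a vector in R^5 has shape (5,)
```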

Two interpretations

As a point: a location in $n$-dimensional space. The vector $(3, 2)$ is the point 3 units right and 2 units up from the origin.

As a direction and magnitude: an arrow from the origin to that point, with a direction and a length. This interpretation is more useful for operations like addition and scaling.

Both interpretations coexist and are useful in different contexts.

Vector operations

Addition: add component-wise. Adding two vectors combines their displacements — geometrically, you follow one arrow then the other.

$$\mathbf{u} + \mathbf{v} = \begin{pmatrix} u_1 + v_1 \\ u_2 + v_2 \end{pmatrix}$$

Scalar multiplication: multiply every component by a number $c$. Scales the length of the vector without changing its direction (unless $c < 0$, which flips the direction).

$$c\mathbf{v} = \begin{pmatrix} cv_1 \\ cv_2 \end{pmatrix}$$
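
Both operations are element-wise, which NumPy expresses directly. A quick sketch with arbitrary example vectors:

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])

# Addition is component-wise: follow one arrow, then the other
s = u + v        # [4.0, 1.0]

# Scalar multiplication scales every component
d = 2.0 * v      # [6.0, -2.0], same direction, twice the length
f = -1.0 * v     # [-3.0, 1.0], direction flipped
```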

Magnitude (norm)

The magnitude or L2 norm of a vector is its length:

$$\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$$

This is the Euclidean distance from the origin to the point $\mathbf{v}$.

A unit vector has magnitude 1. Dividing any non-zero vector by its magnitude normalizes it: $\hat{\mathbf{v}} = \frac{\mathbf{v}}{\|\mathbf{v}\|}$.
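
A short check of the norm and normalization, using the classic 3-4-5 triangle as the example:

```python
import numpy as np

v = np.array([3.0, 4.0])

# L2 norm: sqrt(3^2 + 4^2) = sqrt(25) = 5
norm = np.linalg.norm(v)

# Dividing by the norm yields a unit vector
v_hat = v / norm
print(np.linalg.norm(v_hat))  # magnitude 1
```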

Why vectors matter in ML

In ML, everything is a vector:

  • A training example with $n$ features is a vector $\mathbf{x} \in \mathbb{R}^n$.
  • The weights of a linear model are a vector $\mathbf{w} \in \mathbb{R}^n$.
  • A prediction is a vector (or a scalar, which you can treat as a vector in $\mathbb{R}^1$).
  • A word embedding maps a word to a vector in, say, $\mathbb{R}^{300}$.

The entire computation of a linear model, predicting $\hat{y} = \mathbf{w}^T \mathbf{x} + b$, is a vector operation. Training updates the weight vector. Regularization penalizes its magnitude.
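
That whole prediction is one dot product plus a bias. A sketch with made-up weights and features:

```python
import numpy as np

# Hypothetical weights and features for a linear model
w = np.array([0.5, -0.2, 1.0])
x = np.array([2.0, 1.0, 3.0])
b = 0.1

# w^T x + b: the @ operator computes the dot product
y_hat = w @ x + b
# 0.5*2 + (-0.2)*1 + 1.0*3 + 0.1 = 3.9
```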

Linear combinations and span

A linear combination of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$ is any sum $c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k$ for scalars $c_i$. The set of all possible linear combinations is the span of those vectors.

If you can reach any point in $\mathbb{R}^n$ by linear combinations of a set of $n$ vectors, those vectors span the full space. If they are redundant (one is a multiple of another), they only span a lower-dimensional subspace. This concept underlies matrix rank and linear dependence, covered in the rank lesson.
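
The span of a set of vectors can be probed numerically through matrix rank, foreshadowing the rank lesson. A sketch in $\mathbb{R}^2$ with example vectors:

```python
import numpy as np

# Two independent vectors: their span is all of R^2
a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])
rank_ab = np.linalg.matrix_rank(np.column_stack([a, b]))  # 2: full space

# A redundant pair (c = 2*b): their span is only a line through the origin
c = 2.0 * b
rank_bc = np.linalg.matrix_rank(np.column_stack([b, c]))  # 1: a 1-D subspace
```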