Chapter 3

Eigenvalues & Eigenvectors

When a matrix transforms space, most vectors get knocked completely off their original direction. But certain special vectors — the eigenvectors — are so aligned with the transformation that they only get scaled, never rotated. These directions are the skeleton of the transformation, and they appear everywhere: in Google's search algorithm, in quantum physics, in data science.

Eigenvectors

Suppose you apply a matrix $A$ to many different vectors and watch what happens. Most tumble off their original line — they get rotated to some new direction. But occasionally you find a vector $\mathbf{v}$ that, after being multiplied by $A$, still points in the same direction as before. It might be longer or shorter, or even flipped, but it lands back on the same line through the origin. Such a vector is called an eigenvector of $A$.

Precisely: a nonzero vector $\mathbf{v}$ is an eigenvector of $A$ with eigenvalue $\lambda$ if:

$$A\mathbf{v} = \lambda\mathbf{v}$$

The left side applies the matrix. The right side scales $\mathbf{v}$ by $\lambda$. The equation says these two operations produce identical results — so $A$ acts on $\mathbf{v}$ exactly like multiplication by a number.

  • $\lambda > 1$: the eigenvector is stretched away from the origin
  • $0 < \lambda < 1$: it is shrunk toward the origin
  • $\lambda < 0$: it is flipped to point in the opposite direction (and scaled by $|\lambda|$)
  • $\lambda = 0$: it is collapsed to zero — the matrix is singular
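
To see the defining equation in action, here is a minimal numerical check with NumPy (the matrix and eigenvector are illustrative choices):

```python
import numpy as np

# An illustrative 2x2 matrix.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# v = (1, 1) happens to be an eigenvector of this A with eigenvalue 3:
# multiplying by A scales v by 3 without changing its direction.
v = np.array([1.0, 1.0])

print(A @ v)   # [3. 3.]
print(3 * v)   # [3. 3.]  same vector, so A v = 3 v
assert np.allclose(A @ v, 3 * v)
```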

Geometric Picture

Imagine a transformation that stretches, squishes, and rotates the entire plane. Most arrows tumble off their original direction. The eigenvectors are the directions the transformation respects — they define invariant lines through the origin, and on each line the transformation simply acts as multiplication. They are the natural axes of the transformation.

Characteristic Polynomial

How do we find the eigenvalues? Start from $A\mathbf{v} = \lambda\mathbf{v}$ and rearrange. Since $\lambda\mathbf{v} = \lambda I\mathbf{v}$, we get:

$$(A - \lambda I)\mathbf{v} = \mathbf{0}$$

We need a nonzero solution $\mathbf{v}$. A matrix equation $M\mathbf{v} = \mathbf{0}$ has a nonzero solution exactly when $M$ is singular — when $\det(M) = 0$. So we solve the characteristic equation:

$$\det(A - \lambda I) = 0$$

For a $2\times2$ matrix, expanding gives a quadratic in $\lambda$ — the characteristic polynomial. Its roots are the eigenvalues. For each eigenvalue, substitute back and solve $(A - \lambda I)\mathbf{v} = \mathbf{0}$ to find the corresponding eigenvectors.
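
For example, take the (illustrative) matrix $A = \begin{pmatrix}2 & 1 \\ 1 & 2\end{pmatrix}$. Expanding the determinant gives

$$\det(A - \lambda I) = \det\begin{pmatrix}2-\lambda & 1 \\ 1 & 2-\lambda\end{pmatrix} = (2-\lambda)^2 - 1 = (\lambda - 1)(\lambda - 3)$$

so the eigenvalues are $\lambda_1 = 1$ and $\lambda_2 = 3$. Substituting $\lambda = 3$ into $(A - 3I)\mathbf{v} = \mathbf{0}$ forces $v_1 = v_2$, giving the eigenvector $(1, 1)$; substituting $\lambda = 1$ gives $(1, -1)$.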

Two elegant shortcuts let you read off information immediately:

$$\lambda_1 + \lambda_2 = \operatorname{tr}(A) = a_{11} + a_{22}, \qquad \lambda_1 \cdot \lambda_2 = \det(A)$$

Trace & Determinant

The trace (sum of diagonal entries) equals the sum of the eigenvalues; the determinant equals their product. These give you the eigenvalue structure at a glance. If $\det(A) = 0$, one eigenvalue is zero and the matrix is singular. If $\operatorname{tr}(A) = 0$ and $\det(A) > 0$, the eigenvalues sum to zero yet multiply to a positive number, which no pair of real numbers can do: they must be a purely imaginary conjugate pair $\pm i\sqrt{\det(A)}$, and the transformation involves rotation. (A real pair $\pm\lambda$ would give $\det(A) = -\lambda^2 < 0$.)
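
Both identities are easy to confirm numerically. A quick sketch, reusing the illustrative matrix from the worked example and adding a 90-degree rotation for the trace-zero case:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals = np.linalg.eigvals(A)                    # 3 and 1 (order may vary)

# Sum of eigenvalues = trace; product of eigenvalues = determinant.
assert np.isclose(eigvals.sum(), np.trace(A))         # 1 + 3 == 2 + 2
assert np.isclose(eigvals.prod(), np.linalg.det(A))   # 1 * 3 == 3

# Trace-zero, positive-determinant case: a 90-degree rotation.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])
print(np.linalg.eigvals(R))   # [0.+1.j 0.-1.j]: purely imaginary pair
```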

Diagonalization

If a matrix $A$ has $n$ linearly independent eigenvectors, something remarkable becomes possible. Arrange the eigenvectors as the columns of a matrix $P$, and put the corresponding eigenvalues, in the same order, on the diagonal of a matrix $D$. Then:

$$A = PDP^{-1}$$

This is diagonalization. The matrix $P$ is a change-of-basis — it converts coordinates into the eigenvector basis. In that basis, $A$ acts as pure scaling along each axis with no mixing between directions. The matrix $D$ captures exactly that scaling.
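
A short NumPy sketch of the factorization (illustrative matrix; `np.linalg.eig` returns the eigenvectors as the columns of its second output, which is exactly the $P$ we need):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eig returns the eigenvalues and a matrix whose columns are eigenvectors.
eigvals, P = np.linalg.eig(A)
D = np.diag(eigvals)

# Reassembling P D P^{-1} recovers A.
assert np.allclose(P @ D @ np.linalg.inv(P), A)
```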

The practical payoff is enormous. Computing $A^{100}$ naïvely requires 99 matrix multiplications. Diagonalized, it takes just two, because all the hard work happens entrywise inside $D$:

$$A^k = PD^kP^{-1}, \qquad D^k = \begin{pmatrix}\lambda_1^k & 0 \\ 0 & \lambda_2^k\end{pmatrix}$$

Raising a diagonal matrix to a power just raises each diagonal entry — a one-step computation regardless of $k$.
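
A sketch of the speedup, computed both ways for comparison (same illustrative matrix; `np.linalg.matrix_power` serves as the reference):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, P = np.linalg.eig(A)

k = 100
Dk = np.diag(eigvals ** k)          # D^k is entrywise: power each eigenvalue
A_k = P @ Dk @ np.linalg.inv(P)     # A^k = P D^k P^{-1}

assert np.allclose(A_k, np.linalg.matrix_power(A, k))
```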

Why It's Everywhere

PCA finds eigenvectors of a covariance matrix to identify the directions of greatest variance in data — the axes of an ellipse fit to a point cloud. Google's PageRank is the dominant eigenvector of the web's link graph. Quantum mechanics represents observable quantities as matrices; the eigenvalues are the only possible measurement outcomes. Differential equations of the form $\dot{\mathbf{x}} = A\mathbf{x}$ are solved by decomposing into eigenvectors. Wherever dynamics, data, or geometry meet, eigendecomposition is there.
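
As one concrete instance, the dominant eigenvector that PageRank relies on can be approximated by power iteration: repeatedly apply the matrix and renormalize, and the iterate settles onto the eigenvector of largest $|\lambda|$. A minimal sketch, where `power_iteration` is an illustrative helper and the matrix is a toy stand-in for a link graph, not real web data:

```python
import numpy as np

def power_iteration(A, iters=100):
    """Approximate the dominant eigenvector of A by repeated application."""
    v = np.random.default_rng(0).random(A.shape[0])
    for _ in range(iters):
        v = A @ v
        v /= np.linalg.norm(v)   # renormalize so v neither blows up nor vanishes
    return v

# Toy column-stochastic "link" matrix: three pages, each linking to the others.
A = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])

print(power_iteration(A))   # approaches (1,1,1)/sqrt(3), the dominant eigenvector
```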