Essence of Linear Algebra (8): Symmetric Matrices and Quadratic Forms
Chen Kai

Symmetric matrices are the "nicest" matrices in linear algebra — they have real eigenvalues, orthogonal eigenvectors, and can be perfectly diagonalized. Understanding symmetric matrices is the key to mastering principal component analysis, optimization theory, vibration analysis in physics, and many other fields.

Introduction: The Power of Symmetry

In mathematics and physics, symmetry often heralds beautiful properties. Think of the six-fold symmetry of snowflakes, the mirror symmetry of butterfly wings — these symmetries bring not only visual beauty but also profound physical laws. Linear algebra is no exception: symmetric matrices possess the most elegant properties.

Imagine you're analyzing the energy of a physical system, or optimizing an objective function in machine learning. You'll find that these problems ultimately reduce to analyzing symmetric matrices. Why? Because:

  • Energy is always expressed as a quadratic form: $E = \frac{1}{2} x^T A x$
  • Covariance matrices are always symmetric: $\Sigma^T = \Sigma$
  • Hessian matrices (second derivatives) are always symmetric: $H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j} = H_{ji}$

Life Analogy: A symmetric matrix is like a mirror — what you see from the left is the same as what you see from the right. This "symmetry" brings tremendous mathematical convenience: simpler computations, more elegant properties, and broader applications.

Basic Properties of Symmetric Matrices

Definition and Intuition

A matrix $A$ is symmetric if: $A^T = A$. That is: $a_{ij} = a_{ji}$ for all $i, j$.

Geometric Meaning: The linear transformation corresponding to a symmetric matrix is "uniform" in some sense — it stretches or compresses along principal axis directions without twisting the space. It's like uniformly stretching a piece of clay with both hands rather than twisting it.

Life Example: Spring System

Imagine two masses connected by three springs. If you push the first mass, it affects the second mass through the springs; conversely, the second mass affects the first in the same way. This "symmetry of mutual interaction" is the essence of symmetric matrices.

Example: Covariance Matrix. Since $\operatorname{Cov}(X_i, X_j) = \operatorname{Cov}(X_j, X_i)$, the covariance matrix $\Sigma$ is symmetric. This tells us: the influence of $X_i$ on $X_j$ equals the influence of $X_j$ on $X_i$.

Special Properties of Symmetric Matrices

Theorem 1: The eigenvalues of a symmetric matrix are all real numbers.

This is a very important property. Eigenvalues of general matrices can be complex (like rotation matrices), but symmetric matrices guarantee all eigenvalues are real.

Proof:

Let $\lambda$ be an eigenvalue and $x$ the corresponding eigenvector (possibly a complex vector): $Ax = \lambda x$. Take the conjugate transpose: $\bar{x}^T A^T = \bar{\lambda}\, \bar{x}^T$. Since $A^T = A$: $\bar{x}^T A = \bar{\lambda}\, \bar{x}^T$. Multiply the second equation by $x$ on the right, and multiply the first equation by $\bar{x}^T$ on the left: $\bar{x}^T A x = \bar{\lambda}\, \bar{x}^T x$ and $\bar{x}^T A x = \lambda\, \bar{x}^T x$. Since $\bar{x}^T x = \|x\|^2 > 0$, we have $\lambda = \bar{\lambda}$, meaning $\lambda$ is real.

Intuitive Explanation: Complex eigenvalues indicate rotation, but symmetric matrices only stretch/compress without rotating. No rotation = no imaginary part.

Theorem 2: Eigenvectors corresponding to different eigenvalues of a symmetric matrix are mutually orthogonal.

Proof:

Let $\lambda_1 \neq \lambda_2$ be two different eigenvalues and $x_1, x_2$ corresponding eigenvectors. Compute: $x_1^T A x_2 = \lambda_2\, x_1^T x_2$. On the other hand, using $A^T = A$: $x_1^T A x_2 = (A x_1)^T x_2 = \lambda_1\, x_1^T x_2$. Therefore: $(\lambda_1 - \lambda_2)\, x_1^T x_2 = 0$. Since $\lambda_1 \neq \lambda_2$, we must have $x_1^T x_2 = 0$.

Life Analogy: Imagine a football field where you can run along the length (eigenvector 1) or along the width (eigenvector 2). These two directions are perpendicular and don't interfere with each other — this is the orthogonality of eigenvectors.

Spectral Theorem

This is the most important theorem about symmetric matrices:

Spectral Theorem: Any real symmetric matrix $A$ can be orthogonally diagonalized: $A = Q \Lambda Q^T$. Where: - $Q$ is an orthogonal matrix ($Q^T Q = I$), with unit eigenvectors as columns - $\Lambda$ is a diagonal matrix, with eigenvalues on the diagonal

Significance:
  1. In the basis of eigenvectors, a symmetric matrix appears as a simple diagonal matrix
  2. Any symmetric matrix can be viewed as stretching/compressing along orthogonal directions
  3. This is the theoretical foundation for Principal Component Analysis (PCA), spectral clustering, and other algorithms

Life Analogy: Imagine you need to move a piece of furniture of unusual shape through a doorframe. The spectral theorem tells you: just find the furniture's "principal axis" directions, and the problem becomes simple — along these directions, the furniture's dimensions are the simplest.

Corollary (Spectral Decomposition): A symmetric matrix can be written as a sum of outer products of eigenvalues and eigenvectors: $A = \sum_{i=1}^{n} \lambda_i\, q_i q_i^T$. This is called spectral decomposition.

Why "spectral"? In physics, a spectrum decomposes white light into different colors (frequencies). Similarly, spectral decomposition decomposes a matrix into different "frequencies" (eigenvalues). Each eigenvalue corresponds to a "pure" direction (eigenvector), just as each color corresponds to a pure light wave.
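Both identities are easy to check numerically. A minimal numpy sketch (the matrix here is an arbitrary symmetric example, not one from the text):

```python
import numpy as np

# An arbitrary symmetric example matrix.
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

# np.linalg.eigh is specialized for symmetric matrices: it returns real
# eigenvalues in ascending order and orthonormal eigenvectors as columns.
eigvals, Q = np.linalg.eigh(A)
Lam = np.diag(eigvals)

# Spectral theorem: A = Q Lambda Q^T, with Q orthogonal.
assert np.allclose(Q @ Lam @ Q.T, A)
assert np.allclose(Q.T @ Q, np.eye(2))

# Spectral decomposition: A as a sum of rank-1 outer products.
A_rebuilt = sum(lam * np.outer(q, q) for lam, q in zip(eigvals, Q.T))
assert np.allclose(A_rebuilt, A)
print(eigvals)  # [2. 4.]
```

Unlike the general-purpose `np.linalg.eig`, `eigh` exploits symmetry, so the returned eigenvalues are guaranteed real and the eigenvector matrix is orthogonal up to rounding.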

Quadratic Forms

Definition of Quadratic Forms

A quadratic form is a homogeneous polynomial of degree 2 in $n$ variables: $Q(x) = \sum_{i,j} a_{ij} x_i x_j$. It can be expressed in matrix form: $Q(x) = x^T A x$, where $A$ is a symmetric matrix (we can always take it to be symmetric, since $x^T A x = x^T \frac{A + A^T}{2} x$).

Life Example: Energy

In physics, elastic potential energy is a typical quadratic form: $E = \frac{1}{2} k x^2$. For a system with two degrees of freedom: $E = \frac{1}{2}\left(k_1 x_1^2 + k_2 x_2^2\right) + k_{12}\, x_1 x_2$. This is a quadratic form! Here $k_{12}$ represents coupling between the two degrees of freedom.

Example 1: Binary quadratic form $Q(x, y) = x^2 + 4xy + 3y^2$. Corresponding matrix: $A = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}$. Note: The cross-term coefficient 4 is split into two 2s to ensure the matrix is symmetric.

Verification: $\begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = x^2 + 4xy + 3y^2$
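The matrix form of a quadratic form can be checked numerically. A small sketch (assuming the example's polynomial is $x^2 + 4xy + 3y^2$ with the cross term split symmetrically):

```python
import numpy as np

# Quadratic form q(x, y) = x^2 + 4xy + 3y^2 (assumed example), with the
# cross-term coefficient 4 split into two 2s so the matrix is symmetric.
A = np.array([[1.0, 2.0],
              [2.0, 3.0]])

def q_matrix(v):
    """Evaluate the quadratic form as v^T A v."""
    return v @ A @ v

def q_poly(v):
    """Evaluate the same polynomial directly."""
    x, y = v
    return x**2 + 4*x*y + 3*y**2

# The two evaluations agree on arbitrary points.
for v in [np.array([1.0, 0.0]), np.array([2.0, -1.0]), np.array([0.5, 3.0])]:
    assert np.isclose(q_matrix(v), q_poly(v))
print(q_matrix(np.array([2.0, -1.0])))  # 4 - 8 + 3 = -1.0
```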

Geometric Meaning of Quadratic Forms

The geometric meaning of the quadratic form $Q(x) = x^T A x$ depends on the eigenvalues of $A$:

Case 1: $A$ is positive definite (all $\lambda_i > 0$) - Quadric surface: Ellipsoid - Level curves: Concentric ellipses - Minimum at origin ($Q(0) = 0$)

Life Analogy: This is like the shape of a bowl — no matter which direction you go from the center, you go upward. A ball placed at the bottom of the bowl will stably stay there.

Case 2: $A$ is negative definite (all $\lambda_i < 0$) - Quadric surface: Downward-opening bowl - Maximum at origin

Life Analogy: This is like an upside-down bowl — a ball would roll off from the top.

Case 3: $A$ is indefinite (some positive, some negative eigenvalues) - Quadric surface: Saddle surface (saddle shape) - Origin is a saddle point

Life Analogy: This is like a horse saddle — curved upward along the horse's back and curved downward along its belly. When riding, you sit at the saddle point, which is neither the highest nor lowest point.

Case 4: $A$ is positive semidefinite ($\lambda_i \ge 0$, at least one is 0) - Degenerate case: parabolic cylinder, etc. - "Flat" along some direction

Standardization of Quadratic Forms

Goal: Through coordinate transformation, convert the quadratic form to standard form (containing only squared terms).

Principal Axis Theorem: For the quadratic form $x^T A x$, there exists an orthogonal transformation $x = Qy$ such that: $x^T A x = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2$

Steps:
  1. Find the eigenvalues $\lambda_1, \ldots, \lambda_n$ of $A$
  2. Find the corresponding orthonormal eigenvectors $q_1, \ldots, q_n$
  3. Form $Q = [q_1, \ldots, q_n]$ and substitute $x = Qy$

Geometric Meaning: The new coordinates $y_i$ measure position along the principal axis directions of the quadric surface.

Detailed Example:

Consider the quadratic form $f(x, y) = 5x^2 + 4xy + 2y^2$.

Step 1: Write the matrix $A = \begin{pmatrix} 5 & 2 \\ 2 & 2 \end{pmatrix}$

Step 2: Find eigenvalues. $\det(A - \lambda I) = (5-\lambda)(2-\lambda) - 4 = \lambda^2 - 7\lambda + 6 = 0$. Eigenvalues: $\lambda_1 = 6$, $\lambda_2 = 1$.

Step 3: Find eigenvectors.

For $\lambda_1 = 6$: $(A - 6I)v = 0$ gives $v_1 = \frac{1}{\sqrt{5}}\begin{pmatrix} 2 \\ 1 \end{pmatrix}$. For $\lambda_2 = 1$: $(A - I)v = 0$ gives $v_2 = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 \\ -2 \end{pmatrix}$.

Step 4: Orthogonal transformation. Let $x = Qy$ with $Q = [v_1, v_2]$, then: $f = 6y_1^2 + y_2^2$. This is the standard form! In the new coordinate system, the level set $f = 6$ becomes the ellipse $y_1^2 + \frac{y_2^2}{6} = 1$.
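The four steps can be checked with numpy's symmetric eigensolver. A sketch using an assumed sample matrix (not necessarily the one from the worked example):

```python
import numpy as np

# Assumed sample matrix for the standardization procedure.
A = np.array([[5.0, 2.0],
              [2.0, 2.0]])

# Steps 1-3: eigenvalues and orthonormal eigenvectors (columns of Q).
eigvals, Q = np.linalg.eigh(A)          # ascending order: [1., 6.]

# Step 4: in the coordinates y = Q^T x, only squared terms remain:
# x^T A x = eigvals[0]*y1^2 + eigvals[1]*y2^2.
rng = np.random.default_rng(0)
x = rng.standard_normal(2)
y = Q.T @ x
assert np.isclose(x @ A @ x, np.sum(eigvals * y**2))
print(eigvals)  # [1. 6.]
```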

Classification of Quadratic Forms

Based on the signs of the eigenvalues of the matrix $A$, quadratic forms are classified as:

| Eigenvalue Signs | Quadratic Form Name | Standard Form | Geometric Shape |
| --- | --- | --- | --- |
| All positive | Positive definite | $\sum_i \lambda_i y_i^2$, all $\lambda_i > 0$ | Ellipsoid |
| All negative | Negative definite | $\sum_i \lambda_i y_i^2$, all $\lambda_i < 0$ | Downward-opening ellipsoid |
| Mixed positive and negative | Indefinite | Mixed-sign squared terms | Saddle surface |
| Non-negative, at least one 0 | Positive semidefinite | Some $\lambda_i = 0$ | Degenerate ellipsoid |

Positive Definite Matrices

Definition and Criteria

A symmetric matrix $A$ is positive definite if for all nonzero vectors $x$: $x^T A x > 0$.

Geometric Meaning: The quadratic form $x^T A x$ is always positive (except at the origin).

Related Concepts: - Positive semidefinite: $x^T A x \ge 0$ for all $x$ - Negative definite: $x^T A x < 0$ for all $x \neq 0$ - Indefinite: Takes both positive and negative values

Life Analogy: - Positive definite: Bowl bottom — energy increases in any direction away from origin - Negative definite: Hilltop — energy decreases in any direction away from origin - Indefinite: Saddle — increases in some directions, decreases in others - Positive semidefinite: Flat valley bottom in one direction

Methods for Determining Positive Definiteness

There are several equivalent criteria:

Method 1: Eigenvalue Test. $A$ is positive definite $\iff$ all eigenvalues $\lambda_i > 0$.

Proof:

From the spectral theorem: $x^T A x = x^T Q \Lambda Q^T x = y^T \Lambda y = \sum_i \lambda_i y_i^2$, where $y = Q^T x$. Since $Q$ is orthogonal, $x \neq 0$ implies $y \neq 0$. This sum is $> 0$ for every such $y$ if and only if all $\lambda_i > 0$.

Method 2: Leading Principal Minor Test (Sylvester's Criterion). $A$ is positive definite $\iff$ all leading principal minors $D_k > 0$. Leading principal minors are determinants of upper-left $k \times k$ submatrices: $D_1 = a_{11}$, $D_2 = \det\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$, ..., $D_n = \det A$.

Example: $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$. $D_1 = 2 > 0$, $D_2 = 4 - 1 = 3 > 0$. So $A$ is positive definite.

Intuition: Sylvester's criterion progressively checks — first the 1D case, then 2D... each step requires "positive energy."

Method 3: Cholesky Decomposition Test. $A$ is positive definite $\iff$ $A$ has a Cholesky decomposition $A = LL^T$ ($L$ is lower triangular with positive diagonal).
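The first three tests can each be sketched in a few lines of numpy (the two test matrices below are illustrative assumptions):

```python
import numpy as np

def is_posdef_eig(A):
    """Eigenvalue test: all eigenvalues strictly positive."""
    return bool(np.all(np.linalg.eigvalsh(A) > 0))

def is_posdef_sylvester(A):
    """Sylvester's criterion: all leading principal minors positive."""
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

def is_posdef_cholesky(A):
    """Cholesky test: np.linalg.cholesky raises LinAlgError
    when A is not positive definite."""
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

A_pd  = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 1, 3
A_ind = np.array([[1.0, 2.0], [2.0, 1.0]])   # eigenvalues -1, 3

for check in (is_posdef_eig, is_posdef_sylvester, is_posdef_cholesky):
    assert check(A_pd)
    assert not check(A_ind)
```

In floating-point practice the Cholesky attempt is the usual choice: it is the cheapest of the three and fails cleanly on indefinite input.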

Method 4: Energy Test

In physics and engineering, positive definite matrices correspond to positive-energy systems. If $K$ represents a system's stiffness matrix, $K$ being positive definite means the system is stable.

Life Example: Structural Stability of Buildings

Imagine a bridge. Engineers need to ensure the bridge's stiffness matrix is positive definite — this means under any external force, the bridge produces positive elastic potential energy rather than "collapsing." If the stiffness matrix is not positive definite, the bridge might become unstable in some direction.

Properties of Positive Definite Matrices

  1. Invertible: Positive definite matrices are always invertible (all eigenvalues $\lambda_i > 0$, so determinant $\det A = \prod_i \lambda_i > 0$)

  2. Positive diagonal entries: $a_{ii} > 0$ (take $x = e_i$, get $e_i^T A e_i = a_{ii} > 0$)

  3. Positive determinant: $\det A > 0$

  4. Inverse is also positive definite: If $A$ is positive definite, then $A^{-1}$ is also positive definite

  5. Sum preserves positive definiteness: If $A, B$ are positive definite, then $A + B$ is positive definite

  6. Product not necessarily positive definite: $AB$ may not be positive definite (unless $AB = BA$), but $B^T A B$ is positive definite for invertible $B$

  7. Square root exists: Positive definite matrices have a unique positive definite square root $A^{1/2}$

Principal Axis Theorem and Applications

Principal Axis Theorem

Theorem: For the quadratic form $x^T A x$ ($A$ symmetric), there exists an orthogonal transformation $x = Qy$ converting it to standard form: $x^T A x = \lambda_1 y_1^2 + \cdots + \lambda_n y_n^2$. Where: - $\lambda_i$ are the eigenvalues of $A$ - Columns of $Q$ are corresponding unit eigenvectors (principal axis directions) - $y = Q^T x$ are the new coordinates

Geometric Interpretation: - Original coordinates: Quadric surface may be a tilted ellipsoid - Principal axis coordinates: Ellipsoid axes align with coordinate axes - Principal axis lengths determined by $1/\sqrt{\lambda_i}$ (for the level set $x^T A x = 1$)

Application 1: Determining Quadric Curve/Surface Type

Problem: What curve is $3x^2 + 2xy + 3y^2 = 8$?

Solution:

  1. Matrix: $A = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}$. Eigenvalues: $\det(A - \lambda I) = (3-\lambda)^2 - 1 = 0$, so $\lambda_1 = 4$, $\lambda_2 = 2$ (both positive)

  2. Conclusion: Positive definite, represents an ellipse

  3. Standard form: In principal axis coordinates: $4y_1^2 + 2y_2^2 = 8$. That is: $\frac{y_1^2}{2} + \frac{y_2^2}{4} = 1$. This is the standard ellipse equation with semi-major axis $2$ and semi-minor axis $\sqrt{2}$.

Application 2: Rayleigh Quotient and Optimization

Problem: Find the extrema of $x^T A x$ subject to $\|x\| = 1$.

Solution: Using Lagrange multipliers: $\nabla(x^T A x) = \lambda\, \nabla(x^T x)$, i.e. $2Ax = 2\lambda x$. That is: $Ax = \lambda x$. This is exactly the eigenvalue problem!

Conclusion: - Maximum is the largest eigenvalue $\lambda_{\max}$ - Minimum is the smallest eigenvalue $\lambda_{\min}$ - Extremal points are the corresponding eigenvectors

This is the extremal property of the Rayleigh quotient: $\lambda_{\min} \le \dfrac{x^T A x}{x^T x} \le \lambda_{\max}$
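The Rayleigh-quotient conclusion is easy to verify numerically; a sketch with an arbitrary symmetric example matrix:

```python
import numpy as np

# Arbitrary symmetric example; eigenvalues are 2 and 4.
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

def rayleigh(A, x):
    """Rayleigh quotient R(x) = (x^T A x) / (x^T x)."""
    return (x @ A @ x) / (x @ x)

eigvals, Q = np.linalg.eigh(A)   # ascending order

# Extrema are attained at the eigenvectors...
assert np.isclose(rayleigh(A, Q[:, 0]), eigvals[0])    # minimum: 2
assert np.isclose(rayleigh(A, Q[:, -1]), eigvals[-1])  # maximum: 4

# ...and the quotient is sandwiched between them for every nonzero x.
rng = np.random.default_rng(1)
for _ in range(1000):
    x = rng.standard_normal(2)
    assert eigvals[0] - 1e-12 <= rayleigh(A, x) <= eigvals[-1] + 1e-12
```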

Life Analogy: Imagine an ellipsoid-shaped mountain. You're standing on it and want to know the steepest direction from the peak. The answer is along the ellipsoid's shortest axis — this is the eigenvector direction corresponding to the largest eigenvalue.

Application 3: Covariance Matrix and PCA

In statistics, the covariance matrix $\Sigma$ is a symmetric positive semidefinite matrix. Its eigenvalues and eigenvectors give:

  • Eigenvalues: Variance of principal components (spread of data in that direction)
  • Eigenvectors: Principal component directions (directions of maximum data variation)

Principal Component Analysis (PCA) is just spectral decomposition of the covariance matrix!

Life Example: Face Recognition

Suppose you have a set of face images, each flattened into a very high-dimensional vector (one dimension per pixel). Processing such high-dimensional data directly is difficult. PCA helps you find directions of maximum face variation ("eigenfaces"), keeping only the first few principal components for effective dimensionality reduction while preserving most information.
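The covariance-eigendecomposition view of PCA can be sketched on synthetic 2D data (all data and parameters below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated 2D data: the second feature roughly tracks the first.
x = rng.standard_normal(500)
y = 0.8 * x + 0.3 * rng.standard_normal(500)
X = np.column_stack([x, y])
X = X - X.mean(axis=0)            # center the data

# PCA = spectral decomposition of the covariance matrix.
Sigma = np.cov(X.T)
eigvals, Q = np.linalg.eigh(Sigma)   # ascending order

# The last eigenvector is the first principal component (largest variance).
pc1 = Q[:, -1]
scores = X @ pc1                     # 1D projection of the data

# Variance along pc1 equals the largest eigenvalue of Sigma.
assert np.isclose(scores.var(ddof=1), eigvals[-1])
```

Note the ordering quirk: `eigh` returns eigenvalues ascending, so the *last* column is the top principal component; PCA libraries usually re-sort them descending.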

Cholesky Decomposition

Definition

For a positive definite matrix $A$, the Cholesky decomposition is: $A = LL^T$, where $L$ is a lower triangular matrix with positive diagonal entries.

Uniqueness: If $A$ is positive definite, the Cholesky decomposition exists and is unique.

Difference from Spectral Decomposition: - Spectral decomposition: $A = Q \Lambda Q^T$ ($Q$ orthogonal) - Cholesky: $A = LL^T$ ($L$ lower triangular)

Cholesky decomposition is faster (about $n^3/3$ flops vs $2n^3/3$ for LU) and more suitable for numerical computation.

Intuition: Cholesky decomposition is like "taking the square root" of a positive definite matrix — just as $a = (\sqrt{a})^2$ for $a > 0$, a positive definite matrix satisfies $A = LL^T$.

Computation Method

For a $3 \times 3$ matrix: $\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = \begin{pmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{pmatrix} \begin{pmatrix} l_{11} & l_{21} & l_{31} \\ 0 & l_{22} & l_{32} \\ 0 & 0 & l_{33} \end{pmatrix}$. Expanding gives equations that can be solved entry by entry.

General Formula (computed row by row): $l_{jj} = \sqrt{a_{jj} - \sum_{k=1}^{j-1} l_{jk}^2}$, and $l_{ij} = \dfrac{1}{l_{jj}}\left(a_{ij} - \sum_{k=1}^{j-1} l_{ik} l_{jk}\right)$ for $i > j$.

Detailed Example:

Compute the Cholesky decomposition of $A = \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix}$: $l_{11} = \sqrt{4} = 2$, $l_{21} = 2/l_{11} = 1$, $l_{22} = \sqrt{3 - l_{21}^2} = \sqrt{2}$. So $L = \begin{pmatrix} 2 & 0 \\ 1 & \sqrt{2} \end{pmatrix}$.

Verification: $LL^T = \begin{pmatrix} 2 & 0 \\ 1 & \sqrt{2} \end{pmatrix}\begin{pmatrix} 2 & 1 \\ 0 & \sqrt{2} \end{pmatrix} = \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix} = A$ ✓
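The row-by-row formula translates directly into code. A minimal sketch (the $2 \times 2$ test matrix is an assumed example), cross-checked against numpy's built-in routine:

```python
import numpy as np

def cholesky(A):
    """Row-by-row Cholesky: returns lower triangular L with positive
    diagonal such that A = L @ L.T. A must be symmetric positive definite."""
    n = A.shape[0]
    L = np.zeros_like(A, dtype=float)
    for i in range(n):
        for j in range(i + 1):
            s = A[i, j] - L[i, :j] @ L[j, :j]
            if i == j:
                L[i, j] = np.sqrt(s)   # s < 0 here means A is not pos. def.
            else:
                L[i, j] = s / L[j, j]
    return L

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])
L = cholesky(A)
assert np.allclose(L @ L.T, A)
assert np.allclose(L, np.linalg.cholesky(A))
print(L)   # [[2, 0], [1, sqrt(2)]]
```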

Applications

1. Solving Linear Systems

Solving $Ax = b$ with $A$ positive definite:

Step 1: Cholesky decomposition $A = LL^T$. Step 2: Forward substitution to solve $Ly = b$. Step 3: Backward substitution to solve $L^T x = y$. About 2x faster than Gaussian elimination! And more numerically stable.
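The three steps can be sketched with explicit forward/backward substitution (the system below is an assumed example):

```python
import numpy as np

def solve_posdef(A, b):
    """Solve A x = b for symmetric positive definite A via Cholesky:
    A = L L^T, then forward-solve L y = b, back-solve L^T x = y."""
    L = np.linalg.cholesky(A)
    n = len(b)
    y = np.zeros(n)
    for i in range(n):                      # forward substitution
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):          # backward substitution on L^T
        x[i] = (y[i] - L[i+1:, i] @ x[i+1:]) / L[i, i]
    return x

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])
b = np.array([2.0, 5.0])
x = solve_posdef(A, b)
assert np.allclose(A @ x, b)
print(x)   # [-0.5  2. ]
```

In production code one would use `scipy.linalg.cho_factor`/`cho_solve`, which do the same thing with optimized BLAS routines.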

2. Testing Positive Definiteness

If the Cholesky decomposition succeeds (no negative numbers under the square root), then $A$ is positive definite. In numerical computing, this is the most reliable way to test positive definiteness.

3. Generating Normal Random Vectors

Generating $x \sim N(\mu, \Sigma)$ random vectors: - Generate $z \sim N(0, I)$ (independent standard normals) - Compute $x = \mu + Lz$, where $\Sigma = LL^T$. Why does this work? Because $\operatorname{Cov}(x) = L\,\operatorname{Cov}(z)\,L^T = LL^T = \Sigma$.
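A quick numerical check of this sampling recipe (the mean vector and covariance matrix are assumed examples):

```python
import numpy as np

rng = np.random.default_rng(42)

mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

L = np.linalg.cholesky(Sigma)          # Sigma = L @ L.T

# x = mu + L z with z ~ N(0, I); then Cov(x) = L I L^T = Sigma.
Z = rng.standard_normal((100_000, 2))
X = mu + Z @ L.T

# Sample statistics approach the targets as the sample grows.
assert np.allclose(X.mean(axis=0), mu, atol=0.05)
assert np.allclose(np.cov(X.T), Sigma, atol=0.05)
```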

Geometry of Ellipses and Hyperbolas

Two-Dimensional Case

The geometric shape of the quadratic equation $ax^2 + 2bxy + cy^2 + dx + ey + f = 0$ is determined by the discriminant $\det\begin{pmatrix} a & b \\ b & c \end{pmatrix} = ac - b^2$:

| Discriminant | Matrix Property | Geometric Shape |
| --- | --- | --- |
| $ac - b^2 > 0$ | Positive or negative definite | Ellipse |
| $ac - b^2 < 0$ | Indefinite | Hyperbola |
| $ac - b^2 = 0$ | Semidefinite | Parabola/parallel lines |

Intuition: This is like determining whether a bowl is right-side up (ellipse), saddle-shaped (hyperbola), or trough-shaped (parabola).
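The discriminant test is a one-liner in code; a small sketch (the coefficients are illustrative):

```python
def conic_type(a, b, c):
    """Classify a x^2 + 2 b x y + c y^2 (+ lower-order terms) by the
    determinant of its quadratic-part matrix [[a, b], [b, c]]."""
    d = a * c - b * b
    if d > 0:
        return "ellipse"
    if d < 0:
        return "hyperbola"
    return "parabola (degenerate)"

print(conic_type(3, 1, 3))   # det = 8 > 0  -> ellipse
print(conic_type(1, 2, 1))   # det = -3 < 0 -> hyperbola
print(conic_type(1, 1, 1))   # det = 0      -> parabola (degenerate)
```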

Geometric Properties of Ellipses

Ellipse equation $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ (assuming $a > b > 0$):

  • Semi-major axis: $a$ (along the $x$ direction)
  • Semi-minor axis: $b$ (along the $y$ direction)
  • Foci: $(\pm c, 0)$, where $c^2 = a^2 - b^2$
  • Eccentricity: $e = c/a$ ($0 \le e < 1$)

Relationship with matrices:

For $x^T A x = 1$ ($A$ positive definite), the ellipse's semi-axis lengths are $1/\sqrt{\lambda_i}$, with directions along the eigenvectors.

Life Example: Planetary Orbits

Planets orbit the sun in ellipses, with the sun at one focus of the ellipse. Earth's orbit has eccentricity about $0.017$, almost circular; Pluto's orbit has eccentricity about $0.25$, clearly elliptical.

Geometric Properties of Hyperbolas

Hyperbola equation $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$:

  • Real axis: $2a$ (along the $x$ direction)
  • Imaginary axis: $2b$ (along the $y$ direction)
  • Foci: $(\pm c, 0)$, where $c^2 = a^2 + b^2$
  • Asymptotes: $y = \pm \frac{b}{a} x$

Life Example: Supersonic Aircraft

When an aircraft flies supersonically, the shockwave front forms a cone. The intersection of this cone with the ground is one branch of a hyperbola.

Three-Dimensional Case

Three-dimensional quadric surfaces are richer:

| Eigenvalue Signs | Surface Type | Equation (Standard Form) |
| --- | --- | --- |
| $(+, +, +)$ | Ellipsoid | $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$ |
| $(+, +, -)$ | Hyperboloid of one sheet | $\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$ |
| $(+, -, -)$ | Hyperboloid of two sheets | $\frac{x^2}{a^2} - \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$ |
| $(+, +, 0)$ | Elliptic cylinder | $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ |
| $(+, -, 0)$ | Hyperbolic cylinder | $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$ |

Matrix Square Roots

Definition

For a positive definite symmetric matrix $A$, its square root $A^{1/2}$ is the matrix satisfying: $A^{1/2} A^{1/2} = A$.

Existence: The square root of a positive definite symmetric matrix exists and is unique (if we require the square root to also be symmetric positive definite).

Computation Methods

Method 1: Spectral Decomposition. $A^{1/2} = Q \Lambda^{1/2} Q^T$, where $\Lambda^{1/2} = \operatorname{diag}(\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_n})$.

Method 2: Cholesky Decomposition

If $A = LL^T$, then $L$ is a square root of $A$ in this looser sense (but it is not symmetric).

Symmetric square root: $A^{1/2}$ obtained from the spectral decomposition is symmetric.
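Both square-root constructions can be compared directly. A sketch with an assumed $2 \times 2$ example:

```python
import numpy as np

def sqrtm_sym(A):
    """Symmetric square root of a positive definite symmetric matrix:
    A^{1/2} = Q diag(sqrt(lambda_i)) Q^T."""
    eigvals, Q = np.linalg.eigh(A)
    return Q @ np.diag(np.sqrt(eigvals)) @ Q.T

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
S = sqrtm_sym(A)

assert np.allclose(S, S.T)        # symmetric
assert np.allclose(S @ S, A)      # squares back to A

# The Cholesky factor L also satisfies L @ L.T = A, but is not symmetric.
L = np.linalg.cholesky(A)
assert np.allclose(L @ L.T, A)
assert not np.allclose(L, L.T)
```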

Application: Whitening Transform

In machine learning, whitening transforms correlated data into uncorrelated data with unit variance.

Given covariance matrix $\Sigma$, the whitening transform is: $y = \Sigma^{-1/2} x$, making $\operatorname{Cov}(y) = I$ (the identity matrix).

Why whitening? Many machine learning algorithms assume features are independent with equal variance. Whitening preprocessing satisfies this assumption, improving algorithm performance.

Life Analogy: Imagine analyzing student grade data where math and physics scores are highly correlated. Whitening finds a new coordinate system where the new "scores" are mutually independent — making it easier to analyze each "ability's" independent contribution.
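A minimal whitening sketch on synthetic correlated data (the mixing matrix below is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(7)

# Correlated data (e.g. two exam scores that move together).
Z = rng.standard_normal((50_000, 2))
M = np.array([[2.0, 0.0],
              [1.5, 0.5]])
X = Z @ M.T                        # Cov(X) is roughly M @ M.T

Sigma = np.cov(X.T)

# Whitening: y = Sigma^{-1/2} x, via the symmetric inverse square root.
eigvals, Q = np.linalg.eigh(Sigma)
W = Q @ np.diag(eigvals ** -0.5) @ Q.T     # Sigma^{-1/2}
Y = X @ W.T

# The whitened data has (sample) covariance equal to the identity.
assert np.allclose(np.cov(Y.T), np.eye(2), atol=1e-6)
```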

Practical Application Cases

Application 1: Small Vibrations in Physics

In classical mechanics, vibrations of multi-degree-of-freedom systems near equilibrium can be described by quadratic forms.

Kinetic energy: $T = \frac{1}{2} \dot{q}^T M \dot{q}$ ($M$ is the mass matrix, positive definite)

Potential energy: $V = \frac{1}{2} q^T K q$ ($K$ is the stiffness matrix, positive definite)

Vibration frequencies: Determined by the generalized eigenvalue problem $K v = \omega^2 M v$.

Life Example: Guitar String Vibration

Guitar string vibration can be decomposed into multiple "resonant modes," each corresponding to a characteristic frequency. The fundamental frequency determines pitch, harmonics determine timbre. These frequencies and modes are the eigenvalues and eigenvectors of the stiffness matrix!

Application 2: Ellipse Fitting in Image Processing

Given a set of data points, fitting the best ellipse is a common task in computer vision.

Method: Minimize the sum of squared residuals to get a quadratic curve equation $ax^2 + bxy + cy^2 + dx + ey + f = 0$. By analyzing the matrix $\begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix}$ of the quadratic part, we can determine whether the fit is an ellipse, hyperbola, or parabola.

Application 3: Regularization in Machine Learning

In Ridge Regression, the objective function is: $\min_w \|Xw - y\|^2 + \lambda \|w\|^2$. Solution: $w = (X^T X + \lambda I)^{-1} X^T y$. Adding $\lambda I$ ensures $X^T X + \lambda I$ is positive definite, hence invertible and numerically stable.

Intuition: The original $X^T X$ may be nearly singular (large condition number). Adding $\lambda I$ adds $\lambda$ to each eigenvalue, making the matrix "more positive definite" and more numerically stable.
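A quick check of the conditioning claim on a nearly collinear design (all data here is synthetic and illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Nearly collinear design matrix: X^T X is close to singular.
x1 = rng.standard_normal(100)
x2 = x1 + 1e-4 * rng.standard_normal(100)
X = np.column_stack([x1, x2])
y = x1 + 0.1 * rng.standard_normal(100)

lam = 1.0
# Ridge solution: w = (X^T X + lam I)^{-1} X^T y.
w = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)

# The shift adds lam to every eigenvalue, shrinking the condition number.
cond_plain = np.linalg.cond(X.T @ X)
cond_ridge = np.linalg.cond(X.T @ X + lam * np.eye(2))
print(cond_plain, cond_ridge)
assert cond_ridge < cond_plain
```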

Application 4: Portfolio Optimization in Finance

In Markowitz portfolio theory, risk minimization: $\min_w\; w^T \Sigma w$. Constraints: $\mathbf{1}^T w = 1$ (weights sum to 1), $\mu^T w = r$ (expected return is $r$).

Where $\Sigma$ is the covariance matrix of asset returns (positive definite).

Intuition: $w^T \Sigma w$ is the portfolio variance (risk). We want to find the minimum-risk combination under a given return target. This is a constrained quadratic programming problem.

Exercises

Basic Problems

1. Determine whether the following matrices are symmetric and positive definite:

(a) (b) (c)

2. For a given quadratic form, write the corresponding matrix $A$ and determine its positive definiteness.

3. Compute the Cholesky decomposition of.

4. Find eigenvalues and eigenvectors ofand write the spectral decomposition.

5. Determine the type of quadric curve (ellipse/hyperbola/parabola).

Advanced Problems

6. Prove: If $A$ is positive definite, then $A^{-1}$ is also positive definite.

7. Prove: If $A, B$ are both $n \times n$ positive definite matrices, then $A + B$ is also positive definite.

8. For a given quadratic form: - Find the eigenvalues of the matrix $A$ - Determine positive definiteness - Convert to standard form - Sketch the level curve

9. Prove Sylvester's criterion: $A$ is positive definite if and only if all leading principal minors $D_k > 0$. (Hint: Use mathematical induction)

10. Prove: The trace of a symmetric matrix equals the sum of eigenvalues, and the determinant equals the product of eigenvalues.

11. Let $A$ be an $m \times n$ real matrix (not necessarily symmetric). Prove $A^T A$ is a positive semidefinite matrix. When is $A^T A$ positive definite?

Programming Problems

12. Implement Cholesky decomposition algorithm (without using numpy built-in function):

def cholesky(A):
    """
    Cholesky decomposition of positive definite matrix A,
    returns lower triangular matrix L such that A = L @ L.T
    """
    # Your code
    pass

13. Verify the spectral theorem in Python: For a randomly generated symmetric matrix, compute its spectral decomposition and verify $A = Q \Lambda Q^T$ and $Q^T Q = I$.

14. Implement quadratic form visualization: Given a 2D symmetric matrix, plot its level curves and mark eigenvector (principal axis) directions.

15. Implement PCA dimensionality reduction: - Generate 2D correlated data - Compute covariance matrix - Perform spectral decomposition - Visualize principal component directions - Project onto first principal component

16. Implement whitening transform and verify that whitened data has covariance matrix close to identity.

Application Problems

17.

Vibration Analysis: Two masses (each of mass $m$) connected by three springs (stiffnesses $k_1, k_2, k_3$). Stiffness matrix: $K = \begin{pmatrix} k_1 + k_2 & -k_2 \\ -k_2 & k_2 + k_3 \end{pmatrix}$. (a) Prove $K$ is positive definite (assuming all $k_i > 0$) (b) Find vibration frequencies and modes

18.

Portfolio: Two assets with covariance matrix $\Sigma$ and expected returns $\mu$.

  1. Verify $\Sigma$ is positive definite
  2. Find minimum variance portfolio (constraint: weights sum to 1)
  3. Find several points on the efficient frontier

19.

Ellipse Fitting: Given data points, fit an ellipse equation using least squares.

| x | 1 | 2 | 3 | 2 | 1 | 0 | -1 | 0 |
| y | 2 | 2 | 0 | -2 | -2 | -1 | 0 | 1 |

Thinking Questions

20. Why is the regularization term in machine learning usually $\|w\|_2^2$ rather than $\|w\|_1$? Explain from the perspective of quadratic forms and positive definiteness.

21. Why is a covariance matrix always positive semidefinite? When is it positive definite?

22. If a matrix $A$ is not symmetric, can it be "spectrally decomposed"? What conditions are needed? (This leads to the concept of normal matrices)

23. Does Cholesky decomposition exist for positive semidefinite matrices? If it exists, is it unique?

24. Why is it said that "symmetric matrices are the best matrices"? Analyze from three perspectives: computational complexity, numerical stability, and theoretical elegance.

Exercise Solutions

Basic Problems (1-5)

1. Symmetric and positive definiteness determination:

(a): Symmetric ✓. Both eigenvalues positive → Positive definite

(b): Symmetric ✓. One eigenvalue negative → Indefinite

(c): Symmetric ✓. Sylvester: $D_1 > 0$, $D_2 > 0$, $D_3 > 0$ → Positive definite

2. Matrix $A$ obtained by splitting the cross-term coefficient symmetrically. Sylvester: $D_1 > 0$, $D_2 > 0$ → Positive definite.

3. Cholesky: solve $l_{11} = \sqrt{a_{11}}$, $l_{21} = a_{21}/l_{11}$, $l_{22} = \sqrt{a_{22} - l_{21}^2}$ to obtain $L$.

4. Find the eigenvalues from the characteristic polynomial and normalize the eigenvectors $q_1, q_2$. Spectral decomposition: $A = \lambda_1 q_1 q_1^T + \lambda_2 q_2 q_2^T$.

5. Matrix of the quadratic part has both eigenvalues positive → Ellipse.

Advanced Problems (6-11)

6. Proof: For $y \neq 0$, let $x = A^{-1} y$. Then $y^T A^{-1} y = x^T A x > 0$ (since $A$ is positive definite).

7. Proof:.

8. Both eigenvalues positive → Positive definite ✓. Standard form: $\lambda_1 y_1^2 + \lambda_2 y_2^2$ (ellipse level curves).

9. Sylvester's criterion proof: By induction on $n$. Base case: $n = 1$, $a_{11} > 0$ iff positive definite. Inductive step uses block matrix decomposition.

10. Spectral theorem: $A = Q \Lambda Q^T$, so $\operatorname{tr}(A) = \operatorname{tr}(\Lambda Q^T Q) = \operatorname{tr}(\Lambda) = \sum_i \lambda_i$; $\det A = \det Q \cdot \det \Lambda \cdot \det Q^T = \prod_i \lambda_i$.

11. $x^T A^T A\, x = \|Ax\|^2 \ge 0$ → semidefinite. Positive definite iff $A$ has full column rank (kernel is $\{0\}$).

Application Problems (17-19)

17. Vibration: (a) $q^T K q > 0$ for $q \neq 0$ (energy always positive). (b) Eigenvalues of the generalized problem $Kv = \omega^2 M v$ give the squared frequencies.

18. Portfolio: (a) Positive diagonal entries and positive determinant → positive definite ✓. (b) Min variance: solve $\Sigma w = \gamma \mathbf{1}$ with $\mathbf{1}^T w = 1$. (c) Efficient frontier parametrized by target return.

19. Ellipse Fitting: Use least squares to fit $ax^2 + bxy + cy^2 + dx + ey + f = 0$ with the constraint $b^2 - 4ac < 0$ (ellipse condition).

Thinking Questions (20-24)

20. $\|w\|_2^2$ is differentiable everywhere (smooth optimization), while $\|w\|_1$ is non-differentiable at the origin. Quadratic regularization makes the objective positive definite → unique global minimum.

21. $v^T \Sigma v = \operatorname{Var}(v^T X) \ge 0$, so $\Sigma$ is semidefinite. Positive definite iff no linear dependency among the features.

22. General matrices need the Jordan normal form. Diagonalizable iff geometric multiplicities equal algebraic multiplicities. Normal matrices ($AA^* = A^*A$) can be unitarily diagonalized.

23. Semidefinite matrices have a modified Cholesky $A = LL^T$ where $L$ may have zero diagonal entries. Not unique (unlike the positive definite case).

24. (i) Computation: Store $n(n+1)/2$ elements; Cholesky costs about $n^3/3$ flops. (ii) Stability: Real eigenvalues, orthogonal eigenvectors ($\kappa(Q) = 1$), guaranteed convergence of iterative eigenvalue methods. (iii) Theory: Spectral theorem, clear geometry (ellipsoids), Rayleigh quotient extremal properties.

Chapter Summary

Key Takeaways

  1. Special properties of symmetric matrices:
    • Eigenvalues are all real
    • Eigenvectors are mutually orthogonal
    • Can be orthogonally diagonalized
  2. Spectral Theorem: $A = Q \Lambda Q^T$
    • One of the most important decomposition theorems
    • Reveals the essential structure of symmetric matrices
  3. Positive Definite Matrices:
    • Definition: $x^T A x > 0$ for all $x \neq 0$
    • Tests: All eigenvalues positive / All leading minors positive / Cholesky decomposition exists
    • Meaning: Positive energy, stable system, invertibility
  4. Quadratic Forms:
    • Standard form: $\lambda_1 y_1^2 + \cdots + \lambda_n y_n^2$
    • Classification: Positive definite, negative definite, indefinite, semidefinite
    • Geometry: Ellipsoids, saddle surfaces, and other quadric surfaces
  5. Principal Axis Theorem:
    • Finds principal axis directions of quadric surfaces
    • Applications in optimization, physics, data analysis
  6. Cholesky Decomposition: $A = LL^T$
    • Fast decomposition for positive definite matrices
    • Numerically stable, computationally efficient

Next Chapter Preview

"Singular Value Decomposition (SVD)"

  • "Spectral decomposition" for arbitrary matrices (not just symmetric)
  • SVD generalizes the spectral theorem for symmetric matrices
  • Pseudoinverse, best low-rank approximation
  • Applications: Image compression, recommender systems, natural language processing

SVD is called the "crown jewel of linear algebra" — we'll see why!

References

  1. Strang, G. (2019). Introduction to Linear Algebra. 5th ed. Chapter 6.
    • Classic treatment of symmetric matrices and positive definiteness
  2. Horn, R. A., & Johnson, C. R. (2012). Matrix Analysis. 2nd ed. Cambridge University Press.
    • In-depth matrix theory including various positive definiteness criteria
  3. Boyd, S., & Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press.
    • Applications of positive definite matrices in optimization
  4. Golub, G. H., & Van Loan, C. F. (2013). Matrix Computations. 4th ed. Johns Hopkins University Press.
    • Authoritative reference for Cholesky decomposition and other numerical algorithms
  5. 3Blue1Brown. Essence of Linear Algebra series. YouTube.
    • Excellent visualizations of eigenvalues and quadratic forms

Next Chapter: Singular Value Decomposition

Previous Chapter: ← Orthogonality and Projections


This is Chapter 8 of the 18-part "Essence of Linear Algebra" series.

  • Post title:Essence of Linear Algebra (8): Symmetric Matrices and Quadratic Forms
  • Post author:Chen Kai
  • Create time:2019-02-12 10:00:00
  • Post link:https://www.chenk.top/chapter-08-symmetric-matrices-and-quadratic-forms/
  • Copyright Notice:All articles in this blog are licensed under BY-NC-SA unless stating additionally.