Essence of Linear Algebra (1): The Essence of Vectors - More Than Just Arrows
Introduction: Vectors — The Universal Language of Modern Science
When you first encounter vectors, your teacher might tell you: "A
vector is an arrow" or "A vector is an ordered list of numbers." Both
statements are correct, but they only scratch the surface.
The real question is: Why do physicists, engineers,
data scientists, quantum physicists, and even economists all use the
same mathematical concept — vectors? This is no coincidence.
Vectors are so ubiquitous because they capture a profound essence: linearity. Our universe, in many cases, follows the principle of "superposition":
- The superposition of two small displacements equals the total displacement (geometry)
- The superposition of two small signals equals the composite signal (signal processing)
- The superposition of two quantum states is still a quantum state (quantum mechanics)
This "superposition" — or linearity — is the core of
vector spaces.
The Philosophical Stance of This Chapter
We will understand vectors progressively through four
levels:
Phenomenal level: Vectors as concrete physical
quantities and data
Geometric level: Vectors as points and arrows in
space
Algebraic level: Vectors as objects satisfying
operational rules
Abstract level: The axiomatic structure of vector
spaces
Each level builds upon the previous one but transcends its
limitations. Ultimately, you'll see that vectors are not some concrete
"thing," but a mathematical pattern — a unified
framework for describing "objects that can be linearly superposed."
Why Do We Need This Depth?
Surface-level understanding is sufficient for exams, but profound understanding enables you to:
- Recognize hidden vector space structures when encountering new problems
- Integrate knowledge across different fields (quantum states, signals, data are all vectors)
- Understand why certain algorithms work (because they respect vector space structure)
Let's begin this journey.
The Geometric Perspective: The Space Where Vectors Live

The Birth of Vectors: From Points to Arrows
Let's start with a concrete scenario. Imagine you're standing at the
center of a park (the origin), and your friend tells you: "Walk 3 steps
north, then 4 steps east." This instruction is a
vector!
Why is this a vector? Because it contains two key pieces of information:
- Direction: First north, then east (or overall, a northeast direction)
- Displacement: The distance from the starting point to the ending point
Mathematically, we write this vector as:

v = (4, 3)

Here, 4 represents the component in the east (x) direction, and 3 represents the component in the north (y) direction.
The Length (Magnitude) of a Vector

After walking these steps, how far are you from the origin? This is the vector's magnitude or length:

‖v‖ = √(4² + 3²) = √25 = 5

Isn't this just the Pythagorean theorem? Exactly! The magnitude of a vector is the length of the hypotenuse of a right triangle.
The Direction of a Vector

The direction of a vector can be expressed using an angle. If we take the positive x-direction (east) as 0 degrees and counterclockwise as positive:

θ = arctan(3/4) ≈ 36.87°

So your friend actually asked you to walk 5 steps in a direction about 37 degrees north of east.
Translation Invariance of
Vectors
Here's an important concept: vectors don't care where they
start.
Whether you depart from the park center or the northeast corner of
the park, the instruction "4 steps east, 3 steps north" represents the
same vector. This is the translation invariance of
vectors.
Imagine you're on a ship sailing east at some speed. Whether the ship
is in the middle of the Pacific Ocean or in the Mediterranean Sea, the
velocity vector is the same — direction is east, magnitude is a specific
speed value. The ship's position changed, but the velocity vector
didn't.
This property is extremely important in physics. Force, velocity, and
acceleration are all vectors, and they don't depend on specific
positions, only on direction and magnitude.
Vector Addition:
Multiple Ways of Understanding
Vector addition is perhaps the most important vector operation. Let
me explain it from three angles.
Angle 1: Head-to-Tail Method

Suppose you first move according to vector a, then move according to vector b. What's your total displacement?

The answer is: place the starting point of b at the endpoint of a, then draw an arrow from the starting point of a to the endpoint of b. This new arrow is a + b.
Example: You first walk 3 steps east and 4 steps north (vector a = (3, 4)), then walk 1 step east and 2 steps north (vector b = (1, 2)). Then a + b = (3 + 1, 4 + 2) = (4, 6): the total displacement is 4 steps east and 6 steps north.
Angle 2: Parallelogram Rule

If you draw both a and b from the origin, then use them as adjacent sides to draw a parallelogram, the diagonal from the origin is a + b.
This rule is widely used in physics. For example, when two forces act
on an object simultaneously, the resultant force is the diagonal of the
parallelogram formed by these two force vectors.
Angle 3: Component-wise Addition

From an algebraic perspective, vector addition is just adding corresponding components:

(a₁, a₂) + (b₁, b₂) = (a₁ + b₁, a₂ + b₂)

These three ways of understanding are completely equivalent but have different advantages in different scenarios. Geometric intuition helps you build spatial awareness, while algebraic methods are convenient for computation.
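The component-wise rule can be checked directly with NumPy; here is a minimal sketch using the example numbers from this section:

```python
import numpy as np

a = np.array([3, 4])  # 3 steps east, 4 steps north
b = np.array([1, 2])  # 1 step east, 2 steps north

# Component-wise addition matches the head-to-tail picture
total = a + b
print(total)  # [4 6]
```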
Scalar Multiplication: Stretching, Compressing, and Reversing

When we say "2v", what does it mean?

Geometrically, 2v is a vector with the same direction as v but twice the length. More generally, for scalar c and vector v:

- When c > 1: cv is a "stretched" version of v, same direction, longer length
- When 0 < c < 1: cv is a "compressed" version of v, same direction, shorter length
- When c = 0: cv = 0, yielding the zero vector
- When c < 0: Direction reverses, length becomes |c| times

Real-life Example:

Imagine you're driving at velocity v.
- 2v: Double the speed (driving faster)
- 0.5v: Half the speed (driving slower)
- −v: U-turn! Same speed magnitude but opposite direction

Algebraically, scalar multiplication is multiplying each component by the scalar:

c(v₁, v₂) = (cv₁, cv₂)
Vector Subtraction: The Directional Difference

Vector subtraction can be understood as "the displacement from one position to another."

If a and b are two position vectors (from the origin to points A and B), then b − a is the vector pointing from point A to point B.

Key insight: b − a tells you "how to get from A to B."

This is crucial in computer graphics. For example, to calculate the direction from a gun barrel to a target, you need to subtract the gun barrel position from the target position.
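The aiming computation is one subtraction plus a normalization. A minimal sketch (the positions are made-up values for illustration):

```python
import numpy as np

barrel = np.array([2.0, 1.0])   # gun barrel position (hypothetical)
target = np.array([6.0, 4.0])   # target position (hypothetical)

# Direction from barrel to target: target minus barrel
direction = target - barrel                              # [4. 3.]
unit_direction = direction / np.linalg.norm(direction)   # length-1 aiming vector
print(unit_direction)  # [0.8 0.6]
```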
The
Numerical Perspective: Vectors as Data Containers
Beyond Two and Three
Dimensions
So far, we've been discussing 2D or 3D vectors that can be
visualized. But the true power of vectors lies in their ability to
generalize to arbitrary dimensions.
An n-dimensional vector is an ordered list of n numbers:

v = (v₁, v₂, …, vₙ)

Although we can't visually "see" high-dimensional vectors, all the operational rules (addition, scalar multiplication, inner product, etc.) apply exactly the same way.
Vectors Are Everywhere:
Real-World Cases
Case 1: Weather Data

The weather conditions at a certain place and time can be represented as a vector:

w = (25.3, 65.0, 1013, 15.2, 45)

Where the components are:
- Temperature: 25.3 °C
- Humidity: 65.0%
- Pressure: 1013 hPa
- Wind speed: 15.2 km/h
- Cloud cover: 45%
This way, weather becomes a 5-dimensional vector. If we collect data
from multiple days, we get a set of vectors that can be used to analyze
weather patterns and predict future weather.
Case 2: Images Are Huge
Vectors
Apixel grayscale
image (like handwritten digit images) can be "flattened" into a
784-dimensional vector. Each component is the brightness value of a
pixel (0-255).
This is why machine learning can process images — it treats images as
vectors and uses vector operations to analyze and classify them!
```python
# A simple example
import numpy as np

# Suppose we have a 3x3 grayscale image
image = np.array([
    [0, 128, 255],
    [64, 192, 32],
    [100, 50, 200]
])

# "Flatten" it into a 9-dimensional vector
vector = image.flatten()
print(vector)  # [  0 128 255  64 192  32 100  50 200]
```
This example reveals a profound insight: similarity can be
measured using inner products!
Case 4: Word Vectors in Natural Language Processing

Modern NLP represents each word as a vector (usually 100-300 dimensions). Amazingly, these vectors can capture semantic relationships:

vec(king) − vec(man) + vec(woman) ≈ vec(queen)

This means the relationship between "king" and "queen" is similar to the relationship between "man" and "woman"!
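The analogy arithmetic can be sketched with toy vectors. The 3-dimensional "embeddings" below are invented purely for illustration; real word vectors (word2vec, GloVe, BERT) are learned from text and live in hundreds of dimensions:

```python
import numpy as np

# Toy embeddings invented for this example only
vectors = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "man":   np.array([0.5, 0.1, 0.1]),
    "woman": np.array([0.5, 0.1, 0.9]),
    "queen": np.array([0.9, 0.8, 0.9]),
}

# king - man + woman should land near queen
result = vectors["king"] - vectors["man"] + vectors["woman"]

def cosine(u, v):
    # Cosine similarity: inner product divided by the product of lengths
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine(result, vectors["queen"]))  # 1.0 (exact match in this toy setup)
```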
2.3 The Inner Product: Unifying Geometry, Algebra, and Philosophy
The inner product is one of the most profound concepts in linear
algebra. It is not just a computational tool, but a bridge connecting
geometry and algebra.
Three-Layer Definition of Inner Product

Layer 1: Computational Definition (Algebra)

u · v = u₁v₁ + u₂v₂ + … + uₙvₙ

This is the computational method, but it doesn't tell you "why."

Layer 2: Geometric Definition

u · v = ‖u‖ ‖v‖ cos θ

This remarkable formula reveals the geometric essence of the inner product: it measures the "alignment" of two vectors.

Layer 3: Axiomatic Definition (Most Abstract)

An inner product is a bilinear mapping ⟨·, ·⟩ : V × V → ℝ satisfying the following axioms:

1. Positive definiteness: ⟨v, v⟩ ≥ 0, and ⟨v, v⟩ = 0 if and only if v = 0
2. Symmetry: ⟨u, v⟩ = ⟨v, u⟩
3. Linearity (first argument): ⟨au + bw, v⟩ = a⟨u, v⟩ + b⟨w, v⟩

Any "multiplication" satisfying these three axioms is an inner product!
Cauchy-Schwarz Inequality: A Profound Constraint on Inner Products

Theorem (Cauchy-Schwarz): For any vectors u, v:

|⟨u, v⟩| ≤ ‖u‖ ‖v‖

Equality holds if and only if u and v are collinear.

Proof (Elegant algebraic technique):

Consider the function f(t) = ‖u + tv‖² ≥ 0 for all t ∈ ℝ.

Expanding:

f(t) = ‖v‖² t² + 2⟨u, v⟩ t + ‖u‖²

This is a quadratic function in t. Since f(t) ≥ 0 always holds, the discriminant must be nonpositive:

4⟨u, v⟩² − 4‖u‖² ‖v‖² ≤ 0

Simplifying yields the Cauchy-Schwarz inequality.

Deep meaning:

This inequality states that the inner product of two vectors can never exceed the product of their lengths. Geometrically, |cos θ| ≤ 1 is obvious, but the algebraic proof reveals deeper structure: this is a necessary consequence of the inner product axioms!
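A quick numerical sanity check of the inequality, and of the equality case for collinear vectors (random test vectors are an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)

# |<u, v>| <= ||u|| * ||v|| for random vectors
for _ in range(1000):
    u = rng.normal(size=5)
    v = rng.normal(size=5)
    assert abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12

# Equality exactly when u and v are collinear
u = np.array([1.0, 2.0, 3.0])
v = 2.5 * u
print(abs(u @ v), np.linalg.norm(u) * np.linalg.norm(v))  # 35.0 35.0...
```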
Triangle Inequality: The "Detour Theorem" for Vectors

Theorem: ‖u + v‖ ≤ ‖u‖ + ‖v‖

Geometric meaning: The sum of two sides of a triangle is greater than the third side (taking a detour is longer than going straight).

Proof (using Cauchy-Schwarz):

‖u + v‖² = ‖u‖² + 2⟨u, v⟩ + ‖v‖² ≤ ‖u‖² + 2‖u‖‖v‖ + ‖v‖² = (‖u‖ + ‖v‖)²

Taking square roots yields the result.
Orthogonality: Geometrization of Independence

Two vectors u, v are orthogonal (written u ⊥ v) if and only if ⟨u, v⟩ = 0.
Why is orthogonality so important?
Geometry: Orthogonal vectors are "completely
independent," not interfering with each other
Probability: Independent random variables
correspond to orthogonal vectors (covariance=0)
Physics: Components in orthogonal directions can be
treated independently
Deep insight: Orthogonality is the mathematization
of "irrelevance." When two vectors are orthogonal, any change in one
vector doesn't affect the projection onto the other vector's
direction.
Projection: The Geometric Form of Best Approximation

The projection of vector u onto v:

proj_v(u) = (⟨u, v⟩ / ⟨v, v⟩) v

Profound theorem: proj_v(u) is the multiple of v closest to u (minimizing ‖u − cv‖).

Proof: Let p = proj_v(u) and r = u − p, so that r ⊥ v. Then for any scalar c:

u − cv = r + (p − cv)

By the Pythagorean theorem (for orthogonal vectors):

‖u − cv‖² = ‖r‖² + ‖p − cv‖² ≥ ‖r‖²

Therefore the choice cv = p minimizes the distance.

Philosophical meaning: Projection is not just a "shadow," it's the prototype of best linear approximation. Least squares, PCA, and signal filtering are all extensions of this idea!
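The projection formula and the orthogonality of the residual can be verified in a few lines (the vectors are illustrative):

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])

# Projection of u onto v: (<u, v> / <v, v>) v
p = (u @ v) / (v @ v) * v   # [3. 0.]
r = u - p                   # residual, [0. 4.]

# The residual is orthogonal to v, so p is the best approximation of u along v
print(p, r @ v)  # [3. 0.] 0.0
```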
Metric Space Induced by Inner Product

With the inner product, we can define:
- Norm: ‖v‖ = √⟨v, v⟩
- Distance: d(u, v) = ‖u − v‖
- Angle: cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖)

These three concepts naturally emerge from the inner product, forming a complete geometric structure. This is why inner product spaces (Hilbert spaces) are the foundation of modern analysis!
Application of Inner Product: Projection

If you want to know how much of vector a lies in the direction of vector b, you can compute the projection:

proj_b(a) = (a · b / ‖b‖²) b

Projection has applications in many fields:
- Physics: The component of force in a certain direction
- Computer graphics: Shadow calculations
- Machine learning: The core of least squares
2.4 Vector Norms: The Philosophy of Measuring Size

Besides the familiar "length" (2-norm), vectors have other ways of measuring "size." But why are there multiple notions of "size"? Behind this lie profound mathematical and philosophical reasons.

Axiomatic Definition of Norms

A function ‖·‖ : V → ℝ is a norm if and only if it satisfies three axioms:

1. Positive definiteness: ‖v‖ ≥ 0, and ‖v‖ = 0 if and only if v = 0
2. Homogeneity: ‖cv‖ = |c| ‖v‖ (stretching the vector also stretches its length)
3. Triangle inequality: ‖u + v‖ ≤ ‖u‖ + ‖v‖ (going straight is no farther than taking a detour)

Any "measurement method" satisfying these three properties is a legitimate norm!

p-Norm (General Form):

‖v‖_p = (|v₁|^p + |v₂|^p + … + |vₙ|^p)^(1/p)

When p = 1, 2, ∞, we get three special cases: the 1-norm (sum of absolute values), the 2-norm (Euclidean length), and the ∞-norm (largest absolute component).
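The three special cases are all available through NumPy's `np.linalg.norm` via its `ord` parameter:

```python
import numpy as np

v = np.array([3.0, -4.0])

one_norm = np.linalg.norm(v, 1)       # 7.0  (sum of absolute values)
two_norm = np.linalg.norm(v, 2)       # 5.0  (Euclidean length)
inf_norm = np.linalg.norm(v, np.inf)  # 4.0  (largest absolute component)
print(one_norm, two_norm, inf_norm)
```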
Norm Equivalence Theorem

Profound theorem: In finite-dimensional space, any two norms ‖·‖_a and ‖·‖_b are equivalent, i.e., there exist constants c, C > 0 such that:

c ‖v‖_a ≤ ‖v‖_b ≤ C ‖v‖_a for all v
Meaning: Although the "size" given by different
norms has different numerical values, they are qualitatively consistent.
A vector that is "large" under one norm is also "large" under other
norms.
Why is this important? Because it guarantees that
topological properties like convergence and continuity don't depend on
the choice of norm!
Unit Ball: The Geometric Fingerprint of Norms

Different norms have differently shaped "unit balls" (all vectors of length 1): the 2-norm ball is a circle, the 1-norm ball is a diamond, and the ∞-norm ball is a square.

These shapes reflect the essential characteristics of the norms!

Why Do We Need Different Norms?

Mathematical reason: Different norms induce different geometric structures.

Practical reason: "Optimal solutions" to different problems have different properties under different norms:
- ℓ₂: Smooth, differentiable, but doesn't encourage sparsity
- ℓ₁: Non-smooth, but produces sparse solutions (many components are 0)
- ℓ∞: Most sensitive to outliers

Deep philosophy: The choice of norm reflects our value judgment of "what is important."
Three: The Abstract Perspective: Axiomatic Vector Spaces — The Power of Mathematics
So far, we've been discussing "a column of numbers" as the concrete
type of vector. But true mathematical depth lies in
abstraction.
3.1 Why
Do We Need Axiomatization? Philosophical Foundation
In the late 19th century, mathematicians faced a dilemma: similar "linear structures" appeared in geometry, algebra, and analysis, but they looked completely different:
- Vectors in geometry are arrows
- Vectors in algebra are arrays
- Functions in analysis also have similar properties
Key insight (Hilbert, Banach, early 20th
century):
These seemingly different objects actually follow the same
structural rules. If we distill these rules as "axioms," then
all objects satisfying the axioms can be unified!
This is the power of the axiomatic method: from
concrete to abstract, from specific to general.
3.2 Rigorous Definition of Vector Spaces

A Vector Space V over a field F is a set equipped with two operations:

- Vector addition: + : V × V → V
- Scalar multiplication: · : F × V → V

These must satisfy the following ten axioms. Let u, v, w ∈ V and a, b ∈ F.

Addition structure (Abelian group):
1. Closure: u + v ∈ V
2. Commutativity: u + v = v + u
3. Associativity: (u + v) + w = u + (v + w)
4. Zero element exists: there is 0 ∈ V such that v + 0 = v
5. Inverse element exists: for each v there is −v ∈ V such that v + (−v) = 0

Compatibility of scalar multiplication with addition:
6. Scalar multiplication closure: av ∈ V
7. Scalar multiplication distributivity: a(u + v) = au + av
8. Field distributivity: (a + b)v = av + bv
9. Associativity: a(bv) = (ab)v
10. Identity: 1v = v

Key observation: These axioms are not chosen arbitrarily! They are the minimal common structure distilled from numerous concrete examples.
Key observation: These axioms are not chosen
arbitrarily! They are the minimal common structure
distilled from numerous concrete examples.
3.3 Surprising Corollary: Uniqueness of the Zero Vector

Theorem: The zero vector in a vector space is unique.

Proof: Suppose there are two zero vectors 0₁ and 0₂.

By 0₂ being a zero vector: 0₁ + 0₂ = 0₁
By 0₁ being a zero vector: 0₂ + 0₁ = 0₂
By commutativity: 0₁ + 0₂ = 0₂ + 0₁

Therefore 0₁ = 0₂.

Philosophical meaning: This simple theorem demonstrates the power of axioms — from ten rules, we can derive new facts!
3.4 Unexpected Vector Spaces — Mathematics' Unity

Example 1: Continuous Function Space C[a, b]

All continuous functions defined on [a, b] form a vector space!

- Addition: (f + g)(x) = f(x) + g(x)
- Scalar multiplication: (cf)(x) = c · f(x)
- Zero vector: the function identically zero

Inner product definition:

⟨f, g⟩ = ∫ₐᵇ f(x) g(x) dx

This is an infinite-dimensional vector space! A function f can be seen as a "continuously indexed" vector, where each x corresponds to a "component" f(x).

Deep connections:
- Fourier series: Decomposing functions into "linear combinations" of trigonometric functions
- Orthogonal polynomials: Legendre, Chebyshev polynomials form orthogonal bases
- Quantum mechanics: Wave functions live in infinite-dimensional Hilbert spaces
Example 2: In-Depth Look at Polynomial Space

The space of polynomials with degree ≤ n, denoted Pₙ, is an (n + 1)-dimensional vector space.

Standard basis: {1, x, x², …, xⁿ}. Polynomial p(x) = a₀ + a₁x + … + aₙxⁿ corresponds to coordinates (a₀, a₁, …, aₙ).

Another basis (Lagrange basis): ℓ₀, ℓ₁, …, ℓₙ, where ℓᵢ(xⱼ) = δᵢⱼ (Kronecker delta).

Why multiple bases? Different bases suit different problems!
- Monomial basis: Convenient for differentiation
- Lagrange basis: Convenient for interpolation
- Chebyshev basis: Optimal for approximation
Example 3: Matrix Space Structure

The space of m × n matrices is mn-dimensional, but it has additional structure:

Frobenius inner product: ⟨A, B⟩ = tr(AᵀB)

Special subspaces:
- Symmetric matrices: Aᵀ = A, dimension n(n + 1)/2
- Skew-symmetric matrices: Aᵀ = −A, dimension n(n − 1)/2
- Orthogonal matrices: AᵀA = I (not a linear space! Why?)

Deep observation: The space of n × n matrices is the orthogonal direct sum of the symmetric and skew-symmetric subspaces; any matrix can be uniquely decomposed into symmetric + skew-symmetric parts: A = (A + Aᵀ)/2 + (A − Aᵀ)/2.
Example 4: Deep Structure of Solution Spaces

The solution set of the homogeneous linear system Ax = 0 is a vector space (the null space).

Key theorem (Rank-Nullity theorem):

rank(A) + dim N(A) = n

This profound theorem states:
- rank(A): Matrix rank (column space dimension)
- dim N(A): Null space dimension
- The sum of the two equals the number of columns n

The solution set of the non-homogeneous equation Ax = b is not a vector space (it doesn't contain the zero vector), but it is an affine space:

x = x_p + N(A)

One particular solution + null space = all solutions.
Example 5: Quantum State Space (Abstraction in Physics)

In quantum mechanics, a particle's state is represented by a unit vector in a complex vector space (Hilbert space).

Superposition principle: If |ψ₁⟩ and |ψ₂⟩ are possible states, then α|ψ₁⟩ + β|ψ₂⟩ is also a possible state (with |α|² + |β|² = 1).

This is the "weird" aspect of quantum mechanics — state superposition! Schrödinger's cat is simultaneously in a superposition of "alive" and "dead" states.

Inner product (Dirac notation): ⟨φ|ψ⟩ measures the "similarity" or "transition probability" between two states.
3.5 Why Is Abstraction So
Powerful?
The power of abstraction lies in:
Unity: Prove once, apply infinitely
Transferability: Insights from one domain can
transfer to another
Predictive power: Axioms can predict undiscovered
properties
Concrete example: The Cauchy-Schwarz inequality holds in every inner product space:
- Numerical vectors: |u · v| ≤ ‖u‖ ‖v‖
- Function spaces: |∫ fg| ≤ (∫ f²)^{1/2} (∫ g²)^{1/2}
- Random variables: |Cov(X, Y)| ≤ σ_X σ_Y (covariance inequality)

Same theorem, three different domains! This is the power of abstraction.
3.6 From Vector Spaces to Inner Product Spaces to Hilbert Spaces

Mathematical abstraction is layered:

Vector Space → Only has addition and scalar multiplication
  ↓ Add inner product
Inner Product Space → Has geometric concepts (length, angle)
  ↓ Add completeness
Hilbert Space → Limit processes converge (OK even in infinite dimensions)

Each layer is richer than the previous. Quantum mechanics needs Hilbert spaces because wave functions are infinite-dimensional!
Four: Deep Applications: The Central Role of Vector Thinking in Modern Science
Vectors are not just mathematical tools, but a way of thinking in
modern science. Let's see how vectors play a role in different
domains.
4.1 Quantum Mechanics: State Vectors and Superposition

In quantum mechanics, every physical state is a vector in Hilbert space.

Spin-1/2 particle state space:

|ψ⟩ = α|↑⟩ + β|↓⟩

Here |↑⟩ and |↓⟩ are orthogonal basis vectors ("spin up" and "spin down").

Profound aspects:
- Before measurement, the particle is in a superposition state (simultaneously "up" and "down")
- After measurement, the state "collapses" to a basis vector
- |α|² is the probability of measuring |↑⟩

Mathematical structure:
- State space: Complex vector space
- Observables: Hermitian operators (matrices)
- Evolution: Unitary transformation (preserves inner product)
4.2 Signal Processing: Time Series as Vectors

A digital signal of length N, x = (x₀, x₁, …, x_{N−1}), is a vector in ℝᴺ.

Vector space interpretation of the Fourier transform:

A signal can be decomposed into orthogonal frequency components. This is the representation of the vector x in the "Fourier basis"!

Physical meaning of the inner product:

⟨x, y⟩ = Σₙ xₙ yₙ

measures the "similarity" or "correlation" of two signals.

Applications:
- Audio compression (MP3): Remove small Fourier coefficients
- Image denoising: Keep main frequency components
- Communication systems: Signal detection and matched filtering
4.3 Machine Learning: Feature Vectors and Classification

In supervised learning, each sample is a vector.

Example: Handwritten Digit Recognition

A 28 × 28 image becomes a 784-dimensional vector x.

Linear classifier: f(x) = sign(w · x + b). The decision boundary is the hyperplane w · x + b = 0.

Geometric interpretation:
- w: Normal vector of the hyperplane
- w · x: Projection of x in the direction of w
- Classification: See which side of the hyperplane x is on

Support Vector Machine (SVM):

Find the hyperplane that maximizes the "margin," transformed into an optimization problem:

minimize ½‖w‖² subject to yᵢ(w · xᵢ + b) ≥ 1

Pure vector geometry!
4.4 Optimization Theory: Gradient Vectors

In optimization problems, the gradient is the core concept.

Definition: The gradient of function f(x₁, …, xₙ) is the vector:

∇f = (∂f/∂x₁, …, ∂f/∂xₙ)

Profound property: ∇f points in the direction of steepest increase of f!

Proof (directional derivative): For a unit vector u,

D_u f = ∇f · u = ‖∇f‖ cos θ

When u is in the same direction as ∇f (θ = 0), the derivative is maximized.

Gradient descent method:

x_{k+1} = x_k − η ∇f(x_k)

Walking in the opposite direction of the gradient, the function value decreases fastest!

Backpropagation in deep learning: Essentially the chain rule for high-dimensional vectors, computing the gradient vector of the loss function with respect to millions of parameters.
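The gradient descent update can be sketched on a simple quadratic bowl; the test function and learning rate η are illustrative choices, not anything prescribed by the text:

```python
import numpy as np

def grad_f(x):
    # Gradient of f(x) = x1^2 + 3*x2^2 (minimum at the origin)
    return np.array([2 * x[0], 6 * x[1]])

x = np.array([4.0, -2.0])  # starting point
eta = 0.1                  # learning rate (illustrative)

for _ in range(200):
    x = x - eta * grad_f(x)  # step against the gradient

print(x)  # converges toward the minimizer [0, 0]
```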
4.5 Economics: Leontief Input-Output Model

Consider an economy with n sectors. Let:
- x: Total output of each sector
- d: Final demand
- A: Input-output matrix (aᵢⱼ = input from sector i needed for sector j to produce 1 unit)

Balance equation: x = Ax + d

Solving: x = (I − A)⁻¹ d

Economic interpretation: To satisfy final demand d, each sector needs to produce x = (I − A)⁻¹ d (accounting for inter-sector dependencies).

This is a vector equation solving a real economic problem!
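A two-sector toy version of the model (the matrix and demand values are made up for illustration); note that solving the linear system directly is preferable to forming the inverse:

```python
import numpy as np

A = np.array([[0.1, 0.3],
              [0.2, 0.2]])   # a_ij: input from sector i per unit of sector j
d = np.array([50.0, 30.0])   # final demand

# Solve (I - A) x = d instead of computing (I - A)^(-1) explicitly
x = np.linalg.solve(np.eye(2) - A, d)

# The balance equation x = A x + d holds for the solution
print(x)
```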
4.6 PageRank: Vector Algorithm for Web Page Ranking

Google's PageRank algorithm transforms web page ranking into an eigenvector problem.

With n web pages, construct a transition matrix M.

The PageRank vector r satisfies:

M r = r

This is the eigenvector for eigenvalue 1!

Deep meaning: PageRank is the stationary distribution — the probability that a random walker stays on each page in the long term.
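The stationary vector can be found by power iteration, sketched here on a tiny made-up 3-page web (real PageRank also adds a damping factor, which is omitted):

```python
import numpy as np

# Column-stochastic transition matrix for a 3-page web (link structure invented)
M = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])

# Power iteration: repeatedly apply M until the vector stops changing
r = np.array([1.0, 0.0, 0.0])  # start with all probability on page 0
for _ in range(100):
    r = M @ r

# r is (approximately) an eigenvector with eigenvalue 1
print(r)  # approaches the uniform distribution [1/3, 1/3, 1/3]
```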
4.7 Biology: Gene Expression Profiles

In genomics, a cell's "state" can be described by a gene expression vector:

g = (g₁, g₂, …, gₙ)

Each component gᵢ is the expression level of gene i (mRNA count).

Applications:
- Clustering: Cells with similar expression profiles cluster together (cancer cells vs. normal cells)
- Dimensionality reduction: Use PCA to reduce 20000 dimensions to 2-3 for visualization
- Differential analysis: Compare expression vector differences between two sample groups

Biological meaning of the inner product: ⟨g⁽¹⁾, g⁽²⁾⟩ measures the similarity in gene expression patterns between two cells.
Five: Historical and Philosophical: Evolution of the Vector Concept
5.1 From Geometric Intuition to Algebraic Form (1800-1900)

The concept of vectors went through a long and tortuous evolution.

Early: Geometric Phase
- Euler, Gauss (18th century): Used "directed line segments" to represent force and velocity, but without systematic algebra
- Geometric representation of complex numbers (Wessel, Argand, 1797-1806): The complex plane hinted at the possibility of two-dimensional vectors
Revolutionary Breakthrough: Three Pioneers
Hamilton (1843): Invention of quaternions
- Attempted to generalize complex numbers to three dimensions
- Discovered he had to abandon commutativity: ij = k, but ji = −k
- Vectors are the "purely imaginary part" of quaternions
- Defined dot product and cross product (though with different names)
Grassmann (1844): Exterior Algebra
- Die Ausdehnungslehre (The Theory of Extension)
- Closest to modern vector space thinking
- Defined n-dimensional vectors, linear independence, basis, dimension
- Too far ahead of his time; almost no one understood it at first

Gibbs (1881-1884): Modern vector notation
- Distilled three-dimensional vectors from Hamilton's quaternions
- Introduced boldface vector notation, a · b (dot product) and a × b (cross product)
- Wrote the Vector Analysis textbook, spreading the ideas to physicists and engineers
Why did Gibbs' notation become popular?
- Simple and practical, aligned with physical intuition
- Maxwell's equations are more concise written with vectors
- Widely adopted by the engineering community
5.2 Axiomatization and
Abstraction (1900-1930)
Early 20th Century: Rise of Structuralist
Mathematics
Hilbert (1900s): Axiomatic method, from geometric
axioms to vector space axioms
von Neumann (1930s): Axiomatization of Hilbert
spaces, laying foundation for quantum mechanics
Key transition: From "what are vectors" to "what
rules do vectors satisfy"
This is a paradigm shift in the history of mathematics —
structuralism replaced essentialism.
We no longer ask "what is the essence of vectors," but ask "what kind of
objects can be treated as vectors."
5.3 Philosophical Reflection: Why Is Linearity So Universal?
Question: Why do physics, engineering, and data
science all use vectors?
Levels of answers:
Level 1: Pragmatism
- Linear models are simple, easy to compute
- Nonlinear problems can be locally linearized (Taylor expansion)

Level 2: Mathematical Structure
- Linear structure is the "simplest non-trivial structure"
- Only addition and scalar multiplication are needed to build a rich theory

Level 3: Nature's Secret
- Superposition principle: Many physical laws are linear
  - Superposition of waves (light, sound, water waves)
  - Superposition of quantum states
  - Superposition theorem in circuits
- Why does nature prefer linearity?
  - The energy minimization principle often leads to linear equations (variational methods)
  - Symmetry + conservation laws lead to linear structure (Noether's theorem)

Level 4: Philosophical Conjecture
- Perhaps "linearity" is our way of knowing the world, not the essence of the world?
- Kant: Space itself is an a priori form of human intuition
- Modern view: Mathematical structures are products of the interaction between human mind and nature
5.4
Multiple Personalities of Vectors: Evolution of Notation
Notation in different disciplines:
Discipline | Notation | Reason
Physics | v⃗ or bold v | Emphasizes geometric properties
Engineering | v̄ | Convenient for handwriting
Computer Science | v or vec | Code doesn't support special symbols
Quantum Physics | |v⟩ | Dirac's bra-ket notation
Mathematics | v or 𝐯 | Concise abstraction

Each notation reflects a different way of thinking!
5.5 The Future:
The Vector Concept Is Still Evolving
Current frontiers:
Infinite-dimensional vector spaces: Function
spaces, probability spaces
Category theory perspective: Vector spaces as
objects in certain categories
New applications:
- Deep learning: Vector embeddings (word2vec, BERT)
- Quantum computing: Quantum states are vectors, quantum gates are matrices
- Data science: Geometric structure of high-dimensional data
The story of vectors continues...
Practical Applications
Simplified GPS Positioning
Principle
How does GPS determine your position? The core is
trilateration.
Suppose there are three satellites on a 2D plane at positions (x₁, y₁), (x₂, y₂), (x₃, y₃). Your phone's received signal tells you that your distances to the three satellites are d₁, d₂, d₃ respectively.

Your position (x, y) satisfies:

(x − x₁)² + (y − y₁)² = d₁²
(x − x₂)² + (y − y₂)² = d₂²
(x − x₃)² + (y − y₃)² = d₃²

These three equations correspond to three circles. The intersection of the three circles is your position!

Real GPS uses 4 satellites (because it's 3D space) and also considers clock errors, but the basic principle is the same.
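Subtracting the first circle equation from the other two cancels the quadratic terms and leaves a linear system, which can be solved directly. A sketch with made-up satellite positions:

```python
import numpy as np

# Satellite positions and true receiver position (invented values)
sats = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
true_pos = np.array([3.0, 4.0])
d = np.linalg.norm(sats - true_pos, axis=1)  # "measured" distances

# Subtracting circle equation 1 from equations 2 and 3 gives a linear system
A = 2 * (sats[1:] - sats[0])
b = (d[0]**2 - d[1:]**2
     + np.sum(sats[1:]**2, axis=1) - np.sum(sats[0]**2))

pos = np.linalg.solve(A, b)
print(pos)  # [3. 4.]
```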
Game Physics Engines
In games, object motion is described by vectors:
```python
import numpy as np

# Object state
position = np.array([100.0, 200.0])      # Position vector
velocity = np.array([5.0, -2.0])         # Velocity vector
acceleration = np.array([0.0, -9.8])     # Acceleration (gravity)

# Time step
dt = 0.016  # About 60 fps

# Update physics state
velocity = velocity + acceleration * dt  # v' = v + a*dt
position = position + velocity * dt      # p' = p + v*dt
```
This is the discretized version of Newton's mechanics! All physics
simulations are built on vector operations.
Color Spaces
Colors in computers are usually represented as 3-dimensional vectors
(RGB):
```python
import numpy as np

red = np.array([255, 0, 0])
blue = np.array([0, 0, 255])

# One illustrative vector operation: mixing two colors is addition,
# clipped to the valid 0-255 range
purple = np.clip(red + blue, 0, 255)  # [255, 0, 255]
```
Misconception 1:
Vectors Must Start from the Origin
Wrong: Vectors must start from point (0,0).
Correct: Vectors only have direction and magnitude,
no fixed position. The same vector drawn from any point is "the same"
(translation invariance).
Misconception
2: Vectors Are Just Lists of Numbers
Partially correct: In a coordinate system, vectors
can be represented as lists of numbers. But the essence of "vector" is
an abstract concept, and numbers are just one way of representing them.
Functions, polynomials, and even more abstract mathematical objects can
all be vectors.
Misconception 3: Inner Product and Cross Product Are Similar Operations

Wrong: They are completely different!

Inner product (dot product): The result is a scalar, u · v = ‖u‖ ‖v‖ cos θ

Cross product: The result is a vector, u × v, perpendicular to both u and v, with ‖u × v‖ = ‖u‖ ‖v‖ sin θ

Moreover, the cross product is only defined in 3-dimensional space (or there's a generalization in 7-dimensional space).
Misconception 4: The Zero Vector Has No Direction

Correct but needs attention: The direction of the zero vector 0 is indeed "undefined." It's the only vector without a direction. Sometimes this leads to edge cases in formulas that need special handling.
Six: Summary and Deep
Insights
Core Insights of This
Chapter
1. Three Levels of Understanding

Phenomenal level: Vectors are data, physical quantities, geometric objects
Structural level: Vectors follow linear rules (additivity, homogeneity)
Abstract level: Vector spaces are algebraic structures satisfying axioms
2. The Central Position of Inner Product

The inner product is not just an operation; it endows vector spaces with geometric structure:
- Length (norm): ‖v‖ = √⟨v, v⟩
- Angle (orthogonality): ⟨u, v⟩ = 0
- Distance (metric): d(u, v) = ‖u − v‖

Without inner product: A vector space is only an algebraic structure.
With inner product: A vector space becomes a geometric space (inner product space, Hilbert space).
3. Philosophy of Linearity

Linearity means:
- Additivity: f(u + v) = f(u) + f(v)
- Homogeneity: f(cv) = c f(v)

These two conditions seem simple, but embody profound symmetry:
- The whole equals the sum of parts (no "emergence")
- Scaling input is equivalent to scaling output (scale invariance)

Nonlinear: Has interaction, emergence, chaos
Linear: Predictable, superposable, decomposable
Recognize patterns: See vector space structure in new problems
- Can these objects be "added"?
- Is there a "zero element"?
- Is there an "inner product"?

Transfer knowledge: Apply techniques from one domain to another
- Fourier analysis in signal processing → image compression
- Hilbert spaces in quantum mechanics → kernel methods in machine learning
- Gradients in optimization → backpropagation in deep learning

Appreciate beauty: The unifying beauty and conciseness of mathematics
- One axiomatic system, infinite applications
- Perfect fusion of geometric intuition and algebraic precision
Path to Subsequent Chapters
Having understood vectors (single objects), we will explore:
Chapter 2: Linear Combinations and Vector Spaces
- How to "construct" entire spaces with vectors?
- What is the essence of dimension?
- Why is linear independence important?

Chapter 3: Matrices as Linear Transformations
- Matrices are not "number tables," but "space transformations"
- How to view rotation, stretching, projection with matrices?

Chapter 6: Eigenvalues and Eigenvectors
- Why do certain vectors maintain direction under transformation?
- What does this have to do with the "essence" of matrices?

Chapter 9: Singular Value Decomposition (SVD)
- Any matrix can be decomposed into rotation + stretching + rotation
- This is the core of PCA, recommendation systems, image compression
Each step builds on the foundation of vectors. Deeper roots,
taller edifice.
Final Thoughts
Vectors are not just mathematical tools, but a way of seeing
the world.
When you see:
- An image, think: this is a million-dimensional vector
- A piece of music, think: this is a time series vector
- A recommendation result, think: this is the inner product of user and item vectors
- A physical phenomenon, think: this is the evolution of a vector field
You truly understand the essence of vectors.
Let's continue this journey.
Exercises
Basic Computation Problems
Vector operations: Given two vectors u and v, compute:
(a) u + v
(b) u − v
(c) u · v
(d) ‖u‖, ‖v‖, ‖u + v‖
(e) The angle between u and v (using arccosine)
Projection computation:
- Compute the projection p = proj_v(u) of vector u onto v
- Verify that the residual vector r = u − p is orthogonal to v
- Verify using the Pythagorean theorem: ‖u‖² = ‖p‖² + ‖r‖²
Orthogonality determination: Determine whether given vector pairs are orthogonal by checking whether their inner product equals zero.
Theoretical Proof Problems
Application of the Cauchy–Schwarz inequality:
- State the inequality: |⟨u, v⟩| ≤ ‖u‖ ‖v‖ for any vectors u, v
- Prove it (hint: expand ‖u − t v‖² ≥ 0 and choose t appropriately)
- Apply it to functions: prove (∫ f g)² ≤ (∫ f²)(∫ g²)
Triangle inequality:
- Prove the reverse triangle inequality: | ‖u‖ − ‖v‖ | ≤ ‖u − v‖
- Give a geometric interpretation of this inequality
- When does equality hold?
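As a hint for the reverse triangle inequality, the standard one-line argument applies the ordinary triangle inequality to u = (u − v) + v:

```latex
\|u\| = \|(u - v) + v\| \le \|u - v\| + \|v\|
\quad\Longrightarrow\quad
\|u\| - \|v\| \le \|u - v\|.
```

Swapping the roles of u and v gives ‖v‖ − ‖u‖ ≤ ‖v − u‖ = ‖u − v‖, and combining the two bounds yields | ‖u‖ − ‖v‖ | ≤ ‖u − v‖.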
Properties of norms:
- Prove that the ℓ¹ norm ‖x‖₁ = Σ|xᵢ| satisfies the triangle inequality
- Prove that for any vector x: ‖x‖∞ ≤ ‖x‖₂ ≤ ‖x‖₁
- Find vectors for which each of these inequalities becomes an equality
Vector space verification:
- Prove that the set Pₙ of all polynomials of degree at most n forms a vector space
- What is the dimension of Pₙ?
- Give a basis of Pₙ
- Is a given set of polynomials (for example 1, 1 + x, 1 + x + x²) linearly independent in Pₙ?
Advanced Application Problems
Cosine similarity in machine learning: Three users rate the same 5 movies (a rating of 0 means "not watched"); call the rating vectors a (Alice), b (Bob), and c (Carol).
- Compute the cosine similarity between Alice and Bob, and between Alice and Carol
- Who is more similar to Alice?
- Recompute the similarities ignoring movies rated 0 (i.e., using only commonly rated movies)
- Design an algorithm to predict Alice's rating for movies she has not watched
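A minimal sketch of the similarity computation. The rating vectors below are made-up placeholders, not the exercise's actual numbers:

```python
import numpy as np

def cosine_similarity(u, v):
    """cos θ = (u · v) / (‖u‖ ‖v‖); undefined if either vector is zero."""
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Placeholder rating vectors (invented for illustration only).
alice = np.array([5.0, 3.0, 0.0, 4.0, 4.0])
bob   = np.array([4.0, 0.0, 0.0, 5.0, 4.0])
carol = np.array([1.0, 1.0, 5.0, 0.0, 2.0])

print(cosine_similarity(alice, bob))    # similarity Alice–Bob
print(cosine_similarity(alice, carol))  # similarity Alice–Carol
```

Note that cosine similarity ignores the magnitude of the rating vectors and compares only their directions, which is why it is popular for comparing users with different rating scales.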
Least squares fitting: Given data points (xᵢ, yᵢ), find the best-fit line y = c + d x.
Hint: this is equivalent to projecting the vector y = (y₁, …, yₘ) onto the column space of A, where A has the columns (1, …, 1) and (x₁, …, xₘ):
- Write down the normal equations AᵀA [c, d]ᵀ = Aᵀy
- Solve to get c and d
- Compute the residual y − A [c, d]ᵀ
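The normal-equations route can be sketched as follows; the data points are placeholders for whatever the exercise supplies:

```python
import numpy as np

# Placeholder data points (x_i, y_i).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 2.4, 2.9, 4.1])

# Design matrix for the line y = c + d*x: columns are [1, x].
A = np.column_stack([np.ones_like(x), x])

# Normal equations: AᵀA [c, d]ᵀ = Aᵀy
coeffs = np.linalg.solve(A.T @ A, A.T @ y)
c, d = coeffs

residual = y - A @ coeffs
# The least-squares residual is orthogonal to the column space of A.
print(np.allclose(A.T @ residual, 0.0))
```

The orthogonality check at the end is exactly the projection picture from this chapter: A @ coeffs is the projection of y onto the column space of A.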
Quantum state superposition: Consider a two-level system (spin-1/2) with basis states |0⟩ and |1⟩, and a superposition |ψ⟩ = α|0⟩ + β|1⟩.
- Verify the normalization ⟨ψ|ψ⟩ = |α|² + |β|² = 1
- Compute the projection of |ψ⟩ onto |0⟩ (the probability amplitude ⟨0|ψ⟩ of measuring |0⟩)
- What is the probability of measuring |0⟩?
- Verify that |0⟩ and |1⟩ are orthogonal
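These checks reduce to complex inner products. A minimal sketch with NumPy, using illustrative amplitudes α = 1/√2 and β = i/√2 (not values from the exercise):

```python
import numpy as np

# Basis states |0⟩ and |1⟩ as column vectors.
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

# An example superposition; alpha and beta are illustrative choices.
alpha, beta = 1 / np.sqrt(2), 1j / np.sqrt(2)
psi = alpha * ket0 + beta * ket1

# Normalization: ⟨ψ|ψ⟩ = |α|² + |β|² (np.vdot conjugates its first argument).
print(np.isclose(np.vdot(psi, psi).real, 1.0))  # True

# Probability of measuring |0⟩ is |⟨0|ψ⟩|².
p0 = abs(np.vdot(ket0, psi)) ** 2
print(np.isclose(p0, 0.5))  # True for these amplitudes
```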
Deep Thinking Problems
Specialness of the zero vector:
- The zero vector's inner product with any vector is 0. Does this mean the zero vector is orthogonal to all vectors?
- What is the direction of the zero vector? Why do we say it has "no direction"?
- In the projection formula proj_a(b) = (a · b / a · a) a, why must we exclude a = 0?
Different definitions of inner product: In ℝ², define a "weighted inner product" ⟨u, v⟩_w = w₁u₁v₁ + w₂u₂v₂ with weights w₁, w₂ > 0.
- Verify that this satisfies the three inner product axioms (symmetry, linearity, positive definiteness)
- Under this inner product, are the standard basis vectors e₁ and e₂ still orthogonal?
- Compute the length of a vector under this inner product
- Draw the "unit circle" (the set of all vectors of length 1)
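A short sketch of this weighted inner product; the weights (2, 3) are an assumed example, not the exercise's values:

```python
import numpy as np

# Assumed example weights w1, w2 > 0.
w = np.array([2.0, 3.0])

def weighted_inner(u, v, w=w):
    """⟨u, v⟩_w = w1*u1*v1 + w2*u2*v2"""
    return np.sum(w * u * v)

def weighted_norm(u, w=w):
    return np.sqrt(weighted_inner(u, u, w))

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(weighted_inner(e1, e2))  # 0.0: the standard basis stays orthogonal
print(weighted_norm(e1))       # √w1: e1 no longer has length 1
```

Because lengths change, the "unit circle" under this inner product is an ellipse in the usual Euclidean picture.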
Functions as vectors in depth: In C[0, 2π] (the space of continuous functions), the inner product is defined as ⟨f, g⟩ = ∫ f(x) g(x) dx over [0, 2π].
- Verify that sin x and cos x are orthogonal
- Compute the "length" ‖sin‖ of sin x
- Decompose a function into its constant part and its "centered" part (the part orthogonal to the constant function)
- What is the relationship to "demeaning" in data analysis?
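The function inner product can be approximated numerically with a Riemann sum; this sketch assumes the interval [0, 2π] and uses sin and cos as the test functions:

```python
import numpy as np

# Discretize [0, 2π] and approximate ∫ f g dx by a Riemann sum.
x = np.linspace(0.0, 2 * np.pi, 100001)
dx = x[1] - x[0]

def inner(f, g):
    """Approximate ⟨f, g⟩ = ∫ f(x) g(x) dx on the grid above."""
    return np.sum(f(x) * g(x)) * dx

print(abs(inner(np.sin, np.cos)) < 1e-6)                     # ⟨sin, cos⟩ ≈ 0
print(np.isclose(inner(np.sin, np.sin), np.pi, atol=1e-3))   # ‖sin‖² = π
```

The same inner product, applied to a function minus its mean, is the function-space analogue of demeaning a data vector.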
Programming Practice Problems
Visualization: Project multiple image vectors into 2D space
Gram-Schmidt orthogonalization: Implement the Gram-Schmidt algorithm, which takes a set of linearly independent vectors as input and outputs orthogonal (or orthonormal) vectors:
```python
def gram_schmidt(vectors):
    """
    Input: list of linearly independent vectors
    Output: list of orthonormal vectors
    """
    # Your code here
```
Test: feed in a set of linearly independent vectors and verify that the output is an orthonormal basis.
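If you get stuck, here is one possible reference implementation (try the exercise yourself first); the test vectors are an illustrative choice:

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: returns a list of orthonormal vectors."""
    basis = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for q in basis:
            w = w - (q @ w) * q  # subtract the projection onto each earlier q
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            raise ValueError("vectors are linearly dependent")
        basis.append(w / norm)
    return basis

Q = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
print(np.allclose(np.array(Q) @ np.array(Q).T, np.eye(3)))  # True: orthonormal
```

For better numerical stability on nearly dependent inputs, the modified Gram-Schmidt variant (projecting against each q as soon as it is built, as done here vector by vector) is preferred over accumulating all projections from the original v.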
Simplified PageRank: Given web page link relationships (an adjacency matrix), compute PageRank:
- Construct the transition matrix
- Use power iteration to find the principal eigenvector
- Visualization: node size represents PageRank value
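A minimal sketch of the first two steps (the visualization is left out). The 4-page link graph and the damping factor 0.85 are assumed example choices:

```python
import numpy as np

# Tiny example link graph: A[i, j] = 1 if page j links to page i.
A = np.array([[0, 0, 1, 0],
              [1, 0, 0, 0],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)

# Column-stochastic transition matrix (every page here has outlinks).
M = A / A.sum(axis=0)

# Google matrix with damping factor d = 0.85.
n = A.shape[0]
d = 0.85
G = d * M + (1 - d) / n * np.ones((n, n))

# Power iteration: repeatedly apply G to a uniform start vector.
r = np.full(n, 1.0 / n)
for _ in range(100):
    r = G @ r

print(np.isclose(r.sum(), 1.0))  # True: ranks form a probability distribution
```

The damping term guarantees convergence of the power iteration; pages with no outlinks (dangling nodes) would need extra handling before dividing by the column sums.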
Principal Component Analysis (PCA) preview:
- Generate 2D data (a noisy line)
- Compute the covariance matrix
- Find the principal direction (the direction of maximum variance; hint: this is the eigenvector of the covariance matrix with the largest eigenvalue)
- Visualize the original data and the principal direction
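The non-plotting steps can be sketched as follows; the synthetic line y ≈ 2x and the noise level are assumed example choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy points along the line y ≈ 2x.
x = rng.uniform(-1, 1, 200)
y = 2 * x + rng.normal(0, 0.2, 200)
X = np.column_stack([x, y])

# Center the data, then form the sample covariance matrix.
Xc = X - X.mean(axis=0)
C = Xc.T @ Xc / (len(X) - 1)

# Principal direction = eigenvector with the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(C)
principal = eigvecs[:, np.argmax(eigvals)]
print(principal)  # roughly proportional to (1, 2), up to sign
```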
References
Textbooks
Strang, G. (2019). Linear Algebra and Learning from Data.
Wellesley-Cambridge Press. — MIT Linear Algebra course textbook
Boyd, S., & Vandenberghe, L. (2018). Introduction to Applied
Linear Algebra. Cambridge University Press. — Application-oriented
introductory book
Axler, S. (2015). Linear Algebra Done Right. Springer. —
Theory-oriented classic textbook
Videos
Sanderson, G. (2016). Essence of Linear Algebra.
3Blue1Brown YouTube Series. — Best visualized linear algebra series
Strang, G. MIT 18.06 Linear Algebra. MIT OpenCourseWare. —
Professor Gilbert Strang's classic course
Extended Reading
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep
Learning. MIT Press. Chapter 2. — Linear algebra from deep learning
perspective
Crowe, M. J. (1967). A History of Vector Analysis.
University of Notre Dame Press. — Historical evolution of vector
concept
This is Chapter 1 of the "Essence and Applications of Linear Algebra" series, consisting of 18 chapters.
Author: Chen K. | Last Updated: 2024-01-05
For questions or suggestions, feel free to discuss in the comments!
Post title: Essence of Linear Algebra (1): The Essence of Vectors - More Than Just Arrows
Post author: Chen Kai
Create time: 2019-01-05 09:30:00
Post link: https://www.chenk.top/chapter-01-the-essence-of-vectors/
Copyright Notice: All articles in this blog are licensed under BY-NC-SA unless stated otherwise.