Essence of Linear Algebra (3): Matrices as Linear Transformations
Chen Kai

In the previous two chapters, we established the concepts of vectors and vector spaces. If vectors are the "residents" of space, then matrices are the "magic" that changes this space. Today we'll reveal the true identity of matrices: a matrix is not a table of numbers arranged in rows and columns — it's a way of transforming space. This shift in perspective will fundamentally change your understanding of linear algebra.

From Number Tables to Space Transformations: A Cognitive Revolution

Open any traditional linear algebra textbook, and matrices are usually introduced like this:

"This is a $2\times 2$ matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, consisting of 4 elements. Matrices can undergo addition, scalar multiplication, and multiplication..."

This introduction, while correct, is like telling you that a car has four wheels and a steering wheel without telling you that cars are for transportation. You might learn the rules for matrix multiplication without understanding why it's defined this way, why matrix multiplication isn't commutative, or why an $m\times n$ matrix can only multiply with an $n\times p$ matrix.

Now, let me tell you what matrices really mean.

A Matrix Is a Function

A matrix $A$ is essentially a function (or mapping, or transformation). It takes a vector as input and outputs another vector: $\vec{v} \mapsto A\vec{v}$. You can think of a matrix as a "vector processing machine": put in a raw vector, and out comes a processed vector.

Life Analogy: The Zoom Function on a Copier

Imagine a copier with a "zoom" dial. When you set it to 150%, the copy is 1.5 times larger than the original; at 50%, the copy shrinks by half. This zoom function is a type of "transformation": it turns the original image into a new image.

What matrices do is similar, but much richer: they can not only scale but also rotate, shear, reflect, and even "flatten" a 3D object into a 2D image (projection).

Not All Transformations Are "Linear Transformations"

Matrices don't represent arbitrary transformations — they represent a special class called linear transformations.

A transformation $L$ is linear if and only if it satisfies two conditions:

Condition 1: Additivity

$L(\vec{u} + \vec{v}) = L(\vec{u}) + L(\vec{v})$

Condition 2: Homogeneity

$L(c\vec{v}) = cL(\vec{v})$

These two conditions can be combined into a more concise form:

$L(a\vec{u} + b\vec{v}) = aL(\vec{u}) + bL(\vec{v})$
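These two conditions are easy to spot-check numerically. Here's a minimal NumPy sketch (the matrix and vectors are arbitrary illustrative values) verifying additivity and homogeneity for a matrix transformation:

```python
import numpy as np

# Any matrix transformation v -> A v satisfies both linearity conditions.
# We spot-check them for an arbitrary sample matrix and vectors.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
u = np.array([1.0, -2.0])
v = np.array([4.0, 0.5])
c = 2.5

additivity = np.allclose(A @ (u + v), A @ u + A @ v)
homogeneity = np.allclose(A @ (c * v), c * (A @ v))
print(additivity, homogeneity)  # True True
```

Of course, two numerical checks don't constitute a proof; they just illustrate what the conditions say.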

Geometric Understanding

Linear transformations have three notable geometric characteristics:

  1. The origin stays fixed: $L(\vec{0}) = \vec{0}$. No matter how you transform, the origin is always at the origin.
  2. Lines remain lines: A line before transformation remains a line after transformation (it doesn't bend).
  3. Parallel lines stay parallel: Two parallel lines remain parallel after transformation; the spacing may change, but the ratio is preserved.

Life Analogy: Stretching a Rubber Sheet

Imagine a rubber sheet with a grid drawn on it, pinned at the origin. You can stretch it, rotate it, or tilt it, but you can't tear or fold it. Such transformations are linear: the grid lines remain straight, parallel lines remain parallel, and the origin stays in place.

Counterexamples: What Are NOT Linear Transformations?

  • Translation: $T(\vec{v}) = \vec{v} + \vec{t}$ with $\vec{t} \neq \vec{0}$. Translation moves the origin, so it's not a linear transformation.
  • Bending: Transformations that turn straight lines into curves are not linear.
  • Projection onto curved surfaces: For example, projecting a plane onto a sphere.

Matrix Columns: The "Destination" of Basis Vectors

Now for the most crucial insight: the columns of a matrix tell us where the basis vectors go.

The Fate of Standard Basis Vectors Determines Everything

In 2D space, the standard basis vectors are $\hat{i} = \begin{pmatrix}1\\0\end{pmatrix}$ and $\hat{j} = \begin{pmatrix}0\\1\end{pmatrix}$. Any vector $\vec{v} = \begin{pmatrix}x\\y\end{pmatrix}$ can be written as $\vec{v} = x\hat{i} + y\hat{j}$. Now, if we know that linear transformation $L$ transforms $\hat{i}$ into $L(\hat{i})$ and $\hat{j}$ into $L(\hat{j})$, what does $\vec{v}$ become?

Using the properties of linear transformations:

$L(\vec{v}) = L(x\hat{i} + y\hat{j}) = xL(\hat{i}) + yL(\hat{j})$

Amazing! Just by knowing where the basis vectors go, we can calculate where any vector goes!

Matrix Columns Are the Transformed Basis Vectors

Suppose $L(\hat{i}) = \begin{pmatrix}a\\c\end{pmatrix}$ and $L(\hat{j}) = \begin{pmatrix}b\\d\end{pmatrix}$. Then the matrix corresponding to transformation $L$ is:

$A = \begin{pmatrix}a & b\\ c & d\end{pmatrix}$

The first column of the matrix is where $\hat{i}$ lands, and the second column is where $\hat{j}$ lands.

Example Calculation

Suppose after transformation:

  • $\hat{i}$ becomes $\begin{pmatrix}1\\1\end{pmatrix}$
  • $\hat{j}$ becomes $\begin{pmatrix}-1\\2\end{pmatrix}$

The matrix is $A = \begin{pmatrix}1 & -1\\ 1 & 2\end{pmatrix}$. Now compute the image of $\vec{v} = \begin{pmatrix}2\\3\end{pmatrix}$:

$A\vec{v} = 2\begin{pmatrix}1\\1\end{pmatrix} + 3\begin{pmatrix}-1\\2\end{pmatrix} = \begin{pmatrix}-1\\8\end{pmatrix}$

This is the embodiment of linear combinations: new vector = 2 times (new $\hat{i}$) + 3 times (new $\hat{j}$).

Q: Why Columns Instead of Rows?

This is a matter of history and convention. The reason we store transformed basis vectors in columns is to make matrix-vector multiplication naturally represent "linear combinations." If vectors are written as column vectors, then $A\vec{v}$ results in a weighted sum of $A$'s columns according to $\vec{v}$'s components. This convention gives matrix multiplication a clear geometric meaning.
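A quick NumPy sketch makes this concrete (using an arbitrary illustrative matrix): multiplying a matrix by a vector gives exactly the weighted sum of the matrix's columns.

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [1.0,  2.0]])
v = np.array([2.0, 3.0])

# A @ v is the weighted sum of A's columns,
# with weights given by the components of v
combo = v[0] * A[:, 0] + v[1] * A[:, 1]
print(A @ v, combo)  # the two results agree
```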

Matrix Representations of Common Linear Transformations

Now let's look at several of the most common linear transformations and their corresponding matrices.

Rotation

Problem: What is the matrix for counterclockwise rotation by angle $\theta$?

Derivation: Track where the basis vectors go.

$\hat{i} = \begin{pmatrix}1\\0\end{pmatrix}$ corresponds to angle $0$ on the unit circle. After rotating by $\theta$, the angle becomes $\theta$, so:

$\hat{i} \to \begin{pmatrix}\cos\theta\\ \sin\theta\end{pmatrix}$

$\hat{j} = \begin{pmatrix}0\\1\end{pmatrix}$ corresponds to angle $90°$ (i.e., $\pi/2$) on the unit circle. After rotating by $\theta$, the angle becomes $90° + \theta$, so:

$\hat{j} \to \begin{pmatrix}-\sin\theta\\ \cos\theta\end{pmatrix}$

Therefore, the rotation matrix is:

$R_\theta = \begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix}$

Special Cases:

  • Rotate $90°$: $\begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}$
  • Rotate $180°$: $\begin{pmatrix}-1 & 0\\ 0 & -1\end{pmatrix}$
  • Rotate $270°$ (or $-90°$): $\begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix}$

Life Case: Game Character Turning

In 2D games, when a player presses a direction key to turn the character, the program needs to rotate the character's facing vector. If the character was initially facing $\begin{pmatrix}1\\0\end{pmatrix}$ (right), and the player presses "up" to rotate counterclockwise by $90°$, the new facing is:

$\begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}\begin{pmatrix}1\\0\end{pmatrix} = \begin{pmatrix}0\\1\end{pmatrix}$

The character now faces upward.
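This turn can be sketched in a few lines of NumPy (a minimal illustration, not production game code):

```python
import numpy as np

def rotation_matrix(theta):
    """Counterclockwise rotation by theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

facing = np.array([1.0, 0.0])                 # character faces right
new_facing = rotation_matrix(np.pi / 2) @ facing
print(new_facing.round(6))                    # [0. 1.] -> now faces up
```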

Scaling

Problem: What is the matrix for scaling by $s_x$ along the $x$-axis and $s_y$ along the $y$-axis?

Derivation:

  • $\hat{i}$ becomes $\begin{pmatrix}s_x\\0\end{pmatrix}$
  • $\hat{j}$ becomes $\begin{pmatrix}0\\s_y\end{pmatrix}$

Scaling matrix:

$S = \begin{pmatrix}s_x & 0\\ 0 & s_y\end{pmatrix}$

Special Cases:

  • Uniform scaling by $k$: $\begin{pmatrix}k & 0\\ 0 & k\end{pmatrix}$
  • Stretch by 2 along the $x$-axis: $\begin{pmatrix}2 & 0\\ 0 & 1\end{pmatrix}$
  • Compress by half along the $y$-axis: $\begin{pmatrix}1 & 0\\ 0 & 1/2\end{pmatrix}$

Life Case: Image Resizing

When you resize an image in photo editing software, the software applies a scaling transformation to each pixel's coordinates. If you shrink an image to half its original size, the scaling factors are $s_x = s_y = 0.5$, and each pixel coordinate $(x, y)$ becomes $(0.5x, 0.5y)$.

Shear

Shear is a "tilting" transformation that keeps one direction fixed while "pushing" along another direction.

Horizontal shear (along the $x$ direction):

$H = \begin{pmatrix}1 & k\\ 0 & 1\end{pmatrix}$

This transformation keeps the $y$-coordinate unchanged but increases the $x$-coordinate by $ky$.

Vertical shear (along the $y$ direction):

$\begin{pmatrix}1 & 0\\ k & 1\end{pmatrix}$

Life Case: Italic Text

When word processing software turns regular text into italics, it uses a shear transformation. The bottom of the letter stays fixed while the top tilts to the right. If the italic angle is $\theta$, the shear coefficient is $k = \tan\theta$.
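A minimal NumPy sketch of this idea (the $15°$ slant is an arbitrary illustrative value):

```python
import numpy as np

theta = np.radians(15)         # a 15-degree italic slant (illustrative)
k = np.tan(theta)
H = np.array([[1, k],
              [0, 1]])         # horizontal shear matrix

base = np.array([0.0, 0.0])    # bottom of the letter stays put
top  = np.array([0.0, 1.0])    # top of the letter tilts right
print(H @ base, H @ top)       # base unchanged, top shifted right by k
```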

Case: Grass in the Wind

Imagine a meadow where grass originally grows vertically upward. When wind blows from the right, the grass tilts to the left, but the roots (ground level) don't move. This is the effect of shear: the higher a point is from the ground, the greater its horizontal displacement.

Reflection

Reflection about the $x$-axis:

  • $\hat{i}$ stays unchanged
  • $\hat{j}$ becomes $\begin{pmatrix}0\\-1\end{pmatrix}$

$\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}$

Reflection about the $y$-axis: $\begin{pmatrix}-1 & 0\\ 0 & 1\end{pmatrix}$

Reflection about the origin (equivalent to $180°$ rotation): $\begin{pmatrix}-1 & 0\\ 0 & -1\end{pmatrix}$

Reflection about the line $y = x$: $\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}$. This matrix swaps the $x$ and $y$ coordinates: point $(a, b)$ becomes $(b, a)$.

Reflection about an arbitrary line through the origin at angle $\theta$:

Through derivation (rotate to align the line with the $x$-axis, reflect about the $x$-axis, rotate back):

$\begin{pmatrix}\cos 2\theta & \sin 2\theta\\ \sin 2\theta & -\cos 2\theta\end{pmatrix}$
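This formula is easy to sanity-check in NumPy (a small sketch; the helper name `reflection_matrix` is chosen here for illustration):

```python
import numpy as np

def reflection_matrix(theta):
    """Reflection about the line through the origin at angle theta."""
    c2, s2 = np.cos(2 * theta), np.sin(2 * theta)
    return np.array([[c2,  s2],
                     [s2, -c2]])

# theta = 0: the line is the x-axis, so (x, y) -> (x, -y)
print(reflection_matrix(0).round(6))
# theta = 45 degrees: the line is y = x, so coordinates swap
print(reflection_matrix(np.pi / 4).round(6))
```

Both special cases reproduce the matrices derived earlier in this section.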

Life Case: Yourself in a Mirror

When you look in a mirror, the image you see is a reflection of the original object about the mirror surface. If the mirror is vertical (along the $y$-axis), your left hand becomes your right hand in the image ($x$-coordinate negated), but height is unchanged ($y$-coordinate unchanged).

Projection

Projection onto the $x$-axis:

$P_x = \begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix}$

This "flattens" all vectors onto the $x$-axis: $(x, y) \to (x, 0)$.

Projection onto the $y$-axis: $\begin{pmatrix}0 & 0\\ 0 & 1\end{pmatrix}$

Projection onto the line $y = x$: $\frac{1}{2}\begin{pmatrix}1 & 1\\ 1 & 1\end{pmatrix}$
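A quick NumPy check of the projection matrix (the test vector is arbitrary):

```python
import numpy as np

P = 0.5 * np.array([[1.0, 1.0],
                    [1.0, 1.0]])   # projection onto the line y = x

v = np.array([3.0, 1.0])
print(P @ v)                  # [2. 2.] -- the closest point to v on y = x

# Projecting twice changes nothing: P is idempotent, P @ P = P
print(np.allclose(P @ P, P))  # True
```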

Life Case: Shadows

At noon when the sun is directly overhead, your shadow is your projection onto the ground (the $xy$-plane). If we simplify a person as a set of points in 3D space, the shadow is the figure obtained by setting each point's $z$-coordinate to 0.

Summary Table of Transformation Matrices

| Transformation Type | Matrix | Effect |
| --- | --- | --- |
| Rotation by $\theta$ | $\begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix}$ | Counterclockwise rotation by $\theta$ |
| Scaling | $\begin{pmatrix}s_x & 0\\ 0 & s_y\end{pmatrix}$ | Scale along coordinate axes |
| Horizontal shear | $\begin{pmatrix}1 & k\\ 0 & 1\end{pmatrix}$ | Tilt in horizontal direction |
| Reflection about $x$-axis | $\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}$ | Flip vertically |
| Reflection about $y$-axis | $\begin{pmatrix}-1 & 0\\ 0 & 1\end{pmatrix}$ | Flip horizontally |
| Projection onto $x$-axis | $\begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix}$ | Flatten onto $x$-axis |

Matrix Multiplication: Composition of Transformations

The Problem of Sequential Transformations

Suppose we want to apply transformation $A$ to a vector $\vec{v}$, then apply transformation $B$. What's the result?

First apply $A$: $\vec{v} \to A\vec{v}$. Then apply $B$: $A\vec{v} \to B(A\vec{v})$. Using the associativity of matrix multiplication: $B(A\vec{v}) = (BA)\vec{v}$

Conclusion: The composite transformation of $A$ followed by $B$ corresponds to matrix $BA$ (note the order!).

Why $BA$ Instead of $AB$?

This is because we write vectors as column vectors, and matrices multiply vectors from the left. $A\vec{v}$ means $A$ acts first; $B(A\vec{v})$ means $B$ acts next. Reading from inside to outside: first $A$, then $B$; the composite matrix is $BA$.

Memory aid: Matrix multiplication reads right to left. $CBA$ means "first $A$, then $B$, finally $C$."
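A small NumPy sketch confirms that applying $A$ then $B$ step by step matches multiplying by $BA$ once (the matrices are chosen for illustration):

```python
import numpy as np

A = np.array([[0.0, -1.0], [1.0, 0.0]])   # rotate 90 degrees
B = np.array([[2.0,  0.0], [0.0, 1.0]])   # stretch x by 2
v = np.array([1.0, 1.0])

# Applying A then B, step by step, matches the single matrix B @ A
step_by_step = B @ (A @ v)
combined     = (B @ A) @ v
print(step_by_step, combined)   # same vector both ways
```

Swapping the order, $(AB)\vec{v}$, generally gives a different vector, which previews the non-commutativity discussed below.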

Geometric Meaning of Matrix Multiplication

What are the columns of matrix $BA$?

First column of $BA$ = $B$ acting on first column of $A$ = $B$ acting on "$A$'s transformed $\hat{i}$"

Second column of $BA$ = $B$ acting on second column of $A$ = $B$ acting on "$A$'s transformed $\hat{j}$"

In other words: The columns of the composite transformation $BA$ are the basis vectors transformed by $A$ and then further transformed by $B$.

Example: Rotate Then Scale

Let the matrix for $45°$ rotation be:

$R = \frac{1}{\sqrt{2}}\begin{pmatrix}1 & -1\\ 1 & 1\end{pmatrix}$

And the matrix for stretching by 2 along the $x$-axis be:

$S = \begin{pmatrix}2 & 0\\ 0 & 1\end{pmatrix}$

Rotate first, then scale ($SR$):

$SR = \frac{1}{\sqrt{2}}\begin{pmatrix}2 & -2\\ 1 & 1\end{pmatrix}$

Scale first, then rotate ($RS$):

$RS = \frac{1}{\sqrt{2}}\begin{pmatrix}2 & -1\\ 2 & 1\end{pmatrix}$

Matrix multiplication is not commutative.

Geometric Explanation:

  • Rotate first, then scale: A square rotates $45°$ to a diamond orientation, then stretches along the $x$-axis.
  • Scale first, then rotate: A square stretches along the $x$-axis into a rectangle, then the entire rectangle rotates $45°$.

The final shapes are different!
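This non-commutativity is one line of NumPy away (using a $45°$ rotation and a stretch-by-2 along the $x$-axis as the example):

```python
import numpy as np

theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotate 45 degrees
S = np.array([[2.0, 0.0],
              [0.0, 1.0]])                        # stretch x by 2

print(np.allclose(S @ R, R @ S))   # False: the order matters
```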

Associativity of Matrix Multiplication

Although matrix multiplication isn't commutative, it does satisfy associativity:

$(AB)C = A(BC)$

Geometric Explanation: No matter how you group them, the final sequence of transformations is the same. Both $(AB)C$ and $A(BC)$ represent "first $C$, then $B$, finally $A$."

Formal Proof:

For any vector $\vec{v}$:

$((AB)C)\vec{v} = (AB)(C\vec{v}) = A(B(C\vec{v})) = A((BC)\vec{v}) = (A(BC))\vec{v}$

Since this holds for all $\vec{v}$, $(AB)C = A(BC)$.

Practical Significance of Associativity:

When you need to apply the same series of transformations to many vectors (like a million pixels), you can first multiply all the transformation matrices together to get a single total matrix, then use this total matrix to transform all vectors at once. This is much faster than applying transformations one by one.

For example: In 3D games, an object might need:

1. Scaling ($S$)
2. Rotation ($R$)
3. Translation ($T$, handled with homogeneous coordinates)

Instead of applying three transformations to each vertex, compute $M = TRS$ (note the order) first, then do just one multiplication per vertex.

Image Transformation Practical Cases

Let's actually manipulate image transformations with Python.

Case 1: Rotating an Image

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

def rotate_image(image, angle_degrees):
    """Rotate image (counterclockwise)"""
    angle = np.radians(angle_degrees)

    # Rotation matrix
    R = np.array([
        [np.cos(angle), -np.sin(angle)],
        [np.sin(angle), np.cos(angle)]
    ])
    R_inv = np.linalg.inv(R)  # invert once, outside the pixel loop

    h, w = image.shape[:2]
    center = np.array([w / 2, h / 2])

    # Create output image
    output = np.zeros_like(image)

    for y in range(h):
        for x in range(w):
            # Coordinates relative to center
            pos = np.array([x, y]) - center
            # Inverse transform to find source coordinates
            src_pos = R_inv @ pos + center
            src_x, src_y = int(src_pos[0]), int(src_pos[1])

            if 0 <= src_x < w and 0 <= src_y < h:
                output[y, x] = image[src_y, src_x]

    return output

# Usage example
# img = np.array(Image.open('photo.jpg'))
# rotated = rotate_image(img, 45)
# plt.imshow(rotated)

Key point: In image transformations, we typically use "inverse mapping": for each pixel in the output image, we calculate its corresponding position in the original image and get that position's color. This avoids holes that can occur with forward mapping.

Case 2: Real-time 2D Game Transformations

import numpy as np

class Transform2D:
    """2D game object transformation class"""

    def __init__(self):
        self.position = np.array([0.0, 0.0])
        self.rotation = 0.0  # Radians
        self.scale = np.array([1.0, 1.0])

    def get_matrix(self):
        """Get combined transformation matrix (3x3 homogeneous coordinates)"""
        # Scaling matrix
        S = np.array([
            [self.scale[0], 0, 0],
            [0, self.scale[1], 0],
            [0, 0, 1]
        ])

        # Rotation matrix
        c, s = np.cos(self.rotation), np.sin(self.rotation)
        R = np.array([
            [c, -s, 0],
            [s, c, 0],
            [0, 0, 1]
        ])

        # Translation matrix
        T = np.array([
            [1, 0, self.position[0]],
            [0, 1, self.position[1]],
            [0, 0, 1]
        ])

        # Combine: scale first, then rotate, finally translate
        return T @ R @ S

    def transform_point(self, point):
        """Transform a point"""
        p = np.array([point[0], point[1], 1])
        result = self.get_matrix() @ p
        return result[:2]

    def transform_points(self, points):
        """Batch transform multiple points"""
        # Convert to homogeneous coordinates
        n = len(points)
        homogeneous = np.ones((3, n))
        homogeneous[:2, :] = np.array(points).T

        # Transform all points at once
        result = self.get_matrix() @ homogeneous
        return result[:2, :].T

# Usage example
transform = Transform2D()
transform.position = np.array([100, 50])
transform.rotation = np.pi / 4  # 45 degrees
transform.scale = np.array([2.0, 1.5])

# Transform the four vertices of a square
square = [[-1, -1], [1, -1], [1, 1], [-1, 1]]
transformed_square = transform.transform_points(square)
print(transformed_square)

Case 3: Image Shear Effect

import numpy as np

def shear_image(image, shear_x=0, shear_y=0):
    """Apply shear transformation to image"""
    h, w = image.shape[:2]

    # Shear matrix
    shear_matrix = np.array([
        [1, shear_x],
        [shear_y, 1]
    ])

    # Calculate output dimensions from the sheared corners
    corners = np.array([[0, 0], [w, 0], [w, h], [0, h]]).T
    new_corners = shear_matrix @ corners

    min_x, max_x = new_corners[0].min(), new_corners[0].max()
    min_y, max_y = new_corners[1].min(), new_corners[1].max()

    new_w = int(max_x - min_x)
    new_h = int(max_y - min_y)

    output = np.zeros((new_h, new_w, *image.shape[2:]), dtype=image.dtype)

    inv_shear = np.linalg.inv(shear_matrix)

    for y in range(new_h):
        for x in range(new_w):
            # Inverse mapping: find the source pixel for each output pixel
            src = inv_shear @ np.array([x + min_x, y + min_y])
            src_x, src_y = int(src[0]), int(src[1])

            if 0 <= src_x < w and 0 <= src_y < h:
                output[y, x] = image[src_y, src_x]

    return output

Transformations in 3D Space

Everything we've discussed has been about 2D transformations, but the same ideas extend to 3D (and higher dimensions).

3D Rotation Matrices

Rotation about the $x$-axis by $\theta$:

$R_x = \begin{pmatrix}1 & 0 & 0\\ 0 & \cos\theta & -\sin\theta\\ 0 & \sin\theta & \cos\theta\end{pmatrix}$

Rotation about the $y$-axis by $\theta$:

$R_y = \begin{pmatrix}\cos\theta & 0 & \sin\theta\\ 0 & 1 & 0\\ -\sin\theta & 0 & \cos\theta\end{pmatrix}$

Rotation about the $z$-axis by $\theta$:

$R_z = \begin{pmatrix}\cos\theta & -\sin\theta & 0\\ \sin\theta & \cos\theta & 0\\ 0 & 0 & 1\end{pmatrix}$

3D Scaling Matrix

$\begin{pmatrix}s_x & 0 & 0\\ 0 & s_y & 0\\ 0 & 0 & s_z\end{pmatrix}$

Projection onto a Plane

Orthographic projection onto the $xy$-plane (discards the $z$-coordinate):

$\begin{pmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 0\end{pmatrix}$

Perspective projection (used for 3D graphics rendering) is more complex, involving homogeneous coordinates and nonlinear transformations, which we won't detail here.

Inverse Matrices: Undoing Transformations

What Is an Inverse Matrix?

If matrix $A$ represents a transformation, then the inverse matrix $A^{-1}$ represents "undoing" that transformation:

$A^{-1}A = AA^{-1} = I$

Where $I$ is the identity matrix, representing the "do nothing" transformation.

Examples:

  • The inverse of rotation by $\theta$ is rotation by $-\theta$: $R_\theta^{-1} = R_{-\theta}$
  • The inverse of scaling by $k$ is scaling by $1/k$ (provided $k \neq 0$)
  • The inverse of a reflection is itself: reflecting twice returns to the original
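Both the rotation and reflection facts can be verified numerically (a minimal sketch; the angle $0.7$ is an arbitrary illustrative value):

```python
import numpy as np

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Inverse of rotating by theta is rotating by -theta
R_inv = np.array([[np.cos(-theta), -np.sin(-theta)],
                  [np.sin(-theta),  np.cos(-theta)]])
print(np.allclose(np.linalg.inv(R), R_inv))   # True

# A reflection undoes itself: F @ F = I
F = np.array([[1.0, 0.0], [0.0, -1.0]])       # reflect about the x-axis
print(np.allclose(F @ F, np.eye(2)))          # True
```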

When Does a Matrix Have an Inverse?

Not all transformations can be undone.

Example: The projection matrix $\begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix}$ onto the $x$-axis has no inverse. Because projection flattens all points onto a line, information is lost. Any two points with the same $x$-coordinate, say $(x, y_1)$ and $(x, y_2)$, both become $(x, 0)$, and we can't distinguish which they originally were.

Invertibility Condition:

Transformation is invertible $\Leftrightarrow$ the transformation doesn't "reduce dimension" $\Leftrightarrow$ determinant $\det A \neq 0$. We'll discuss determinants in detail in the next chapter.

Formula for the Inverse of a $2\times 2$ Matrix

For $A = \begin{pmatrix}a & b\\ c & d\end{pmatrix}$, if $\det A = ad - bc \neq 0$, then:

$A^{-1} = \frac{1}{ad - bc}\begin{pmatrix}d & -b\\ -c & a\end{pmatrix}$

Where $ad - bc$ is the determinant of matrix $A$.
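The formula translates directly into code. A minimal sketch (the helper name `inverse_2x2` and the sample matrix are illustrative), checked against `np.linalg.inv`:

```python
import numpy as np

def inverse_2x2(A):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return (1 / det) * np.array([[ d, -b],
                                 [-c,  a]])

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
print(inverse_2x2(A))                               # matches np.linalg.inv(A)
print(np.allclose(inverse_2x2(A) @ A, np.eye(2)))   # True
```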

Kernel and Image of Linear Transformations

Kernel: Vectors Transformed to the Origin

The kernel (or null space) of transformation $A$ is defined as:

$\ker(A) = \{\vec{v} : A\vec{v} = \vec{0}\}$

Example: The kernel of projection onto the $x$-axis is the $y$-axis (all points on the $y$-axis are projected to the origin).

Image: The Set of All Output Vectors

The image (or range) of transformation $A$ is defined as:

$\operatorname{im}(A) = \{A\vec{v} : \vec{v} \in \mathbb{R}^n\}$

Example: The image of projection onto the $x$-axis is the $x$-axis itself.

Rank-Nullity Theorem

$\dim(\ker A) + \dim(\operatorname{im} A) = n$

This tells us: if a transformation "flattens" some dimensions (the kernel contains more than just the zero vector), then the dimension of its image decreases correspondingly.
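For the projection-onto-the-$x$-axis example, a short NumPy check illustrates the theorem (the rank gives the image dimension, and the nullity is computed from it):

```python
import numpy as np

P = np.array([[1.0, 0.0],
              [0.0, 0.0]])   # projection onto the x-axis

rank = np.linalg.matrix_rank(P)      # dim of the image
nullity = P.shape[1] - rank          # dim of the kernel (rank-nullity)
print(rank, nullity, rank + nullity) # 1 1 2
```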

Frequently Asked Questions

Q1: Is Translation a Linear Transformation?

No. Translation $T(\vec{v}) = \vec{v} + \vec{t}$ moves the origin: $T(\vec{0}) = \vec{t} \neq \vec{0}$. In computer graphics, to represent translation with matrices, we introduce homogeneous coordinates, representing the 2D vector $(x, y)$ as the 3D vector $(x, y, 1)$, so translation can be expressed with a $3\times 3$ matrix.

Q2: Why Is Matrix Multiplication Defined This Way?

Matrix multiplication is defined precisely so that "product of matrices" corresponds to "composition of transformations." If $A$ represents transformation $f$ and $B$ represents transformation $g$, then $BA$ represents the composite transformation "first $f$, then $g$." The multiplication rules are derived from this goal.

Q3: Why Are Rotation Matrices So Special?

Rotation matrices preserve length and angle (they're orthogonal transformations), and their determinant is 1 (preserving orientation, no reflection). This class of matrices forms the $SO(2)$ group (the 2D special orthogonal group), with nice mathematical properties.

Q4: How Are Transformations Combined in Practice?

In game engines or graphics software, the typical order is: scale → rotate → translate. This is often called the "TRS" order (Transform = Translate × Rotate × Scale). Note that because matrices act from right to left, in matrix multiplication we write $T$ first, then $R$, finally $S$: $M = TRS$.

Exercises

Basic Problems

Problem 1: What geometric transformation does the matrix $\begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}$ represent? Draw the unit square before and after transformation.

Problem 2: Write the matrix for reflection about the $y$-axis. Verify that it transforms the point $(1, 2)$ to the correct position.

Problem 3: Calculate the square of the rotation matrix $R_{90°}$ and verify it equals $R_{180°}$.

Problem 4: What transformation does the matrix $\begin{pmatrix}1 & 1\\ 0 & 1\end{pmatrix}$ represent? Transform the four vertices of the unit square $(0,0), (1,0), (1,1), (0,1)$ and draw the result.

Problem 5: Prove that the identity transformation $I$ satisfies $I\vec{v} = \vec{v}$ for any vector $\vec{v}$.

Intermediate Problems

Problem 6: Find a $2\times 2$ matrix $M$ that first reflects about the $x$-axis, then rotates counterclockwise by $90°$.

Problem 7: Prove that the inverse of the rotation matrix $R_\theta$ is $R_{-\theta}$, i.e., $R_\theta R_{-\theta} = I$.

Problem 8: Prove that if $A$ and $B$ are both invertible matrices, then $(AB)^{-1} = B^{-1}A^{-1}$. (Hint: Verify $(AB)(B^{-1}A^{-1}) = I$.)

Problem 9: Let $A = \begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}$ (the $90°$ rotation matrix); compute $A^2$, $A^3$, and $A^4$. What pattern do you notice?

Problem 10: Prove: If matrix $A$ satisfies $A^2 = I$, then $A$ is its own inverse. Give three such matrices (other than $I$ and $-I$).

Problem 11: Let $R_\theta$ be the rotation matrix for angle $\theta$ and $S = kI$ be the uniform scaling matrix by factor $k$. Prove $R_\theta S = S R_\theta$ (these two transformations can be done in either order). Explain the geometric reason.

Proof Problems

Problem 12: Prove that matrix multiplication satisfies associativity: $(AB)C = A(BC)$. (Hint: Prove both sides are equal using component form.)

Problem 13: Prove that if $L$ is a linear transformation, then $L$ maps the origin to the origin: $L(\vec{0}) = \vec{0}$.

Problem 14: Prove that the composition of two linear transformations is still a linear transformation. That is, if $L_1$ and $L_2$ are both linear, then $L_2 \circ L_1$ is also linear.

Problem 15: Prove that the rotation matrix $R_\theta$ satisfies $R_\theta^T R_\theta = I$, where $R_\theta^T$ is the transpose. This shows rotation matrices are orthogonal matrices.

Programming Problems

Problem 16: Write a Python function that takes a $2\times 2$ matrix and outputs a visualization of its transformation effect on the unit square (use matplotlib to draw the square before and after transformation).

Problem 17: Create an animation showing the motion trajectory of the vector $\begin{pmatrix}1\\0\end{pmatrix}$ as the rotation angle varies from $0$ to $2\pi$.

Problem 18: Write an image rotation function that supports arbitrary angle rotation and uses bilinear interpolation to avoid aliasing.

Problem 19: Implement a simple 2D particle system where each particle has position, velocity, rotation angle, and size. Use matrix transformations to update and render particles.

Problem 20: Implement an interactive program that lets users adjust rotation angle, scaling factors, and shear coefficients via sliders, displaying real-time transformation effects on an image.

Thinking Problems

Problem 21: Why do computer graphics use $4\times 4$ matrices to represent 3D transformations instead of $3\times 3$ matrices?

Problem 22: In machine learning, each layer of a neural network can be viewed as a linear transformation (matrix multiplication) plus a nonlinear activation function. Why is the nonlinear activation function needed? What would happen to a multi-layer neural network without it?

Problem 23: Satellite images typically require geometric correction before use. What types of transformations does this correction involve? Why might simple linear transformations be insufficient?

Chapter Summary

This chapter revealed the true identity of matrices: matrices are representations of linear transformations.

Core Concepts:

1. A matrix $A$ is a function that transforms vector $\vec{v}$ into $A\vec{v}$.
2. The columns of $A$ record where the basis vectors land.
3. The product $BA$ is the composite transformation "first $A$, then $B$"; multiplication is associative but not commutative.
4. The inverse $A^{-1}$ undoes the transformation of $A$ and exists only when $A$ doesn't reduce dimension.

Why This Perspective Matters:

  • It explains the origin of matrix multiplication rules
  • It lets us "see" what a matrix does
  • It's the foundation for computer graphics, physics, machine learning, and more

Next Chapter Preview: "The Secrets of Determinants" — We'll see how determinants measure the effect of transformations on area/volume, and why they can determine whether a matrix is invertible.


Next Chapter: "The Secrets of Determinants"

Previous Chapter: ← "Linear Combinations and Vector Spaces"


This is Chapter 3 of the 18-chapter "Essence of Linear Algebra" series.

  • Post title: Essence of Linear Algebra (3): Matrices as Linear Transformations
  • Post author: Chen Kai
  • Create time: 2019-01-15 10:45:00
  • Post link: https://www.chenk.top/chapter-03-matrices-as-linear-transformations/
  • Copyright Notice: All articles in this blog are licensed under BY-NC-SA unless stated additionally.