1. Example of Projection Complement
Scene setting:
- Full space $\mathbb{R}^3$: the entire room.
- Subspace $S$: $xy$ plane (i.e. the floor).
- Orthogonal complement $S^\perp$: $z$ axis (i.e. vertical column).
- Vector $b$: $\begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix}$. You can think of it as an arrow pointing 3 meters to the right, 4 meters inward, and 5 meters upward.
1. Construct the matrices

A. Projection matrix $P$ (projection onto $S$, i.e. the floor). To "slap" a vector onto the floor, just keep $x$ and $y$ and set $z$ to 0. The matrix $P$ looks like this:
$$P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

B. Complementary matrix $Q = I - P$ (projection onto $S^\perp$, i.e. the $z$ axis). We subtract $P$ from the identity matrix:
$$I - P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

As you can see, the resulting matrix obviously does the opposite job: "keep only $z$ and set $x, y$ to 0".
2. Verify the decomposition process ($b = Pb + (I-P)b$)
Now we throw the arrow $b = \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix}$ into these two matrices.

Step 1: Find the shadow ($Pb$)
$$Pb = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \\ 0 \end{bmatrix}$$
Result: this is the shadow on the floor, with the height ($z$) gone. It belongs to the subspace $S$.

Step 2: Find the vertical component ($(I-P)b$)
$$(I-P)b = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 5 \end{bmatrix}$$
Result: this is pure height; the horizontal component ($x, y$) is gone. It belongs to the orthogonal complement $S^\perp$.

Step 3: Witness the miracle. We add the two parts together:
$$\text{shadow} + \text{vertical} = \begin{bmatrix} 3 \\ 4 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 5 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix} = \text{original vector } b$$
Moreover, you can verify that the two parts are orthogonal:
$$\text{shadow} \cdot \text{vertical} = (3\times0) + (4\times0) + (0\times5) = 0$$

3. Advanced: why is it called a "master key"?
The example above is very simple (it projects onto coordinate axes). You may think, "Can't I just read off the coordinates? Why do we need a matrix?" But what if the plane is not horizontal? Suppose you want to project onto a tilted plane $S$. The direct formula for $P_S$ is complicated. However, the normal of the plane (the perpendicular direction $S^\perp$) is just a line. The problem-solving idea becomes:
1. No matter how the plane tilts, first compute the projection matrix $P_{line}$ onto the simple normal line (the formula is easy: $\frac{aa^\top}{a^\top a}$).
2. Then take $I - P_{line}$.
3. In an instant you get the projection matrix $P_{plane}$ of the complicated tilted plane.
This is the real power of $I-P$ in actual calculations: **it converts a hard-to-handle "plane" into an easy-to-handle "line".**
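The "master key" trick can be checked numerically. Below is a minimal NumPy sketch; the tilted plane is my own example, chosen to have (hypothetical) normal $a = [1, 1, 1]^\top$, and the variable names are likewise my own. Only the formulas $P_{line} = \frac{aa^\top}{a^\top a}$ and $P_{plane} = I - P_{line}$ come from the text above.

```python
import numpy as np

a = np.array([[1.0], [1.0], [1.0]])   # normal of the tilted plane (column vector)

# Step 1: projection matrix onto the simple normal LINE.
P_line = (a @ a.T) / (a.T @ a)

# Step 2-3: the complement I - P_line projects onto the tilted PLANE.
P_plane = np.eye(3) - P_line

b = np.array([3.0, 4.0, 5.0])
shadow = P_plane @ b                  # component lying inside the plane
vertical = P_line @ b                 # component along the normal

print(np.allclose(shadow + vertical, b))        # decomposition recovers b -> True
print(np.isclose(shadow @ vertical, 0.0))       # the two parts are orthogonal -> True
print(np.allclose(P_plane @ P_plane, P_plane))  # idempotent, as a projection should be -> True
```

The same three checks (decomposition, orthogonality, idempotence) work for any choice of normal $a$.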
2. Why is J in (b) the projection matrix?
This is a very core linear algebra concept, especially important in statistics and data processing. Simply put, these two slides explain why the matrix $J$ (the all-ones matrix divided by $n$) is essentially an "averaging" machine, and why this averaging process geometrically projects a vector onto the line in the direction of $\mathbf{1}$.
I will explain why $J$ is a projection matrix from two perspectives: mathematical derivation and intuitive operation.
1. From the perspective of mathematical formulas (the most rigorous explanation)
We want to show that $J$ is the matrix that projects onto the vector $\mathbf{1} = [1, 1, \dots, 1]^\top$. General formula for a projection matrix: to project any vector $b$ onto the direction of some vector $a$, the projection matrix $P$ is:
$$P = \frac{aa^\top}{a^\top a}$$
Now, let's replace $a$ with the all-ones vector $\mathbf{1}$:

1. Numerator ($aa^\top$):
$$\mathbf{1}\mathbf{1}^\top = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} \begin{bmatrix} 1 & \dots & 1 \end{bmatrix} = \begin{bmatrix} 1 & \dots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \dots & 1 \end{bmatrix}$$
This is an $n \times n$ matrix filled with 1's.

2. Denominator ($a^\top a$, the squared length): the dot product of $\mathbf{1}$ with itself is:
$$\mathbf{1}^\top \mathbf{1} = 1\cdot1 + 1\cdot1 + \dots + 1\cdot1 = n$$
So the length of the vector $\mathbf{1}$ is $\sqrt{n}$.

3. Substitute into the formula:
$$P = \frac{\mathbf{1}\mathbf{1}^\top}{n} = \frac{1}{n} \begin{bmatrix} 1 & \dots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \dots & 1 \end{bmatrix} = J$$

Conclusion: the matrix $J$ exactly matches the mathematical definition of "projection onto the vector $\mathbf{1}$".
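The derivation above is easy to sanity-check in NumPy. This is a minimal sketch; the choice $n = 4$ and the variable names are my own, while the formula $P = \frac{aa^\top}{a^\top a}$ and the definition of $J$ come from the text.

```python
import numpy as np

n = 4
ones = np.ones((n, 1))                 # the all-ones column vector "1"

P = (ones @ ones.T) / (ones.T @ ones)  # numerator: n x n matrix of 1's; denominator: n
J = np.ones((n, n)) / n                # J as defined: the all-ones matrix divided by n

print(np.allclose(P, J))               # the projection formula reproduces J -> True
print(np.allclose(J @ J, J))           # projection matrices are idempotent -> True
```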
2. From the logical perspective of slides (unit vector method)
The slide uses a slightly different entry point, the Unit Vector formula.
- General formula: $P = \frac{aa^\top}{a^\top a}$
- Unit vector formula: If $u$ is already a unit vector of length 1, the denominator is 1 and the formula simplifies to $P = uu^\top$.
The derivation steps in the slide are as follows:

1. The length of the vector $\mathbf{1}$ is $\sqrt{n}$.
2. To use the simplified formula, first construct a unit vector $u$:
$$u = \frac{\mathbf{1}}{\text{length}} = \frac{1}{\sqrt{n}}\mathbf{1}$$
3. Compute $uu^\top$:
$$uu^\top = \left(\frac{1}{\sqrt{n}}\mathbf{1}\right) \left(\frac{1}{\sqrt{n}}\mathbf{1}^\top\right) = \frac{1}{\sqrt{n}} \cdot \frac{1}{\sqrt{n}} \cdot \mathbf{1}\mathbf{1}^\top = \frac{1}{n}\mathbf{1}\mathbf{1}^\top = J$$

This is the same result as before, just a different path to the same destination.
3. From an intuitive operation perspective (what does it do?)
This step will help you fully understand why this is called "projection". Suppose $n=3$ and we have an arbitrary vector $x = \begin{bmatrix} 1 \\ 2 \\ 6 \end{bmatrix}$. We multiply it by the matrix $J$:
$$J x = \frac{1}{3} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 6 \end{bmatrix}$$
The calculation goes like this:
1. Multiply one row of the matrix by the vector: $1\cdot1 + 1\cdot2 + 1\cdot6 = 9$.
2. Because every row is the same, the product is $\begin{bmatrix} 9 \\ 9 \\ 9 \end{bmatrix}$.
3. Finally multiply by the coefficient $\frac{1}{3}$ out front:
$$Jx = \begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix}$$

**Notice anything?** The original data is $1, 2, 6$, and its average is $3$.
- The function of $J$ is to turn the vector $x$ into $[\text{average}, \text{average}, \text{average}]^\top$.
- This new vector $\begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix}$ falls exactly on the line through the vector $\mathbf{1} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ (it is exactly $3$ times $\mathbf{1}$).
This is the geometric meaning of projection:
It forces your original vector $x$ onto this diagonal line (all 1 directions) and finds the point closest to $x$ on the straight line. And this nearest point is a vector composed of average values.
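The "closest point" claim can also be checked numerically. A quick NumPy sketch, reusing the $n = 3$ example above with $x = [1, 2, 6]$; the comparison values of $t$ are my own arbitrary choices.

```python
import numpy as np

n = 3
J = np.ones((n, n)) / n
x = np.array([1.0, 2.0, 6.0])
ones = np.ones(n)

proj = J @ x                          # projection of x onto the line through 1
print(proj)                           # [3. 3. 3.] -- the mean in every slot

# Any other point t * 1 on the line is strictly farther from x than the mean vector:
for t in (2.0, 2.9, 3.1, 4.0):
    assert np.linalg.norm(x - proj) < np.linalg.norm(x - t * ones)
print("mean vector is the closest point on the line")
```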
Summary
1. **What is the subspace $S_1$?** It is the line formed by vectors whose components are all equal (they are all multiples of $\mathbf{1}$).
2. **Why is $J$ the projection matrix?** Because $J$ turns any vector into a vector on that line (namely, the mean vector).
3. **Why is the dimension 1?** Because the output can only run along this one line, so there is only one degree of freedom.
3. Idempotent, Eigenvalue, Multiplicity, Rank
This picture is about the derivation of eigenvalues and multiplicity in linear algebra. To understand it, you don't need the specific background of the question; just focus on the three core mathematical concepts. I'll answer your questions in order.
1. Why can the eigenvalues of an idempotent matrix only be 0 or 1?
Idempotent means: this matrix multiplies itself, and the result is equal to itself. Mathematical expression:
$$C^2 = C$$
(that is, $C \cdot C = C$).

Intuitive understanding: if you press a switch once and it says "lights on", and you press it again and it still says "lights on", then the second press did nothing. The projection matrix is a typical idempotent matrix: you project an object into a shadow (the first operation), and projecting the shadow again (the second operation) changes nothing.

Mathematical proof (just three lines). Suppose $\lambda$ is an eigenvalue and $x$ is a corresponding eigenvector ($x \neq 0$).

1. By the definition of an eigenvalue: $Cx = \lambda x$.
2. Multiply both sides by $C$:
$$C(Cx) = C(\lambda x) \Rightarrow C^2 x = \lambda (Cx)$$
3. Since $C^2 = C$ and $Cx = \lambda x$, substitution gives:
$$Cx = \lambda (\lambda x) \Rightarrow \lambda x = \lambda^2 x$$
$$\Rightarrow (\lambda^2 - \lambda)x = 0$$

Because $x$ is not the zero vector, the coefficient must be 0:
$$\lambda^2 - \lambda = 0 \Rightarrow \lambda(\lambda - 1) = 0$$
Conclusion: $\lambda$ can only be 0 or 1.
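The proof can be confirmed numerically for any idempotent matrix. A minimal sketch: here $C$ is taken to be the floor-projection matrix from section 1 (my choice of example).

```python
import numpy as np

C = np.diag([1.0, 1.0, 0.0])          # projects onto the xy plane
assert np.allclose(C @ C, C)          # idempotent: C^2 = C

eigvals = np.linalg.eigvals(C)
print(np.sort(eigvals.real))          # -> [0. 1. 1.] : only 0's and 1's appear
```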
2. What is Multiplicity?
Simply put, multiplicity is a count: how many times an eigenvalue "appears" in the matrix. For example, a $5 \times 5$ matrix has 5 eigenvalues in total. If the calculated eigenvalues are $1, 1, 1, 0, 0$:
- We would say: the multiplicity of eigenvalue 1 is 3.
- The multiplicity of eigenvalue 0 is 2.
Special meaning for projection matrices: for a projection matrix, the multiplicities have a very clear physical meaning:
- Multiplicity of eigenvalue 1 = Rank
- Meaning: This is the dimension of the projected target space (what is the dimension of the shadow you project?).
- A vector in this space remains unchanged after projection ($Cx = 1x$), so it corresponds to eigenvalue 1.
- Multiplicity of eigenvalue 0 = Dimension of null space (Nullity)
- Meaning: This is the spatial dimension that is “compressed” (perpendicular to the direction of the shadow).
- A vector in this direction becomes **0** ($Cx = 0x$) after being projected, so it corresponds to the eigenvalue 0.
3. What exactly is this picture saying?
This picture is a summary of the last step, combining the first two points:

1. Determine the values: since $C$ is known to be idempotent ($C^2=C$), its eigenvalues can only be 0 or 1.
2. Determine the quantities (multiplicity): we need to know how many 1's and how many 0's there are.
   - Number of 1's: because the rank $\text{Rank}(C) = n-1$ was computed earlier (meaning $C$ projects into an $(n-1)$-dimensional space), the eigenvalue 1 appears $n-1$ times.
   - Number of 0's: because the total dimension is $n$, the remaining compressed dimension is $n - (n-1) = 1$ (the null-space dimension), so the eigenvalue 0 appears once.
3. Check, using the property of the trace (sum of the diagonal entries = sum of all eigenvalues):
$$(n-1) \times 1 + 1 \times 0 = n-1$$
This matches the trace computed directly from the matrix, confirming that the calculation is correct.
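This bookkeeping can be verified concretely. A NumPy sketch for $C = I - J$ (the matrix from the slides), with $n = 5$ as my own choice of size:

```python
import numpy as np

n = 5
J = np.ones((n, n)) / n
C = np.eye(n) - J                     # idempotent: projects onto the complement of the 1-line

eigvals = np.linalg.eigvals(C).real
ones_count = int(np.sum(np.isclose(eigvals, 1.0)))
zero_count = int(np.sum(np.isclose(eigvals, 0.0)))

print(ones_count, zero_count)                  # 4 1 : eigenvalue 1 has multiplicity n-1, eigenvalue 0 appears once
print(np.isclose(np.trace(C), n - 1))          # trace check: (n-1)*1 + 1*0 = n-1 -> True
print(np.linalg.matrix_rank(C) == n - 1)       # rank equals the multiplicity of eigenvalue 1 -> True
```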
Summary
- Idempotent $\rightarrow$ 0 or 1: just like a projection, applying it once or twice gives the same result, so the scaling factor can only be 1 (unchanged) or 0 (vanished).
- Multiplicity: It is the “number” of eigenvalues.
- Logic in the picture: There are $n$ eigenvalues in total. Since the rank is $n-1$, it means that $n-1$ eigenvalues are 1, and the remaining 1 eigenvalue is 0.
4. The sum of the diagonal entries of a square matrix (the trace) equals the sum of its eigenvalues
This property holds for any square matrix ($n \times n$). It does not require the matrix to be symmetric, full rank, or diagonalizable. As long as it is square, the property holds. However, for the equation to hold strictly, you need to obey two "hidden rules" (subtle conditions):
1. “Complex” eigenvalues must be counted
Even if your matrix is all real numbers, its eigenvalues may be complex. Example: the rotation matrix (rotation by 90 degrees)
$$A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$$- Sum of diagonals (Trace): $0 + 0 = 0$.
- Eigenvalue: $\lambda_1 = i, \quad \lambda_2 = -i$.
- Sum of eigenvalues: $i + (-i) = 0$.
- Conclusion: the equation holds. If you only looked for eigenvalues over the reals (you would think there are none), the equation would fail.
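The rotation-matrix example can be reproduced with NumPy, which happily returns complex eigenvalues; a minimal sketch (variable names are my own):

```python
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])           # rotation by 90 degrees

eigvals = np.linalg.eigvals(A)        # complex: i and -i
print(eigvals.sum())                  # approximately 0, matching the trace
print(np.isclose(eigvals.sum(), np.trace(A)))   # -> True
```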
2. Algebraic multiplicity must be counted

If an eigenvalue turns out to be a "double root", you must add it twice when summing. Example: the shear matrix
$$B = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$$This is a non-diagonalizable matrix (Defective Matrix).
- Sum of diagonals (Trace): $1 + 1 = 2$.
- Eigenvalues: it is an upper triangular matrix, so the eigenvalues can be read directly off the diagonal: $1$ and $1$.
- Sum of eigenvalues: although there is only one distinct eigenvalue, 1, its algebraic multiplicity is 2, so the sum is $1 + 1 = 2$.
- Conclusion: the equation holds. If you counted the eigenvalue only once, the sum would be wrong.
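The same check for the shear matrix, as a short NumPy sketch: the solver reports the repeated eigenvalue twice, so the multiplicity is counted automatically.

```python
import numpy as np

B = np.array([[1.0, 1.0],
              [0.0, 1.0]])            # non-diagonalizable (defective) shear matrix

eigvals = np.linalg.eigvals(B)        # eigenvalue 1 with algebraic multiplicity 2
print(np.sort(eigvals.real))          # -> [1. 1.]
print(np.isclose(eigvals.sum(), np.trace(B)))   # 1 + 1 = 2 = trace -> True
```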
Why does it always hold true? (Simple mathematical intuition)
The underlying reason for this property is Vieta's formulas.

1. Finding the eigenvalues means solving the equation $\det(A - \lambda I) = 0$.
2. This is a degree-$n$ polynomial in $\lambda$:
$$c_n \lambda^n + c_{n-1} \lambda^{n-1} + \dots + c_0 = 0$$
3. When this polynomial is expanded, the coefficient of $\lambda^{n-1}$ is determined exactly by the trace (the sum of the diagonal entries of the matrix).
4. By Vieta's formulas, the sum of the roots ($\sum \lambda_i$) is also determined by this coefficient (up to a sign).
So this is not a coincidence, but an inevitable consequence of the polynomial structure.
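The coefficient argument can be inspected directly with `np.poly`, which returns the coefficients of the (monic) characteristic polynomial of a square matrix. A sketch; the $3 \times 3$ matrix is an arbitrary example of my own.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0]])

coeffs = np.poly(A)                   # [1, c_{n-1}, ..., c_0], monic characteristic polynomial
print(np.isclose(-coeffs[1], np.trace(A)))  # coefficient of lambda^{n-1} is minus the trace -> True
print(np.isclose(np.linalg.eigvals(A).sum().real, np.trace(A)))  # Vieta in action -> True
```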
Summary
- **Does it have to be a square matrix?** Yes. (Eigenvalues are only defined for square matrices.)
- **Does the matrix need to be symmetric?** No.
- **Does the matrix need to be invertible?** No.
- **Does the matrix need to be diagonalizable?** No.
As long as you work in the complex domain and count multiplicities, the trace will always equal the sum of the eigenvalues.
5. Characteristic vector in the context of question (d)
$$\underbrace{A}_{\text{Matrix}} \cdot \underbrace{v}_{\text{Vector}} = \underbrace{\lambda}_{\text{Scalar}} \cdot \underbrace{v}_{\text{Vector}}$$- $A$ is the "machine" (the matrix): an $n \times n$ square matrix. It performs the action, transforming the vector $v$ (rotation, stretching, projection, etc.).
- $v$ is "the special vector" (the eigenvector): it is the object $A$ acts on. What makes it special is that after $A$ processes it, its direction does not change; only its length does.
- $\lambda$ is “the multiple” (The Eigenvalue): It is a simple number (scalar). It represents how many times $v$ is stretched or compressed by $A$.
You can read the equation as "$v$ is an eigenvector of $A$ with eigenvalue $\lambda$," or "$\lambda$ is the scaling factor that $A$ applies to the vector $v$."
In the picture you uploaded, the matrix in question is $C$ (that idempotent matrix). Let’s apply the standard formula $Av = \lambda v$ to your problem:
- **What is $A$ here?** The matrix $C$ (i.e. $I-J$ in the slide).
- **What is $v$ here?** The example in the picture uses the all-ones vector $\mathbf{1}$.
- **What is $\lambda$ here?** The picture calculates it to be $0$.
The corresponding equation is:
$$C \cdot \mathbf{1} = 0 \cdot \mathbf{1}$$

Translated into plain language:

- Matrix ($A$): $C$.
- It acts on the vector ($v$): $\mathbf{1}$.
- Result: the vector does not change direction, but its length is scaled by 0 (it becomes the zero vector).
- Eigenvalue ($\lambda$): the number 0.
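A direct numerical check of $C \cdot \mathbf{1} = 0 \cdot \mathbf{1}$ for $C = I - J$, as a short NumPy sketch with $n = 3$ (my choice of size):

```python
import numpy as np

n = 3
J = np.ones((n, n)) / n
C = np.eye(n) - J
one = np.ones(n)                      # the all-ones vector "1"

print(C @ one)                        # the zero vector: 1 is an eigenvector with eigenvalue 0
print(np.allclose(C @ one, 0 * one))  # C * 1 = 0 * 1 -> True
```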
Do you understand the difference?