Non-diagonalizable Systems and the Matrix Exponential
The solution methodology for systems of linear ordinary differential equations:
\dot{\mathbf{x}} = \mathbf{A} \mathbf{x}
relies on the diagonalizability of the matrix \mathbf{A}.
When \mathbf{A} possesses a full set of linearly independent eigenvectors, it can be diagonalized, and the solution is a pure superposition of exponential modes.
However, when the matrix has repeated eigenvalues, it may lack a complete basis of eigenvectors. Such matrices are termed defective or non-diagonalizable.
Their analysis requires an extension of the eigenvalue problem and leads to solutions containing terms that grow polynomially in time, known as secular terms.
The formal solution to any linear system remains:
\mathbf{x}(t) = e^{\mathbf{A}t} \mathbf{x}(0)
The challenge lies in computing the matrix exponential e^{\mathbf{A}t} when \mathbf{A} cannot be diagonalized.
A convenient technique is to decompose \mathbf{A} into a sum of commuting matrices.
A fundamental property of the matrix exponential, which follows from the structure of its Taylor series expansion, is that for any two matrices \mathbf{S} and \mathbf{T} that commute (\mathbf{ST} = \mathbf{TS}), the exponential of their sum is the product of their exponentials.
e^{\mathbf{S}+\mathbf{T}} = e^{\mathbf{S}} e^{\mathbf{T}} \quad \text{if} \quad \mathbf{ST} = \mathbf{TS}
This can be proven using the binomial theorem and the fact that \mathbf{S} \mathbf{T} = \mathbf{T} \mathbf{S}:
\begin{aligned} & (\mathbf{S}+\mathbf{T})^n = n! \sum_{j+k=n} \dfrac{\mathbf{S}^j\mathbf{T}^k}{j!k!} \\ & e^{\mathbf{M}} = \mathbf{I} + \mathbf{M} + \frac{1}{2}\mathbf{M}^2 + \frac{1}{3!} \mathbf{M}^3 + \cdots \\ & e^{(\mathbf{S+T})} = \mathbf{I} + (\mathbf{S+T}) + \frac{1}{2}(\mathbf{S+T})^2 + \frac{1}{3!} (\mathbf{S+T})^3 + \cdots \\ & = \mathbf{I} + \left(1! \sum_{j+k=1} \dfrac{\mathbf{S}^j\mathbf{T}^k}{j!k!}\right) + \frac{1}{2} \left(2! \sum_{j+k=2} \dfrac{\mathbf{S}^j\mathbf{T}^k}{j!k!}\right) + \frac{1}{3!} \left(3! \sum_{j+k=3} \dfrac{\mathbf{S}^j\mathbf{T}^k}{j!k!}\right) + \cdots \\ &= \mathbf{I} + \left(\mathbf{S} + \mathbf{T} \right) + \left( \dfrac{\mathbf{S}^2}{2!} + \mathbf{S}\mathbf{T} +\dfrac{\mathbf{T}^2}{2!} \right) + \left( \dfrac{\mathbf{S}^3}{3!} + \dfrac{\mathbf{S}^2\mathbf{T}}{2!} +\dfrac{\mathbf{S}\mathbf{T}^2}{2!} + \dfrac{\mathbf{T}^3}{3!} \right) + \cdots \\ & = \mathbf{I} + \mathbf{S} + \frac{1}{2}\mathbf{S}^2 + \frac{1}{3!} \mathbf{S}^3 +\mathbf{T} + \mathbf{S}\mathbf{T} + \frac{1}{2}\mathbf{S}^2\mathbf{T} + \cdots \\ & e^{\mathbf{S}}e^{\mathbf{T}} = \left( \mathbf{I} + \mathbf{S} + \frac{1}{2}\mathbf{S}^2 + \frac{1}{3!} \mathbf{S}^3 + \cdots \right) \left( \mathbf{I} + \mathbf{T} + \frac{1}{2}\mathbf{T}^2 + \frac{1}{3!} \mathbf{T}^3 + \cdots \right) \\ & = \mathbf{I} + \mathbf{S} + \frac{1}{2}\mathbf{S}^2 + \frac{1}{3!} \mathbf{S}^3 +\left( \mathbf{I} + \mathbf{S} + \frac{1}{2}\mathbf{S}^2 + \frac{1}{3!} \mathbf{S}^3 \right) \mathbf{T} + \left( \mathbf{I} + \mathbf{S} + \frac{1}{2}\mathbf{S}^2 + \frac{1}{3!} \mathbf{S}^3 \right) \frac{1}{2}\mathbf{T}^2 + \cdots \\ & = \mathbf{I} + \mathbf{S} + \frac{1}{2}\mathbf{S}^2 + \frac{1}{3!} \mathbf{S}^3 +\mathbf{T} + \mathbf{S}\mathbf{T} + \frac{1}{2}\mathbf{S}^2\mathbf{T} + \cdots \end{aligned}
In the above, we used the identity
\begin{aligned} (\mathbf{S+T})^2 & = \mathbf{S}^2 + \mathbf{ST} + \mathbf{TS} + \mathbf{T}^2 \\ & = \mathbf{S}^2 + 2\,\mathbf{S}\mathbf{T} + \mathbf{T}^2 \quad \text{if} \quad \mathbf{ST} = \mathbf{TS} \end{aligned}
together with its higher-order analogues. For non-commuting matrices this simplification does not hold, which is why the hypothesis \mathbf{ST} = \mathbf{TS} is essential.
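This identity is easy to test numerically with scipy.linalg.expm. Below is a minimal sketch; the matrices are arbitrary illustrative choices, not taken from the discussion above:

```python
import numpy as np
from scipy.linalg import expm

# Commuting pair: a scalar matrix commutes with every matrix.
S = 2.0 * np.eye(2)
T = np.array([[0.0, 1.0],
              [0.0, 0.0]])
print(np.allclose(expm(S + T), expm(S) @ expm(T)))      # True

# Non-commuting pair: the identity generally fails.
S2 = np.array([[0.0, 1.0],
               [0.0, 0.0]])
T2 = np.array([[0.0, 0.0],
               [1.0, 0.0]])
print(S2 @ T2 - T2 @ S2)                                # non-zero commutator
print(np.allclose(expm(S2 + T2), expm(S2) @ expm(T2)))  # False
```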
Let us analyze the canonical example of a 2 \times 2 non-diagonalizable matrix, a Jordan block:
\mathbf{A} = \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix}
This matrix has a single eigenvalue \lambda with an algebraic multiplicity of two. We can decompose \mathbf{A} into a diagonal part \mathbf{S} and a nilpotent part \mathbf{N}.
\mathbf{A} = \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} + \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} = \mathbf{S} + \mathbf{N}
The matrix \mathbf{S} = \lambda\mathbf{I} is a scalar matrix and therefore commutes with any matrix. Thus, \mathbf{S}\mathbf{N} = \mathbf{N}\mathbf{S}, and we can write e^{\mathbf{A}t} = e^{(\mathbf{S}+\mathbf{N})t} = e^{\mathbf{S}t}e^{\mathbf{N}t}.
The exponential of the diagonal part is straightforward:
e^{\mathbf{S}t} = e^{\lambda\mathbf{I}t} = \begin{bmatrix} e^{\lambda t} & 0 \\ 0 & e^{\lambda t} \end{bmatrix}
The matrix \mathbf{N} is nilpotent because \mathbf{N}^2 = \mathbf{0}. This property causes its Taylor series to truncate after the linear term:
e^{\mathbf{N}t} = \mathbf{I} + \mathbf{N}t + \frac{1}{2!}(\mathbf{N}t)^2 + \cdots = \mathbf{I} + \mathbf{N}t = \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix}
Combining these results gives the solution matrix:
e^{\mathbf{A}t} = e^{\mathbf{S}t}e^{\mathbf{N}t} = \begin{bmatrix} e^{\lambda t} & 0 \\ 0 & e^{\lambda t} \end{bmatrix} \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} e^{\lambda t} & t e^{\lambda t} \\ 0 & e^{\lambda t} \end{bmatrix}
The resulting solution for \mathbf{x}(t) is:
\mathbf{x}(t) = \begin{bmatrix} e^{\lambda t} & t e^{\lambda t} \\ 0 & e^{\lambda t} \end{bmatrix} \mathbf{x}(0)
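As a sanity check, the closed form derived above can be compared against a numerical matrix exponential; a minimal sketch, with \lambda and t chosen arbitrarily:

```python
import numpy as np
from scipy.linalg import expm

lam, t = -0.5, 3.0                       # illustrative values
A = np.array([[lam, 1.0],
              [0.0, lam]])               # 2x2 Jordan block

# Closed form derived above: e^{lam t} * [[1, t], [0, 1]]
closed_form = np.exp(lam * t) * np.array([[1.0, t],
                                          [0.0, 1.0]])
print(np.allclose(expm(A * t), closed_form))   # True
```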
The term t e^{\lambda t} is a secular term. Its presence signifies that even if the real part of \lambda is negative, ensuring long-term decay, the linear factor t can cause significant transient growth before the exponential decay dominates. This is another mechanism for transient amplification in stable systems, distinct from the non-orthogonality of eigenvectors in non-normal systems.
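For real \lambda < 0 the size of this transient is easy to quantify: differentiating the secular term,
\frac{d}{dt}\left(t e^{\lambda t}\right) = (1+\lambda t)\,e^{\lambda t} = 0 \quad \Longrightarrow \quad t^* = \frac{1}{|\lambda|}
so t e^{\lambda t} peaks at the value \dfrac{1}{e|\lambda|}: the slower the decay rate, the larger and later the transient maximum.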
Geometric and Algebraic Multiplicity

The diagonalizability of a matrix is determined by the relationship between two concepts of multiplicity for its eigenvalues.
The algebraic multiplicity of an eigenvalue \lambda is its multiplicity as a root of the characteristic polynomial \det(\mathbf{A}-\lambda\mathbf{I})=0.
The geometric multiplicity of an eigenvalue \lambda is the dimension of its corresponding eigenspace, which is the null space of (\mathbf{A}-\lambda\mathbf{I}). This dimension is equal to the maximum number of linearly independent eigenvectors that can be found for that eigenvalue.
A fundamental theorem of linear algebra states that a matrix is diagonalizable if and only if, for every eigenvalue, its geometric multiplicity is equal to its algebraic multiplicity.
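Both multiplicities can be computed directly; the following sketch uses sympy (the 3 \times 3 example matrix is an illustrative assumption, not taken from the text):

```python
import sympy as sp

A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])               # eigenvalue 2 is defective

for lam, alg_mult in A.eigenvals().items():            # {eigenvalue: algebraic multiplicity}
    geo_mult = len((A - lam * sp.eye(3)).nullspace())  # dimension of the eigenspace
    print(lam, alg_mult, geo_mult)
# eigenvalue 2: algebraic 2, geometric 1  -> not diagonalizable
# eigenvalue 3: algebraic 1, geometric 1
```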
Consider the two matrices:
\begin{aligned} & \mathbf{A}_1 = \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} \\ & \mathbf{A}_2 = \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix} \end{aligned}
Both have the characteristic equation (\lambda'-\lambda)^2=0, so for both, the eigenvalue \lambda has an algebraic multiplicity of two (\lambda_{1,2}=\lambda).
Let’s compute the eigenvectors. In the first case:
\begin{aligned} & \left(\mathbf{A}_1 - \lambda \, \mathbf{I}\right) \, \mathbf{x} = 0 \\ & \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_{11} \\ x_{12} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{aligned}
This equation holds for every \mathbf{x}, so it is possible to choose the two linearly independent eigenvectors \xi_1 = [1,0]^T and \xi_2 = [0,1]^T; the geometric multiplicity is two, and \mathbf{A}_1 is diagonalizable (indeed, already diagonal).
In the second case:
\begin{aligned} & \left(\mathbf{A}_2 - \lambda \, \mathbf{I}\right) \, \mathbf{x} = 0 \\ & \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_{11} \\ x_{12} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{aligned}
In this case the first equation forces x_{12} = 0, so there is only one linearly independent eigenvector, \xi_1 = [1,0]^T. It is not possible to find a second one, so the geometric multiplicity is one and the matrix is not diagonalizable.
It is, however, possible to find a second basis vector by solving
\left(\mathbf{A}_2 - \lambda \, \mathbf{I}\right)^2 \, \mathbf{x} = 0
which yields a generalized eigenvector of the matrix \mathbf{A}_2.
\begin{aligned} & \left(\mathbf{A}_2 - \lambda \, \mathbf{I}\right)^2 \, \mathbf{x} = 0 \\ & \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_{21} \\ x_{22} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \\ & \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_{21} \\ x_{22} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{aligned}
Since \left(\mathbf{A}_2 - \lambda \, \mathbf{I}\right)^2 = \mathbf{0}, every vector satisfies this equation. A generalized eigenvector must be non-zero and linearly independent of \xi_1, so we may choose \xi_2 = [0,1]^T, which satisfies the chain relation \left(\mathbf{A}_2 - \lambda\,\mathbf{I}\right)\xi_2 = \xi_1.
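These relations are easy to confirm numerically; a minimal sketch, with the arbitrary choice \lambda = 2:

```python
import numpy as np

lam = 2.0
A2 = np.array([[lam, 1.0],
               [0.0, lam]])
N = A2 - lam * np.eye(2)       # (A2 - lam*I), the nilpotent part

xi1 = np.array([1.0, 0.0])     # ordinary eigenvector
xi2 = np.array([0.0, 1.0])     # generalized eigenvector

print(N @ xi1)                 # [0, 0]        : (A2 - lam*I) xi1 = 0
print(N @ xi2)                 # [1, 0] = xi1  : (A2 - lam*I) xi2 = xi1
print(N @ N @ xi2)             # [0, 0]        : (A2 - lam*I)^2 xi2 = 0
```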
When a matrix is defective, a full basis of eigenvectors does not exist. However, it is always possible to find a complete basis of generalized eigenvectors.
A generalized eigenvector corresponding to an eigenvalue \lambda is a non-zero vector \mathbf{v} that satisfies:
(\mathbf{A}-\lambda\mathbf{I})^k \mathbf{v} = \mathbf{0}
for some integer k \ge 1. The set of all such vectors for a given \lambda forms the generalized eigenspace.
The Jordan Canonical Form (JCF) theorem states that any square matrix \mathbf{A} over an algebraically closed field (such as \mathbb{C}) is similar to a block diagonal matrix \mathbf{J}, called its Jordan form:
\mathbf{A} = \mathbf{P} \mathbf{J} \mathbf{P}^{-1}
The matrix \mathbf{P} contains the generalized eigenvectors of \mathbf{A}, and \mathbf{J} is composed of Jordan blocks on its diagonal.
The structure of \mathbf{J} fully describes the eigensystem of \mathbf{A}.
For a real matrix \mathbf{A}, the real Jordan form consists of blocks corresponding to its real and complex conjugate eigenvalues.
A general n \times n real matrix can have eigenvalues falling into four categories, each corresponding to a specific structure within its real Jordan form \mathbf{J}.
Distinct real eigenvalues: each distinct real eigenvalue \lambda_i corresponds to a 1 \times 1 block [\lambda_i] on the diagonal:
\mathbf{J}_1 = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ 0 & 0 & \ddots & 0 \\ 0 & 0 & \cdots & \lambda_m \end{bmatrix}
Complex conjugate eigenvalues: each pair of complex conjugate eigenvalues \alpha_k \pm i\omega_k corresponds to a 2 \times 2 block of the form:
\mathbf{J}_2 = \begin{bmatrix} \alpha_1 & \omega_1 & 0 & 0 & 0 \\ -\omega_1 & \alpha_1 & 0 & 0 & 0 \\ 0 & 0 & \ddots & 0 & 0 \\ 0 & 0 & \cdots & \alpha_q & \omega_q \\ 0 & 0 & \cdots & -\omega_q & \alpha_q \end{bmatrix}
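The exponential of one such 2 \times 2 block is a growing or decaying rotation, which is why these blocks encode the oscillatory modes of the system:
e^{\begin{bmatrix} \alpha & \omega \\ -\omega & \alpha \end{bmatrix} t} = e^{\alpha t} \begin{bmatrix} \cos\omega t & \sin\omega t \\ -\sin\omega t & \cos\omega t \end{bmatrix}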
Repeated eigenvalues, non-defective: a real eigenvalue \mu_j with algebraic multiplicity m and geometric multiplicity m corresponds to m separate 1 \times 1 blocks [\mu_j]:
\mathbf{J}_3 = \begin{bmatrix} \mu_j & 0 & \cdots & 0 \\ 0 & \mu_j & \cdots & 0 \\ 0 & 0 & \ddots & 0 \\ 0 & 0 & \cdots & \mu_j \end{bmatrix} = \mu_j \, \mathbf{I}_m
Repeated eigenvalues, defective: a real eigenvalue \gamma_k with algebraic multiplicity m and geometric multiplicity g < m corresponds to g Jordan blocks. The sizes of these blocks sum to m. A block of size p \times p has the form:
\mathbf{J}_4 = \begin{bmatrix} \gamma_k & 1 & 0 & \cdots & 0 \\ 0 & \gamma_k & 1 & \cdots & 0 \\ 0 & 0 & \ddots & \ddots & 0 \\ 0 & 0 & \cdots & \gamma_k & 1 \\ 0 & 0 & \cdots & 0 & \gamma_k \end{bmatrix}
The complete Jordan canonical form of a matrix is a block diagonal matrix where each block is one of the types described above.
\mathbf{J} = \begin{bmatrix} \mathbf{J}_1 & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{J}_2 & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{J}_3 & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{J}_4 \end{bmatrix}
This decomposition provides a theoretical tool for analyzing any linear system.
By transforming to the basis of generalized eigenvectors, any system \dot{\mathbf{x}}=\mathbf{A}\mathbf{x} becomes \dot{\mathbf{z}}=\mathbf{J}\mathbf{z}, which can be solved block by block, revealing the interplay of pure exponential modes and the secular terms arising from the defective, non-diagonalizable parts of the system.
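To make this concrete, the decomposition can be computed symbolically; a minimal sketch using sympy's jordan_form, with an arbitrary 2 \times 2 defective example:

```python
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[3, 1],
               [-1, 1]])                      # (lambda - 2)^2 = 0, defective

P, J = A.jordan_form()                        # A = P * J * P**(-1)
sp.pprint(J)                                  # [[2, 1], [0, 2]]: one 2x2 Jordan block
print((P * J * P.inv() - A).is_zero_matrix)   # True

# Solution operator e^{A t} = P e^{J t} P^{-1}; the Jordan block
# contributes the secular term t*e^{2t} to the entries.
expAt = P * (J * t).exp() * P.inv()
sp.pprint(sp.simplify(expAt))
```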