Non-diagonalizable Systems and the Matrix Exponential
The solution methodology for systems of linear ordinary differential equations:
\dot{\mathbf{x}} = \mathbf{A} \mathbf{x}
relies on the diagonalizability of the matrix \mathbf{A}.
When \mathbf{A} possesses a full set of linearly independent eigenvectors, it can be diagonalized, and the solution is a pure superposition of exponential modes.
However, when the matrix has repeated eigenvalues, it may lack a complete basis of eigenvectors. Such matrices are termed defective or non-diagonalizable.
Their analysis requires an extension of the eigenvalue problem and leads to solutions containing terms that grow polynomially in time, known as secular terms.
The formal solution to any linear system remains:
\mathbf{x}(t) = e^{\mathbf{A}t} \mathbf{x}(0)
The challenge lies in computing the matrix exponential e^{\mathbf{A}t} when \mathbf{A} cannot be diagonalized.
A convenient technique is to decompose \mathbf{A} into a sum of commuting matrices.
A fundamental property of the matrix exponential, which follows from the structure of its Taylor series expansion, is that for any two matrices \mathbf{S} and \mathbf{T} that commute (\mathbf{ST} = \mathbf{TS}), the exponential of their sum is the product of their exponentials.
e^{\mathbf{S}+\mathbf{T}} = e^{\mathbf{S}} e^{\mathbf{T}} \quad \text{if} \quad \mathbf{ST} = \mathbf{TS}
This can be proven using the binomial theorem and the fact that \mathbf{S} \mathbf{T} = \mathbf{T} \mathbf{S}:
\begin{aligned} & (\mathbf{S}+\mathbf{T})^n = n! \sum_{j+k=n} \dfrac{\mathbf{S}^j\mathbf{T}^k}{j!k!} \\ & e^{\mathbf{M}} = \mathbf{I} + \mathbf{M} + \frac{1}{2}\mathbf{M}^2 + \frac{1}{3!} \mathbf{M}^3 + \cdots \\ & e^{(\mathbf{S+T})} = \mathbf{I} + (\mathbf{S+T}) + \frac{1}{2}(\mathbf{S+T})^2 + \frac{1}{3!} (\mathbf{S+T})^3 + \cdots \\ & = \mathbf{I} + \left(1! \sum_{j+k=1} \dfrac{\mathbf{S}^j\mathbf{T}^k}{j!k!}\right) + \frac{1}{2} \left(2! \sum_{j+k=2} \dfrac{\mathbf{S}^j\mathbf{T}^k}{j!k!}\right) + \frac{1}{3!} \left(3! \sum_{j+k=3} \dfrac{\mathbf{S}^j\mathbf{T}^k}{j!k!}\right) + \cdots \\ &= \mathbf{I} + \left(\mathbf{S} + \mathbf{T} \right) + \left( \dfrac{\mathbf{S}^2}{2!} + \mathbf{S}\mathbf{T} +\dfrac{\mathbf{T}^2}{2!} \right) + \left( \dfrac{\mathbf{S}^3}{3!} + \dfrac{\mathbf{S}^2\mathbf{T}}{2!} +\dfrac{\mathbf{S}\mathbf{T}^2}{2!} + \dfrac{\mathbf{T}^3}{3!} \right) + \cdots \\ & = \mathbf{I} + \mathbf{S} + \frac{1}{2}\mathbf{S}^2 + \frac{1}{3!} \mathbf{S}^3 +\mathbf{T} + \mathbf{S}\mathbf{T} + \frac{1}{2}\mathbf{S}^2\mathbf{T} + \cdots \\ & e^{\mathbf{S}}e^{\mathbf{T}} = \left( \mathbf{I} + \mathbf{S} + \frac{1}{2}\mathbf{S}^2 + \frac{1}{3!} \mathbf{S}^3 + \cdots \right) \left( \mathbf{I} + \mathbf{T} + \frac{1}{2}\mathbf{T}^2 + \frac{1}{3!} \mathbf{T}^3 + \cdots \right) \\ & = \mathbf{I} + \mathbf{S} + \frac{1}{2}\mathbf{S}^2 + \frac{1}{3!} \mathbf{S}^3 +\left( \mathbf{I} + \mathbf{S} + \frac{1}{2}\mathbf{S}^2 + \frac{1}{3!} \mathbf{S}^3 \right) \mathbf{T} + \left( \mathbf{I} + \mathbf{S} + \frac{1}{2}\mathbf{S}^2 + \frac{1}{3!} \mathbf{S}^3 \right) \frac{1}{2}\mathbf{T}^2 + \cdots \\ & = \mathbf{I} + \mathbf{S} + \frac{1}{2}\mathbf{S}^2 + \frac{1}{3!} \mathbf{S}^3 +\mathbf{T} + \mathbf{S}\mathbf{T} + \frac{1}{2}\mathbf{S}^2\mathbf{T} + \cdots \end{aligned}
In the above, we used the identity
\begin{aligned} (\mathbf{S+T})^2 & = \mathbf{S}^2 + \mathbf{ST} + \mathbf{TS} + \mathbf{T}^2 \\ & = \mathbf{S}^2 + 2\,\mathbf{S}\mathbf{T} + \mathbf{T}^2 \quad \text{if} \quad \mathbf{ST} = \mathbf{TS} \end{aligned}
together with its higher-order analogues. For non-commuting matrices this simplification does not hold, which is why the hypothesis \mathbf{ST} = \mathbf{TS} is essential.
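This identity is easy to test numerically with scipy.linalg.expm. Below is a minimal sketch; the matrices are arbitrary illustrative choices, not taken from the discussion above:

```python
import numpy as np
from scipy.linalg import expm

# Commuting pair: a scalar matrix commutes with every matrix.
S = 2.0 * np.eye(2)
T = np.array([[0.0, 1.0],
              [0.0, 0.0]])
print(np.allclose(expm(S + T), expm(S) @ expm(T)))      # True

# Non-commuting pair: the identity generally fails.
S2 = np.array([[0.0, 1.0],
               [0.0, 0.0]])
T2 = np.array([[0.0, 0.0],
               [1.0, 0.0]])
print(S2 @ T2 - T2 @ S2)                                # non-zero commutator
print(np.allclose(expm(S2 + T2), expm(S2) @ expm(T2)))  # False
```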
Let us analyze the canonical example of a 2 \times 2 non-diagonalizable matrix, a Jordan block:
\mathbf{A} = \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix}
This matrix has a single eigenvalue \lambda with an algebraic multiplicity of two. We can decompose \mathbf{A} into a diagonal part \mathbf{S} and a nilpotent part \mathbf{N}.
\mathbf{A} = \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} + \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} = \mathbf{S} + \mathbf{N}
The matrix \mathbf{S} = \lambda\mathbf{I} is a scalar matrix and therefore commutes with any matrix. Thus, \mathbf{S}\mathbf{N} = \mathbf{N}\mathbf{S}, and we can write e^{\mathbf{A}t} = e^{(\mathbf{S}+\mathbf{N})t} = e^{\mathbf{S}t}e^{\mathbf{N}t}.
The exponential of the diagonal part is straightforward:
e^{\mathbf{S}t} = e^{\lambda\mathbf{I}t} = \begin{bmatrix} e^{\lambda t} & 0 \\ 0 & e^{\lambda t} \end{bmatrix}
The matrix \mathbf{N} is nilpotent because \mathbf{N}^2 = \mathbf{0}. This property causes its Taylor series to truncate after the linear term:
e^{\mathbf{N}t} = \mathbf{I} + \mathbf{N}t + \frac{1}{2!}(\mathbf{N}t)^2 + \cdots = \mathbf{I} + \mathbf{N}t = \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix}
Combining these results gives the solution matrix:
e^{\mathbf{A}t} = e^{\mathbf{S}t}e^{\mathbf{N}t} = \begin{bmatrix} e^{\lambda t} & 0 \\ 0 & e^{\lambda t} \end{bmatrix} \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} e^{\lambda t} & t e^{\lambda t} \\ 0 & e^{\lambda t} \end{bmatrix}
The resulting solution for \mathbf{x}(t) is:
\mathbf{x}(t) = \begin{bmatrix} e^{\lambda t} & t e^{\lambda t} \\ 0 & e^{\lambda t} \end{bmatrix} \mathbf{x}(0)
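As a sanity check, the closed form derived above can be compared against a numerical matrix exponential; a minimal sketch, with \lambda and t chosen arbitrarily:

```python
import numpy as np
from scipy.linalg import expm

lam, t = -0.5, 3.0                       # illustrative values
A = np.array([[lam, 1.0],
              [0.0, lam]])               # 2x2 Jordan block

# Closed form derived above: e^{lam t} * [[1, t], [0, 1]]
closed_form = np.exp(lam * t) * np.array([[1.0, t],
                                          [0.0, 1.0]])
print(np.allclose(expm(A * t), closed_form))   # True
```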
The term t e^{\lambda t} is a secular term. Its presence signifies that even if the real part of \lambda is negative, ensuring long-term decay, the linear factor t can cause significant transient growth before the exponential decay dominates. This is another mechanism for transient amplification in stable systems, distinct from the non-orthogonality of eigenvectors in non-normal systems.
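For real \lambda < 0 the size of this transient is easy to quantify: differentiating the secular term,
\frac{d}{dt}\left(t e^{\lambda t}\right) = (1+\lambda t)\,e^{\lambda t} = 0 \quad \Longrightarrow \quad t^* = \frac{1}{|\lambda|}
so t e^{\lambda t} peaks at the value \dfrac{1}{e|\lambda|}: the slower the decay rate, the larger and later the transient maximum.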
Geometric and Algebraic Multiplicity

The diagonalizability of a matrix is determined by the relationship between two concepts of multiplicity for its eigenvalues.
The algebraic multiplicity of an eigenvalue \lambda is its multiplicity as a root of the characteristic polynomial \det(\mathbf{A}-\lambda\mathbf{I})=0.
The geometric multiplicity of an eigenvalue \lambda is the dimension of its corresponding eigenspace, which is the null space of (\mathbf{A}-\lambda\mathbf{I}). This dimension is equal to the maximum number of linearly independent eigenvectors that can be found for that eigenvalue.
A fundamental theorem of linear algebra states that a matrix is diagonalizable if and only if, for every eigenvalue, its geometric multiplicity is equal to its algebraic multiplicity.
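Both multiplicities can be computed directly; the following sketch uses sympy (the 3 \times 3 example matrix is an illustrative assumption, not taken from the text):

```python
import sympy as sp

A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])               # eigenvalue 2 is defective

for lam, alg_mult in A.eigenvals().items():            # {eigenvalue: algebraic multiplicity}
    geo_mult = len((A - lam * sp.eye(3)).nullspace())  # dimension of the eigenspace
    print(lam, alg_mult, geo_mult)
# eigenvalue 2: algebraic 2, geometric 1  -> not diagonalizable
# eigenvalue 3: algebraic 1, geometric 1
```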
Consider the two matrices:
\begin{aligned} & \mathbf{A}_1 = \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} \\ & \mathbf{A}_2 = \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix} \end{aligned}
Both have the characteristic equation (\lambda'-\lambda)^2=0, so for both, the eigenvalue \lambda has an algebraic multiplicity of two (\lambda_{1,2}=\lambda).
Let’s compute the eigenvectors. In the first case:
\begin{aligned} & \left(\mathbf{A}_1 - \lambda \, \mathbf{I}\right) \, \mathbf{x} = 0 \\ & \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_{11} \\ x_{12} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{aligned}
This equation holds for every \mathbf{x}, so it is possible to choose the two linearly independent eigenvectors \xi_1 = [1,0]^T and \xi_2 = [0,1]^T; the geometric multiplicity is two, and \mathbf{A}_1 is diagonalizable (indeed, already diagonal).
In the second case:
\begin{aligned} & \left(\mathbf{A}_2 - \lambda \, \mathbf{I}\right) \, \mathbf{x} = 0 \\ & \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_{11} \\ x_{12} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{aligned}
In this case the first equation forces x_{12} = 0, so there is only one linearly independent eigenvector, \xi_1 = [1,0]^T. It is not possible to find a second one, so the geometric multiplicity is one and the matrix is not diagonalizable.
It is, however, possible to find a second basis vector by solving
\left(\mathbf{A}_2 - \lambda \, \mathbf{I}\right)^2 \, \mathbf{x} = 0
which yields a generalized eigenvector of the matrix \mathbf{A}_2.
\begin{aligned} & \left(\mathbf{A}_2 - \lambda \, \mathbf{I}\right)^2 \, \mathbf{x} = 0 \\ & \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_{21} \\ x_{22} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \\ & \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_{21} \\ x_{22} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{aligned}
Since \left(\mathbf{A}_2 - \lambda \, \mathbf{I}\right)^2 = \mathbf{0}, every vector satisfies this equation. A generalized eigenvector must be non-zero and linearly independent of \xi_1, so we may choose \xi_2 = [0,1]^T, which satisfies the chain relation \left(\mathbf{A}_2 - \lambda\,\mathbf{I}\right)\xi_2 = \xi_1.
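These relations are easy to confirm numerically; a minimal sketch, with the arbitrary choice \lambda = 2:

```python
import numpy as np

lam = 2.0
A2 = np.array([[lam, 1.0],
               [0.0, lam]])
N = A2 - lam * np.eye(2)       # (A2 - lam*I), the nilpotent part

xi1 = np.array([1.0, 0.0])     # ordinary eigenvector
xi2 = np.array([0.0, 1.0])     # generalized eigenvector

print(N @ xi1)                 # [0, 0]        : (A2 - lam*I) xi1 = 0
print(N @ xi2)                 # [1, 0] = xi1  : (A2 - lam*I) xi2 = xi1
print(N @ N @ xi2)             # [0, 0]        : (A2 - lam*I)^2 xi2 = 0
```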
When a matrix is defective, a full basis of eigenvectors does not exist. However, it is always possible to find a complete basis of generalized eigenvectors.
A generalized eigenvector corresponding to an eigenvalue \lambda is a non-zero vector \mathbf{v} that satisfies:
(\mathbf{A}-\lambda\mathbf{I})^k \mathbf{v} = \mathbf{0}
for some integer k \ge 1. The set of all such vectors for a given \lambda forms the generalized eigenspace.
The Jordan Canonical Form (JCF) theorem states that any square matrix \mathbf{A} over an algebraically closed field (such as \mathbb{C}) is similar to a block diagonal matrix \mathbf{J}, called its Jordan form:
\mathbf{A} = \mathbf{P} \mathbf{J} \mathbf{P}^{-1}
The matrix \mathbf{P} contains the generalized eigenvectors of \mathbf{A}, and \mathbf{J} is composed of Jordan blocks on its diagonal.
The structure of \mathbf{J} fully describes the eigensystem of \mathbf{A}.
For a real matrix \mathbf{A}, the real Jordan form consists of blocks corresponding to its real and complex conjugate eigenvalues.
A general n \times n real matrix can have eigenvalues falling into four categories, each corresponding to a specific structure within its real Jordan form \mathbf{J}.
Distinct real eigenvalues: each distinct real eigenvalue \lambda_i corresponds to a 1 \times 1 block [\lambda_i] on the diagonal:
\mathbf{J}_1 = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ 0 & 0 & \ddots & 0 \\ 0 & 0 & \cdots & \lambda_m \end{bmatrix}
Complex conjugate eigenvalues: each pair of complex conjugate eigenvalues \alpha_k \pm i\omega_k corresponds to a 2 \times 2 block of the form:
\mathbf{J}_2 = \begin{bmatrix} \alpha_1 & \omega_1 & 0 & 0 & 0 \\ -\omega_1 & \alpha_1 & 0 & 0 & 0 \\ 0 & 0 & \ddots & 0 & 0 \\ 0 & 0 & \cdots & \alpha_q & \omega_q \\ 0 & 0 & \cdots & -\omega_q & \alpha_q \end{bmatrix}
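The exponential of one such 2 \times 2 block is a growing or decaying rotation, which is why these blocks encode the oscillatory modes of the system:
e^{\begin{bmatrix} \alpha & \omega \\ -\omega & \alpha \end{bmatrix} t} = e^{\alpha t} \begin{bmatrix} \cos\omega t & \sin\omega t \\ -\sin\omega t & \cos\omega t \end{bmatrix}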
Repeated eigenvalues, non-defective: a real eigenvalue \mu_j with algebraic multiplicity m and geometric multiplicity m corresponds to m separate 1 \times 1 blocks [\mu_j]:
\mathbf{J}_3 = \begin{bmatrix} \mu_j & 0 & \cdots & 0 \\ 0 & \mu_j & \cdots & 0 \\ 0 & 0 & \ddots & 0 \\ 0 & 0 & \cdots & \mu_j \end{bmatrix} = \mu_j \, \mathbf{I}_m
Repeated eigenvalues, defective: a real eigenvalue \gamma_k with algebraic multiplicity m and geometric multiplicity g < m corresponds to g Jordan blocks. The sizes of these blocks sum to m. A block of size p \times p has the form:
\mathbf{J}_4 = \begin{bmatrix} \gamma_k & 1 & 0 & \cdots & 0 \\ 0 & \gamma_k & 1 & \cdots & 0 \\ 0 & 0 & \ddots & \ddots & 0 \\ 0 & 0 & \cdots & \gamma_k & 1 \\ 0 & 0 & \cdots & 0 & \gamma_k \end{bmatrix}
The complete Jordan canonical form of a matrix is a block diagonal matrix where each block is one of the types described above.
\mathbf{J} = \begin{bmatrix} \mathbf{J}_1 & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{J}_2 & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{J}_3 & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{J}_4 \end{bmatrix}
This decomposition provides a theoretical tool for analyzing any linear system.
By transforming to the basis of generalized eigenvectors, any system \dot{\mathbf{x}}=\mathbf{A}\mathbf{x} becomes \dot{\mathbf{z}}=\mathbf{J}\mathbf{z}, which can be solved block by block, revealing the interplay of pure exponential modes and the secular terms arising from the defective, non-diagonalizable parts of the system.
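To make this concrete, the decomposition can be computed symbolically; a minimal sketch using sympy's jordan_form, with an arbitrary 2 \times 2 defective example:

```python
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[3, 1],
               [-1, 1]])                      # (lambda - 2)^2 = 0, defective

P, J = A.jordan_form()                        # A = P * J * P**(-1)
sp.pprint(J)                                  # [[2, 1], [0, 2]]: one 2x2 Jordan block
print((P * J * P.inv() - A).is_zero_matrix)   # True

# Solution operator e^{A t} = P e^{J t} P^{-1}; the Jordan block
# contributes the secular term t*e^{2t} to the entries.
expAt = P * (J * t).exp() * P.inv()
sp.pprint(sp.simplify(expAt))
```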