Hartman-Grobman Theorem

Linearizing Dynamical Systems

The Hartman-Grobman Theorem establishes a connection between a nonlinear dynamical system and its linearization around a hyperbolic equilibrium point.

It asserts that, in a sufficiently small neighborhood of such a point, the flow of the nonlinear system is topologically equivalent to the flow of its linear approximation.

This result permits the analysis of a complex local structure through the study of a simpler, linear counterpart. Because the theorem is technical, we first give a short version with an outline of the proof, followed by a version with the formal proof.

Contents (each entry appears in both the outline and the formal proof, except the Theorem, which appears only in the formal proof):

  Setup: Introduction and Setup
  Lemma 1: Spectral decomposition and coordinate transformation
  Lemma 2: Localization of the flow
  Lemma 3: Localized flow via a cutoff function
  Lemma 4: Invertibility of perturbed linear maps
  Lemma 5: Construction of the conjugacy candidate
  Lemma 6: Generalized conjugacy construction
  Lemma 7: The conjugacy is a homeomorphism
  Lemma 8: From time-one conjugacy to flow equivalence
  Theorem: Theorem statement and proof strategy

Outline of the proof

Setup

We begin by defining the setting. Let \mathbf{E} \subset \mathbb{R}^n be an open set and let f \in C^1(\mathbf{E}, \mathbb{R}^n) define a vector field. Consider a point \mathbf{x}_0 \in \mathbf{E} that is a fixed point, or equilibrium, of the system, meaning f(\mathbf{x}_0) = \mathbf{0}. We are interested in the behavior of solutions to the initial value problem:

\begin{cases} \dot{\mathbf{y}} = f(\mathbf{y}) \\ \mathbf{y}(0) = \mathbf{x} \end{cases}

The theorem states that a homeomorphism exists that maps the trajectories of the nonlinear flow onto the trajectories of the linear flow e^{t\mathbf{A}}, where \mathbf{A} = Df(\mathbf{x}_0) is the Jacobian of f at the equilibrium. The proof is technical and is constructed through a sequence of lemmas. For clarity, we will assume the fixed point is translated to the origin, \mathbf{x}_0 = \mathbf{0}.
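As a concrete illustration, hyperbolicity can be checked numerically by inspecting the spectrum of the Jacobian. The damped pendulum below, and every name in this sketch, is an assumption chosen for illustration, not part of the theorem:

```python
import numpy as np

# Illustrative system: the damped pendulum
#   x1' = x2,   x2' = -sin(x1) - x2.
def f(x):
    return np.array([x[1], -np.sin(x[0]) - x[1]])

x0 = np.array([np.pi, 0.0])          # the "inverted" equilibrium
assert np.allclose(f(x0), 0.0)       # f vanishes there

# Jacobian at x0, computed by hand: Df(pi, 0) = [[0, 1], [-cos(pi), -1]].
A = np.array([[0.0, 1.0],
              [1.0, -1.0]])

eigenvalues = np.linalg.eigvals(A)
# Hyperbolic: no eigenvalue has zero real part (here, a saddle).
is_hyperbolic = bool(np.all(np.abs(eigenvalues.real) > 1e-12))
print(eigenvalues, is_hyperbolic)
```

Since one eigenvalue is positive and one is negative, the theorem guarantees that near this equilibrium the nonlinear saddle is topologically equivalent to the linear saddle of e^{t\mathbf{A}}.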

Spectral decomposition and coordinate transformation

Lemma 1: let \mathbf{A} be a real hyperbolic matrix. There exists a linear change of variables that induces a decomposition of the space \mathbb{R}^n into a direct sum of a stable subspace \mathbf{E}_s and an unstable subspace \mathbf{E}_u, such that \mathbb{R}^n = \mathbf{E}_s \oplus \mathbf{E}_u.

In this new coordinate system, the matrix \mathbf{A} takes a block-diagonal form:

\mathbf{A} = \begin{bmatrix} \mathbf{A}_s & \mathbf{0} \\ \mathbf{0} & \mathbf{A}_u \end{bmatrix}

Furthermore, there exists a constant \alpha > 0 such that for any vector \mathbf{x} = \mathbf{x}_s + \mathbf{x}_u, with \mathbf{x}_s = \mathbf{P}_s\mathbf{x} \in \mathbf{E}_s and \mathbf{x}_u = \mathbf{P}_u\mathbf{x} \in \mathbf{E}_u being the projections onto these subspaces, the following exponential estimates hold for all t \ge 0:

\begin{aligned} & |e^{t\mathbf{A}}\mathbf{x}_s| \le e^{-\alpha t} |\mathbf{x}_s| \\ & |e^{-t\mathbf{A}}\mathbf{x}_u| \le e^{-\alpha t} |\mathbf{x}_u| \end{aligned}

Proof outline: the proof begins by transforming \mathbf{A} into its real Jordan canonical form, \mathbf{J} = \mathbf{P}^{-1}\mathbf{AP}. This form is block-diagonal, where each block corresponds to real eigenvalues or complex conjugate pairs.

A Jordan block \mathbf{J}_i can be written as \mathbf{J}_i = \mathbf{D}_i + \mathbf{N}_i, where \mathbf{D}_i is diagonal (or contains 2 \times 2 rotation-scaling blocks) and \mathbf{N}_i is a nilpotent matrix representing the off-diagonal entries. The hyperbolicity condition ensures that for some \alpha_0 > 0, the real parts of the eigenvalues of \mathbf{A}_s are less than -\alpha_0 and the real parts of the eigenvalues of \mathbf{A}_u are greater than \alpha_0.

The presence of the nilpotent part \mathbf{N} complicates direct estimation. To refine the bounds, a second change of coordinates is introduced via a scaling matrix \mathbf{Q}. For a suitable small parameter \delta > 0, \mathbf{Q} scales the basis vectors within each Jordan block.

This has the effect of reducing the norm of the nilpotent part. In the new coordinates, the matrix becomes \mathbf{A}' = \mathbf{Q}^{-1}\mathbf{J}\mathbf{Q} = \mathbf{D} + \delta \mathbf{N}'. By choosing \delta sufficiently small, we can ensure that the contribution from the nilpotent part does not overwhelm the exponential behavior dictated by \mathbf{D}.

We can find an \alpha > 0 such that the eigenvalues of \mathbf{A}_s' have real parts less than -\alpha and those of \mathbf{A}_u' have real parts greater than \alpha. The desired inequalities then follow from analyzing the resulting system. For instance, for the stable part \dot{\mathbf{z}}_s = \mathbf{A}_s' \mathbf{z}_s, the norm of the solution satisfies | \mathbf{z}_s(t) | \le e^{-\alpha t} |\mathbf{z}_s(0)|.
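These bounds can be sanity-checked numerically. The sketch below (the specific block, \delta, and \alpha are illustrative assumptions) rescales a stable Jordan block and verifies the resulting operator-norm estimate:

```python
import numpy as np
from scipy.linalg import expm

# A single stable Jordan block (eigenvalue -1) with nilpotent part N.
J = np.array([[-1.0, 1.0],
              [0.0, -1.0]])
D = -1.0 * np.eye(2)

# Q_delta = diag(1, delta) shrinks the nilpotent entry: Q^{-1} J Q = D + delta*N.
delta = 0.1
Q = np.diag([1.0, delta])
A_prime = np.linalg.inv(Q) @ J @ Q        # [[-1, 0.1], [0, -1]]

alpha = 1.0 - delta                        # alpha = alpha_0 - ||N_delta||
ts = np.linspace(0.0, 10.0, 200)
ok = all(np.linalg.norm(expm(t * A_prime), 2) <= np.exp(-alpha * t) + 1e-9
         for t in ts)
print(ok)
```

Without the rescaling (\delta = 1), the nilpotent part contributes a factor growing linearly in t, and the estimate with \alpha close to \alpha_0 fails near t = 0; the scaling removes this transient growth.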

Localization of the flow

Lemma 2: let \mathbf{D}_d = \{\mathbf{x} \in \mathbb{R}^n : |\mathbf{x}| \le d\} be the closed ball of radius d. If f \in C^1 and f(\mathbf{0})=\mathbf{0}, then there exists a constant c_1 \in (0, 1), depending on f and d, such that for any d_1 \le d, any solution \varphi(t, \mathbf{x}) with initial condition \mathbf{x} \in \mathbf{D}_{c_1 d_1} exists for all t \in [-1, 1] and remains within \mathbf{D}_{d_1}.

Proof outline: since f \in C^1(\mathbf{E}) and f(\mathbf{0}) = \mathbf{0}, by the Mean Value Theorem applied along the segment from \mathbf{0} to \mathbf{x}, we have an estimate |f(\mathbf{x})| \le M |\mathbf{x}| for \mathbf{x} \in \mathbf{D}_d, where M = \sup_{\mathbf{z} \in \mathbf{D}_d} \|Df(\mathbf{z})\|. The integral form of the differential equation is \varphi(t, \mathbf{x}) = \mathbf{x} + \int_0^t f(\varphi(s, \mathbf{x})) \mathrm{d}s. Taking the norm gives:

|\varphi(t, \mathbf{x})| \le |\mathbf{x}| + \int_0^t |f(\varphi(s, \mathbf{x}))| \mathrm{d}s \le |\mathbf{x}| + M \int_0^t |\varphi(s, \mathbf{x})| \mathrm{d}s

Applying Gronwall’s inequality to this integral inequality yields:

|\varphi(t, \mathbf{x})| \le |\mathbf{x}| e^{M|t|}

If we choose c_1 = e^{-M} and require |\mathbf{x}| \le c_1 d_1, then for |t| \le 1, we get |\varphi(t, \mathbf{x})| \le (c_1 d_1) e^{M|t|} = d_1 e^{M(|t|-1)} \le d_1. The solution remains in the ball \mathbf{D}_{d_1}, which guarantees its existence for the time interval [-1, 1].
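As a sanity check of the choice c_1 = e^{-M}, the sketch below integrates a toy scalar system with a standard RK4 scheme (the vector field, d_1, and M are illustrative assumptions) and verifies containment for t \in [-1, 1]:

```python
import numpy as np

def f(x):
    # Scalar vector field with a fixed point at 0 (illustrative choice).
    return -x + x**2

def rk4_flow(x0, t_final, n_steps=1000):
    # Classical Runge-Kutta integration of x' = f(x) from time 0 to t_final
    # (t_final may be negative, giving the backward flow).
    h = t_final / n_steps
    x = x0
    for _ in range(n_steps):
        k1 = f(x); k2 = f(x + h*k1/2); k3 = f(x + h*k2/2); k4 = f(x + h*k3)
        x = x + h * (k1 + 2*k2 + 2*k3 + k4) / 6
    return x

d1 = 0.5
M = 2.0                      # sup of |Df(x)| = |-1 + 2x| on |x| <= d1
c1 = np.exp(-M)              # the constant from Lemma 2
x0 = c1 * d1                 # start on the boundary of D_{c1 d1}

# The trajectory must stay in D_{d1} for every t in [-1, 1].
ok = all(abs(rk4_flow(x0, t)) <= d1 for t in np.linspace(-1, 1, 41))
print(ok)
```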

Localized flow via a cutoff function

Lemma 3: for any \alpha > 0, it is possible to choose a radius d_2 sufficiently small such that there exists a function \mathbf{p} \in C^1(\mathbb{R}^n, \mathbb{R}^n) with the properties:

  1. \mathbf{p}(\mathbf{0}) = \mathbf{0} and D\mathbf{p}(\mathbf{0}) = \mathbf{0};
  2. the C^1 norm satisfies \|\mathbf{p}\|_{C^1(\mathbb{R}^n)} < \alpha;
  3. the time-one map of the flow, \varphi_1(\mathbf{x}) = \varphi(1, \mathbf{x}), can be expressed as \varphi_1(\mathbf{x}) = e^{\mathbf{A}}\mathbf{x} + \mathbf{p}(\mathbf{x}) for all \mathbf{x} \in \mathbf{D}_{d_2}.

Proof outline: we introduce a smooth cutoff function \beta_{d_2}(\mathbf{x}) that equals 1 for |\mathbf{x}| \le d_2 and 0 for |\mathbf{x}| \ge 2d_2. We define the perturbation \mathbf{p}(\mathbf{x}) as the difference between the nonlinear and linear time-one maps, localized by \beta_{d_2}:

\mathbf{p}(\mathbf{x}) = \beta_{d_2}(\mathbf{x}) (\varphi_1(\mathbf{x}) - e^{\mathbf{A}}\mathbf{x})

This function \mathbf{p} is identically zero outside the ball \mathbf{D}_{2d_2}. Since \varphi_1(\mathbf{0})=\mathbf{0} and D\varphi_1(\mathbf{0}) = e^{\mathbf{A}}, it follows that \mathbf{p}(\mathbf{0})=\mathbf{0} and D\mathbf{p}(\mathbf{0})=\mathbf{0}. By the definition of the derivative, the term (\varphi_1(\mathbf{x}) - e^{\mathbf{A}}\mathbf{x}) is o(|\mathbf{x}|).

We can choose d_2 to be small enough so that the magnitudes of this term and its derivative are controlled within \mathbf{D}_{2d_2}. A careful estimation of the C^0 and C^1 norms of \mathbf{p} shows that \|\mathbf{p}\|_{C^1} can be made smaller than any preassigned \alpha > 0 by shrinking d_2.
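The shrinking of \|\mathbf{p}\|_{C^1} can be observed numerically. The sketch below uses the exactly solvable scalar system \dot{x} = -x + x^2 (so \varphi_1 has a closed form) together with a C^1 polynomial cutoff; the system, profile, and grid are illustrative assumptions:

```python
import numpy as np

E = np.e

def phi1(x):
    # Exact time-one map of x' = -x + x^2 (solvable via the substitution y = 1/x).
    return x / (E * (1 - x) + x)

def zeta(s):
    # C^1 cutoff profile: 1 on [0, 1], 0 on [2, inf), cubic in between.
    u = np.clip(s - 1.0, 0.0, 1.0)
    return 1.0 - 3*u**2 + 2*u**3

def p(x, d2):
    # Localized perturbation p(x) = beta_{d2}(x) * (phi1(x) - e^{A} x), A = -1.
    return zeta(np.abs(x) / d2) * (phi1(x) - x / E)

def c1_norm(d2, n=20001):
    xs = np.linspace(-4*d2, 4*d2, n)
    vals = p(xs, d2)
    deriv = np.gradient(vals, xs)        # finite-difference derivative
    return np.max(np.abs(vals)) + np.max(np.abs(deriv))

norms = [c1_norm(d) for d in (0.2, 0.1, 0.05)]
print(norms)   # the C^1 norm of p shrinks as d2 -> 0
```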

Invertibility of perturbed linear maps

Lemma 4: let \mathbf{T}(\mathbf{x}) = \mathbf{Bx} + \mathbf{p}(\mathbf{x}), where \mathbf{B} is an invertible matrix. If \mathbf{p} \in C^1(\mathbb{R}^n, \mathbb{R}^n) with \mathbf{p}(\mathbf{0}) = \mathbf{0} and its C^1 norm is sufficiently small, specifically \|\mathbf{p}\|_{C^1} \le \gamma where \gamma < \|\mathbf{B}^{-1}\|^{-1}, then \mathbf{T} is a homeomorphism from \mathbb{R}^n to \mathbb{R}^n.

Proof outline: to find an inverse for \mathbf{T}, we solve \mathbf{y} = \mathbf{Bx} + \mathbf{p}(\mathbf{x}) for \mathbf{x}. This is equivalent to finding a fixed point of the map \mathbf{g}_{\mathbf{y}}(\mathbf{x}) = \mathbf{B}^{-1}(\mathbf{y} - \mathbf{p}(\mathbf{x})). We show that \mathbf{g}_{\mathbf{y}} is a contraction. The condition \|\mathbf{p}\|_{C^1} \le \gamma implies that \mathbf{p} is Lipschitz with constant \gamma, so |\mathbf{p}(\mathbf{x}) - \mathbf{p}(\mathbf{z})| \le \gamma |\mathbf{x}-\mathbf{z}|.

\begin{aligned} |\mathbf{g}_{\mathbf{y}}(\mathbf{x}) - \mathbf{g}_{\mathbf{y}}(\mathbf{z})| &= |\mathbf{B}^{-1}(\mathbf{p}(\mathbf{z}) - \mathbf{p}(\mathbf{x}))| \\ &\le \|\mathbf{B}^{-1}\| |\mathbf{p}(\mathbf{z}) - \mathbf{p}(\mathbf{x})| \\ &\le \|\mathbf{B}^{-1}\| \gamma |\mathbf{x}-\mathbf{z}| \end{aligned}

Since \|\mathbf{B}^{-1}\|\gamma < 1, the map is a contraction. The Banach Fixed-Point Theorem guarantees the existence of a unique fixed point \mathbf{x} = \mathbf{T}^{-1}(\mathbf{y}) for each \mathbf{y}. This inverse is shown to be continuous (in fact, Lipschitz). To establish that \mathbf{T}^{-1} is C^1, we examine its differential \mathbf{DT}(\mathbf{x}) = \mathbf{B} + D\mathbf{p}(\mathbf{x}). We write this as \mathbf{B}(\mathbf{I} + \mathbf{B}^{-1}D\mathbf{p}(\mathbf{x})).

The norm condition \|D\mathbf{p}\| \le \gamma < \|\mathbf{B}^{-1}\|^{-1} ensures that \|\mathbf{B}^{-1}D\mathbf{p}(\mathbf{x})\| < 1. Therefore, the matrix (\mathbf{I} + \mathbf{B}^{-1}D\mathbf{p}(\mathbf{x})) is invertible via a convergent Neumann series. This implies \mathbf{DT}(\mathbf{x}) is always invertible, so by the Inverse Function Theorem, \mathbf{T} is a local diffeomorphism, and since it is bijective, it is a global diffeomorphism.
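The contraction argument translates directly into a numerical inversion scheme. In the sketch below, \mathbf{B} and \mathbf{p} are illustrative assumptions satisfying the smallness condition \gamma \|\mathbf{B}^{-1}\| < 1:

```python
import numpy as np

B = np.array([[2.0, 0.0], [0.0, 0.5]])   # invertible (a hyperbolic time-one map)
B_inv = np.linalg.inv(B)

def p(x):
    # A small perturbation with Lipschitz constant 0.1 < 1/||B^{-1}|| = 0.5.
    return 0.1 * np.sin(x)

def T(x):
    return B @ x + p(x)

def T_inverse(y, iters=100):
    # Solve y = Bx + p(x) as the fixed point of g_y(x) = B^{-1}(y - p(x)).
    x = B_inv @ y                  # initial guess: inverse of the linear part
    for _ in range(iters):
        x = B_inv @ (y - p(x))
    return x

y = np.array([1.0, -2.0])
x = T_inverse(y)
print(np.allclose(T(x), y))        # T(T^{-1}(y)) == y
```

The contraction factor here is \|\mathbf{B}^{-1}\|\gamma = 0.2, so the iteration converges geometrically from any starting point.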

Construction of the conjugacy candidate

Lemma 5: Let \mathbf{B} = e^{\mathbf{A}} and let \mathbf{T}(\mathbf{x}) = \mathbf{Bx} + \mathbf{p}(\mathbf{x}) be the map from Lemma 3, with \|\mathbf{p}\|_{C^1} chosen to be sufficiently small. There exists a unique bounded continuous map \mathbf{H}:\mathbb{R}^n \to \mathbb{R}^n of the form \mathbf{H}(\mathbf{x}) = \mathbf{x} + \mathbf{h}(\mathbf{x}) that satisfies the conjugacy equation:

\mathbf{H}(\mathbf{T}(\mathbf{x})) = \mathbf{B}\mathbf{H}(\mathbf{x})

Proof outline: we seek a solution \mathbf{h} in the Banach space \mathbf{X} of bounded continuous functions on \mathbb{R}^n. Substituting \mathbf{H}(\mathbf{x}) = \mathbf{x} + \mathbf{h}(\mathbf{x}) into the conjugacy equation gives a functional equation for \mathbf{h}:

\mathbf{T}(\mathbf{x}) + \mathbf{h}(\mathbf{T}(\mathbf{x})) = \mathbf{B}(\mathbf{x} + \mathbf{h}(\mathbf{x})) \implies \mathbf{h}(\mathbf{T}(\mathbf{x})) = \mathbf{B}\mathbf{h}(\mathbf{x}) - \mathbf{p}(\mathbf{x})

This single equation is split into components along the stable and unstable subspaces. For the stable part, \mathbf{B}_s = e^{\mathbf{A}_s} is a contraction, while for the unstable part, \mathbf{B}_u = e^{\mathbf{A}_u} is an expansion. To obtain a contraction in both cases, we rearrange the unstable equation:

\begin{cases} \mathbf{h}_s (\mathbf{x}) = \mathbf{B}_s \mathbf{h}_s ( \mathbf{T}^{-1}(\mathbf{x})) + \mathbf{p}_s (\mathbf{T}^{-1}(\mathbf{x})) \\ \mathbf{h}_u (\mathbf{x}) = \mathbf{B}_u^{-1}\mathbf{h}_u (\mathbf{T}(\mathbf{x})) + \mathbf{B}_u^{-1} \mathbf{p}_u (\mathbf{x}) \end{cases}

These equations define an operator \mathcal{G} on the space \mathbf{X}. Using the exponential bounds from Lemma 1 and the properties of \mathbf{T} and \mathbf{p}, one can show that if \|\mathbf{p}\|_{C^1} is small enough, \mathcal{G} is a contraction mapping on \mathbf{X}. The Banach Fixed-Point Theorem then yields a unique solution \mathbf{h} \in \mathbf{X}, which in turn defines the map \mathbf{H}.
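In the purely expanding scalar case, iterating the second equation unrolls into an explicit series for \mathbf{h}_u, which can be checked numerically. The map and perturbation below are illustrative assumptions:

```python
import numpy as np

B = np.e                              # linear time-one map e^{A} with A = 1

def p(x):
    # Small perturbation with p(0) = 0, decaying at infinity (illustrative).
    return 0.1 * np.sin(x) * np.exp(-x**2)

def T(x):
    return B * x + p(x)

def h(x, terms=40):
    # Unrolling h(x) = B^{-1} h(T(x)) + B^{-1} p(x) gives the series
    #   h(x) = sum_{k>=0} B^{-(k+1)} p(T^k(x)),
    # which converges because B^{-1} < 1 and p is bounded.
    total, xk = 0.0, x
    for k in range(terms):
        total += B ** (-(k + 1)) * p(xk)
        xk = T(xk)
        if abs(xk) > 1e8:             # p(xk) vanishes from here on
            break
    return total

def H(x):
    return x + h(x)

xs = np.linspace(-2.0, 2.0, 9)
# Verify the conjugacy equation H(T(x)) = B * H(x) on a grid.
print(np.allclose([H(T(x)) for x in xs], [B * H(x) for x in xs]))
```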

Generalized conjugacy construction

Lemma 6 is very similar to Lemma 5 and is demonstrated in the formal proof section.

The conjugacy is a homeomorphism

Lemma 7: The map \mathbf{H} constructed in Lemma 5 is a homeomorphism.

Proof outline: the proof relies on constructing an inverse \mathbf{K} for \mathbf{H}. We consider a “dual” conjugacy problem to find a map \mathbf{K} satisfying \mathbf{T} \circ \mathbf{K} = \mathbf{K} \circ \mathbf{B}. An argument identical to that of Lemma 5 provides a unique continuous solution \mathbf{K}(\mathbf{y}) = \mathbf{y} + \mathbf{k}(\mathbf{y}).

We then examine the composition \mathbf{G} = \mathbf{H} \circ \mathbf{K}. We can show it satisfies \mathbf{G} \circ \mathbf{B} = \mathbf{B} \circ \mathbf{G}:

\mathbf{H} \circ \mathbf{K} (\mathbf{By}) = \mathbf{H} \circ (\mathbf{K} \circ \mathbf{B})(\mathbf{y}) = \mathbf{H} \circ (\mathbf{T} \circ \mathbf{K})(\mathbf{y}) = (\mathbf{H} \circ \mathbf{T}) \circ \mathbf{K}(\mathbf{y}) = (\mathbf{B} \circ \mathbf{H}) \circ \mathbf{K}(\mathbf{y}) = \mathbf{B} \circ (\mathbf{H} \circ \mathbf{K})(\mathbf{y})

The map \mathbf{G} solves the original functional equation with \mathbf{p}=\mathbf{0}. The identity map \mathbf{I} is also a solution. By the uniqueness part of the contraction mapping argument in Lemma 5, we must have \mathbf{H} \circ \mathbf{K} = \mathbf{I}. A symmetric argument shows that \mathbf{K} \circ \mathbf{H} = \mathbf{I}, establishing that \mathbf{K} is the inverse of \mathbf{H}, and therefore \mathbf{H} is a homeomorphism.

From time-one conjugacy to flow equivalence

Lemma 8: the homeomorphism \mathbf{H} conjugates the time-one map \varphi_1 and its linearization e^\mathbf{A} in a neighborhood of the origin. This can be extended to a local flow equivalence. The map \mathcal{H} defined by the integral:

\mathcal{H}(\mathbf{x}) = \int_0^1 e^{-s\mathbf{A}}\mathbf{H}(\varphi(s, \mathbf{x})) \mathrm{d}s

is a local topological flow equivalence in a neighborhood of the origin.

Proof outline: we must show that \mathcal{H}(\varphi(t, \mathbf{x})) = e^{t\mathbf{A}}\mathcal{H}(\mathbf{x}). Starting from the left-hand side, we apply the definition of \mathcal{H} to \varphi(t, \mathbf{x}):

\begin{aligned} \mathcal{H}(\varphi(t, \mathbf{x})) &= \int_0^1 e^{-s\mathbf{A}}\mathbf{H}(\varphi(s, \varphi(t, \mathbf{x}))) \mathrm{d}s \\ &= \int_0^1 e^{-s\mathbf{A}}\mathbf{H}(\varphi(s+t, \mathbf{x})) \mathrm{d}s \end{aligned}

Let \tau = s+t. The integral becomes:

\int_t^{t+1} e^{-(\tau-t)\mathbf{A}}\mathbf{H}(\varphi(\tau, \mathbf{x})) \mathrm{d}\tau = e^{t\mathbf{A}} \int_t^{t+1} e^{-\tau\mathbf{A}}\mathbf{H}(\varphi(\tau, \mathbf{x})) \mathrm{d}\tau

Using the property \mathbf{H}(\varphi_1(\mathbf{z})) = e^\mathbf{A} \mathbf{H}(\mathbf{z}), which implies e^{-\mathbf{A}}\mathbf{H}(\varphi_1(\mathbf{z})) = \mathbf{H}(\mathbf{z}), we can show that the integral \int_t^{t+1} e^{-\tau\mathbf{A}}\mathbf{H}(\varphi(\tau, \mathbf{x})) \mathrm{d}\tau is equal to \int_0^1 e^{-\tau\mathbf{A}}\mathbf{H}(\varphi(\tau, \mathbf{x})) \mathrm{d}\tau. This is done by splitting the integral and performing a change of variables. The result is:

\mathcal{H}(\varphi(t, \mathbf{x})) = e^{t\mathbf{A}}\mathcal{H}(\mathbf{x})

Finally, by setting t=1, we see that \mathcal{H} satisfies the same conjugacy equation as \mathbf{H}. By the uniqueness established in Lemma 5, we must have \mathcal{H}=\mathbf{H} in their common domain of definition. Since \mathbf{H} is a local homeomorphism, so is \mathcal{H}, which concludes the proof.

Formal proof

Setup

We consider a system of autonomous ordinary differential equations described by a vector field f. Let \mathbf{E} be an open subset of \mathbb{R}^n, and let the function f: \mathbf{E} \to \mathbb{R}^n be of class C^1. We are interested in the behavior of the system near an equilibrium point \mathbf{x}_0 \in \mathbf{E}, which is defined by the condition f(\mathbf{x}_0) = \mathbf{0}. The dynamics of the system are governed by the initial value problem

\begin{cases} \dot{\mathbf{x}}(t) = f(\mathbf{x}(t)) \\ \mathbf{x}(0) = \mathbf{x}_{\text{init}} \end{cases}

The solution to this problem, denoted as the flow \varphi(t, \mathbf{x}_{\text{init}}), describes the trajectory of a point starting at \mathbf{x}_{\text{init}}.

The local behavior of the system near the equilibrium \mathbf{x}_0 is intimately related to its linearization. The linear approximation is given by the Jacobian matrix of the vector field evaluated at the equilibrium point:

\mathbf{A} = Df(\mathbf{x}_0)

The Hartman-Grobman theorem applies to the specific case where the equilibrium is hyperbolic. This property is determined by the spectrum of the matrix \mathbf{A}. An equilibrium point \mathbf{x}_0 is hyperbolic if none of the eigenvalues \lambda_1, \dots, \lambda_n of \mathbf{A} have a real part equal to zero:

\Re(\lambda_i) \neq 0, \quad \forall i \in \{1, \dots, n\}

The theorem establishes that the topological structure of the nonlinear flow \varphi(t, \mathbf{x}) in a neighborhood of a hyperbolic equilibrium is equivalent to the structure of the linear flow e^{t\mathbf{A}} near the origin.

The proof of this statement is constructed upon several preparatory lemmas. Without loss of generality, we will assume the equilibrium point has been translated to the origin, so that \mathbf{x}_0 = \mathbf{0}.

Spectral decomposition and coordinate transformation

Lemma 1: let \mathbf{A} be a real hyperbolic n \times n matrix. There exists a linear change of coordinates in \mathbb{R}^n that decomposes the space into a direct sum of a stable subspace \mathbf{E}_s and an unstable subspace \mathbf{E}_u, such that \mathbb{R}^n = \mathbf{E}_s \oplus \mathbf{E}_u.

In this new coordinate system, the matrix \mathbf{A} assumes a block-diagonal form corresponding to this decomposition. Moreover, there exists a constant \alpha > 0 such that for any vector \mathbf{x} decomposed as \mathbf{x} = \mathbf{x}_s + \mathbf{x}_u where \mathbf{x}_s \in \mathbf{E}_s and \mathbf{x}_u \in \mathbf{E}_u, the evolution under the linear flow satisfies the following exponential bounds for all t \ge 0:

\begin{aligned} & |e^{t\mathbf{A}}\mathbf{x}_s| \le e^{-\alpha t} |\mathbf{x}_s| \\ & |e^{-t\mathbf{A}}\mathbf{x}_u| \le e^{-\alpha t} |\mathbf{x}_u| \end{aligned}

Proof: the proof proceeds in two main stages: first, a transformation to the real Jordan canonical form, and second, a scaling transformation to control the nilpotent part of the Jordan form.

The first step is a transformation to the real Jordan canonical form.

Since \mathbf{A} is a real matrix, there exists a real invertible matrix \mathbf{P} that transforms \mathbf{A} into its real Jordan canonical form, \mathbf{J} = \mathbf{P}^{-1}\mathbf{AP}.

This matrix \mathbf{J} is block-diagonal, where each block corresponds either to a real eigenvalue or a pair of complex conjugate eigenvalues. The structure of \mathbf{J} can be additively decomposed as:

\mathbf{J} = \mathbf{D} + \mathbf{N}

Here, \mathbf{D} is a block-diagonal matrix containing the diagonal parts of the Jordan blocks. For a real eigenvalue \lambda_i, the corresponding block in \mathbf{D} is \lambda_i \mathbf{I}.

For a complex conjugate pair of eigenvalues a_k \pm i b_k, the corresponding block in \mathbf{D} is a block-diagonal matrix of 2 \times 2 matrices of the form:

\begin{bmatrix} a_k & b_k \\ -b_k & a_k \end{bmatrix}

The matrix \mathbf{N} is strictly upper-triangular and therefore nilpotent; it contains the off-diagonal identity matrices within each Jordan block.

The hyperbolicity of \mathbf{A} implies that the real parts of its eigenvalues are non-zero.

Let \lambda_{\max,s} be the maximum real part among all eigenvalues with negative real part, and \lambda_{\min,u} be the minimum real part among all eigenvalues with positive real part.

By hyperbolicity, \lambda_{\max,s} < 0 < \lambda_{\min,u}. We can choose a constant \alpha_0 such that \lambda_{\max,s} < -\alpha_0 < 0 < \alpha_0 < \lambda_{\min,u}.

The presence of the nilpotent matrix \mathbf{N} is an obstruction to directly obtaining the desired exponential bounds, as the term e^{t\mathbf{N}} introduces polynomial growth in t.

The second step involves a scaling transformation.

To control the influence of the nilpotent part, we introduce a second change of variables via a diagonal scaling matrix \mathbf{Q}_\delta, which depends on a parameter \delta > 0.

For each Jordan block \mathbf{J}_i of size m_i \times m_i, the corresponding block in \mathbf{Q}_\delta is a diagonal matrix \mathbf{S}_i = \text{diag}(1, \delta, \delta^2, \dots, \delta^{m_i-1}) if the eigenvalue is real, or a block-diagonal matrix with blocks \text{diag}(1,1,\delta,\delta,\dots) if the eigenvalues are complex.

In the new coordinates defined by the overall transformation \mathbf{P}\mathbf{Q}_\delta, the matrix becomes:

\begin{aligned} \mathbf{A}^\prime &= (\mathbf{P}\mathbf{Q}_\delta)^{-1} \mathbf{A} (\mathbf{P}\mathbf{Q}_\delta) = \mathbf{Q}_\delta^{-1} (\mathbf{P}^{-1}\mathbf{AP}) \mathbf{Q}_\delta \\ &= \mathbf{Q}_\delta^{-1} (\mathbf{D} + \mathbf{N}) \mathbf{Q}_\delta = \mathbf{Q}_\delta^{-1}\mathbf{D}\mathbf{Q}_\delta + \mathbf{Q}_\delta^{-1}\mathbf{N}\mathbf{Q}_\delta \end{aligned}

Since \mathbf{D} and \mathbf{Q}_\delta are block-diagonal with compatible structures, they commute, so \mathbf{Q}_\delta^{-1}\mathbf{D}\mathbf{Q}_\delta = \mathbf{D}. The conjugation of \mathbf{N} by \mathbf{Q}_\delta scales its non-zero entries. Specifically, the entries of 1 on the superdiagonal of each Jordan block in \mathbf{N} are multiplied by \delta. Let us denote the transformed nilpotent matrix by \mathbf{N}_\delta = \mathbf{Q}_\delta^{-1}\mathbf{N}\mathbf{Q}_\delta. The norm of this new matrix satisfies \|\mathbf{N}_\delta\| \le C\delta for some constant C, and \|\mathbf{N}_\delta\| \to 0 as \delta \to 0.

Our transformed matrix is now \mathbf{A}^\prime = \mathbf{D} + \mathbf{N}_\delta. We can choose \delta > 0 to be sufficiently small such that \|\mathbf{N}_\delta\| < \alpha_0. Let us define a new constant \alpha = \alpha_0 - \|\mathbf{N}_\delta\| > 0.
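The effect of the conjugation by \mathbf{Q}_\delta on the nilpotent part is easy to verify directly. The 3 \times 3 Jordan block below is an illustrative assumption:

```python
import numpy as np

# A 3x3 Jordan block J = D + N with eigenvalue -2.
D = -2.0 * np.eye(3)
N = np.diag([1.0, 1.0], k=1)          # superdiagonal ones
J = D + N

def rescale(delta):
    # Q_delta = diag(1, delta, delta^2); conjugation leaves D fixed and
    # multiplies each superdiagonal entry of N by delta.
    Q = np.diag([1.0, delta, delta**2])
    return np.linalg.inv(Q) @ J @ Q

for delta in (1.0, 0.1, 0.01):
    A_prime = rescale(delta)
    N_delta = A_prime - D
    print(delta, np.linalg.norm(N_delta, 2))   # ||N_delta|| = delta
```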

We now analyze the flow generated by \mathbf{A}^\prime. Consider a vector \mathbf{z}_s in the stable subspace. The time derivative of its squared norm is

\begin{aligned} \frac{\mathrm d}{\mathrm dt} |\mathbf{z}_s|^2 &= \frac{\mathrm d}{\mathrm dt} (\mathbf{z}_s^T \mathbf{z}_s) = \dot{\mathbf{z}}_s^T \mathbf{z}_s + \mathbf{z}_s^T \dot{\mathbf{z}}_s \\ &= \mathbf{z}_s^T (\mathbf{A}^\prime_s)^T \mathbf{z}_s + \mathbf{z}_s^T \mathbf{A}^\prime_s \mathbf{z}_s = \mathbf{z}_s^T ((\mathbf{A}^\prime_s)^T + \mathbf{A}^\prime_s) \mathbf{z}_s \end{aligned}

The largest eigenvalue of the symmetric matrix (\mathbf{A}^\prime_s)^T + \mathbf{A}^\prime_s is bounded by 2(\lambda_{\max,s} + \|\mathbf{N}_{\delta,s}\|).

With our choice of \delta, this is bounded above by 2(-\alpha_0 + \|\mathbf{N}_\delta\|) = -2\alpha. This leads to the differential inequality

\frac{\mathrm d}{\mathrm dt} |\mathbf{z}_s|^2 \le -2\alpha |\mathbf{z}_s|^2

By Gronwall’s inequality, integrating this from 0 to t gives |\mathbf{z}_s(t)|^2 \le e^{-2\alpha t} |\mathbf{z}_s(0)|^2, which implies the desired result for the stable part:

|e^{t\mathbf{A}^\prime_s}\mathbf{z}_s| \le e^{-\alpha t} |\mathbf{z}_s|

A parallel argument holds for the unstable subspace \mathbf{E}_u. We consider the time-reversed dynamics for a vector \mathbf{z}_u \in \mathbf{E}_u. The matrix -\mathbf{A}^\prime_u = -\mathbf{D}_u - \mathbf{N}_{\delta,u} has eigenvalues whose real parts are bounded above by -\lambda_{\min,u} < -\alpha_0. Following the same procedure, we find that for t \ge 0:

|e^{-t\mathbf{A}^\prime_u}\mathbf{z}_u| \le e^{-\alpha t} |\mathbf{z}_u|

This completes the proof of the lemma.

We have found a coordinate system where the matrix of the linear system has the required block-diagonal structure and satisfies the specified exponential bounds.

For the remainder of the theorem’s proof, we will assume that the coordinates have been chosen such that \mathbf{A} itself has these properties.

Localization of the flow

This lemma provides a quantitative guarantee that trajectories of the nonlinear system starting sufficiently close to the equilibrium point will not escape a prescribed neighborhood within a fixed time interval. This control is essential for analyzing the local dynamics.

Let \mathbf{D}_r(\mathbf{x}_0) = \{ \mathbf{x} \in \mathbb{R}^n : |\mathbf{x} - \mathbf{x}_0| \le r \} denote the closed ball of radius r centered at \mathbf{x}_0.

Lemma 2: let \mathbf{E} \subset \mathbb{R}^n be an open set containing a ball \mathbf{D}_d(\mathbf{x}_0) for some d > 0. Let f \in C^1(\mathbf{E}, \mathbb{R}^n) be a vector field with a fixed point at \mathbf{x}_0, so f(\mathbf{x}_0) = \mathbf{0}. Let \varphi(t, \mathbf{x}) be the flow of the differential equation \dot{\mathbf{x}} = f(\mathbf{x}).

Then there exists a constant c_1 \in (0, 1), which depends on f and d, such that for any radius d_1 with 0 < d_1 \le d, if an initial condition \mathbf{x} is chosen within the smaller ball \mathbf{D}_{c_1 d_1}(\mathbf{x}_0), the resulting trajectory \varphi(t, \mathbf{x}) remains within the larger ball \mathbf{D}_{d_1}(\mathbf{x}_0) for all times t \in [-1, 1].

Proof: without loss of generality, we shift the coordinate system so that the fixed point is at the origin, \mathbf{x}_0 = \mathbf{0}.

First, we establish a local bound on the magnitude of the vector field f. Since f is of class C^1 on the open set \mathbf{E}, its derivative Df is continuous and therefore bounded on any compact subset of \mathbf{E}. Let us consider the compact ball \mathbf{D}_d(\mathbf{0}). We define the constant M as the supremum of the norm of the Jacobian on this ball:

M = \sup_{\mathbf{z} \in \mathbf{D}_d(\mathbf{0})} \|Df(\mathbf{z})\|

For any vector \mathbf{x} \in \mathbf{D}_d(\mathbf{0}), we can relate f(\mathbf{x}) to f(\mathbf{0}) using the Fundamental Theorem of Calculus. Consider the auxiliary function \mathbf{g}(s) = f(s\mathbf{x}) for s \in [0, 1]. Since f(\mathbf{0})=\mathbf{0}, we have:

\begin{aligned} |f(\mathbf{x})| &= |f(\mathbf{x}) - f(\mathbf{0})| = |\mathbf{g}(1) - \mathbf{g}(0)| = \left| \int_0^1 \mathbf{g}^\prime(s) \, \mathrm{d}s \right| \\ &= \left| \int_0^1 Df(s\mathbf{x})[\mathbf{x}] \, \mathrm{d}s \right| \end{aligned}

By applying the triangle inequality for integrals and the definition of the operator norm, we obtain:

|f(\mathbf{x})| \le \int_0^1 \|Df(s\mathbf{x})\| |\mathbf{x}| \, \mathrm{d}s \le \int_0^1 M |\mathbf{x}| \, \mathrm{d}s = M|\mathbf{x}|

This inequality establishes that f is locally Lipschitz-like with respect to the origin. Now, we use this property to bound the flow \varphi(t, \mathbf{x}). The solution to the initial value problem satisfies the integral equation:

\varphi(t, \mathbf{x}) = \mathbf{x} + \int_0^t f(\varphi(s, \mathbf{x})) \, \mathrm{d}s

Taking the norm of both sides and applying the triangle inequality, we get:

|\varphi(t, \mathbf{x})| \le |\mathbf{x}| + \left| \int_0^t |f(\varphi(s, \mathbf{x}))| \, \mathrm{d}s \right|

As long as the trajectory \varphi(s, \mathbf{x}) remains within the ball \mathbf{D}_d(\mathbf{0}), we can apply our bound on f:

|\varphi(t, \mathbf{x})| \le |\mathbf{x}| + \left| \int_0^t M |\varphi(s, \mathbf{x})| \, \mathrm{d}s \right|

This is an integral inequality of the Gronwall type.

The integral form of Gronwall’s inequality states that if a function u(t) satisfies u(t) \le A + \int_0^t B u(s) \mathrm{d}s for non-negative constants A, B, then u(t) \le A e^{Bt}.

Applying this to our situation (for t \ge 0) gives:

|\varphi(t, \mathbf{x})| \le |\mathbf{x}| e^{Mt}

A similar argument for t < 0 yields |\varphi(t, \mathbf{x})| \le |\mathbf{x}| e^{M|t|}.

Now we can choose the constant c_1 to satisfy the lemma claim.

Let us set c_1 = e^{-M}. By construction, 0 < c_1 < 1.

If we take an initial condition \mathbf{x} \in \mathbf{D}_{c_1 d_1}(\mathbf{0}), meaning |\mathbf{x}| \le c_1 d_1, then for any time t \in [-1, 1], we have:

|\varphi(t, \mathbf{x})| \le |\mathbf{x}| e^{M|t|} \le (c_1 d_1) e^{M|t|} = (e^{-M} d_1) e^{M|t|} = d_1 e^{M(|t|-1)}

Since |t| \le 1, the exponent M(|t|-1) is non-positive, which implies e^{M(|t|-1)} \le 1. Therefore, we conclude:

|\varphi(t, \mathbf{x})| \le d_1

This confirms that the trajectory starting in \mathbf{D}_{c_1 d_1}(\mathbf{0}) remains within \mathbf{D}_{d_1}(\mathbf{0}) for the entire time interval t \in [-1, 1].

According to the standard theory of ordinary differential equations, a solution can be extended as long as it remains within a compact subset of its domain of definition.

Since the trajectory is confined to the compact set \mathbf{D}_{d_1}(\mathbf{0}) \subset \mathbf{E}, its existence for t \in [-1,1] is guaranteed.

Localized flow via a cutoff function

We define the function space norms and the cutoff function that will be used.

For a function \mathbf{u}: \mathbf{E} \to \mathbb{R}^n, the C^0 norm is the supremum norm:

\|\mathbf{u}\|_{C^0(\mathbf{E})} = \sup_{\mathbf{x} \in \mathbf{E}} |\mathbf{u}(\mathbf{x})|

For a function \mathbf{u} \in C^1(\mathbf{E}, \mathbb{R}^n), the C^1 norm combines the function’s norm and the norm of its derivative:

\|\mathbf{u}\|_{C^1(\mathbf{E})} = \|\mathbf{u}\|_{C^0(\mathbf{E})} + \|D\mathbf{u}\|_{C^0(\mathbf{E})}

A cutoff function (or bump function) is a smooth function designed to smoothly transition from a value of 1 to 0.

Let \zeta: [0, \infty) \to \mathbb{R} be a smooth function such that 0 \le \zeta(s) \le 1 for all s, \zeta(s) = 1 for s \in [0, 1], and \zeta(s) = 0 for s \ge 2. We can also require that its derivative is bounded, for instance, |\zeta'(s)| \le 2.

We define a cutoff function in \mathbb{R}^n centered at the origin with radius d as \beta_d(\mathbf{x}) = \zeta(|\mathbf{x}|/d). This function \beta_d \in C^\infty(\mathbb{R}^n) has the following properties:

\beta_d(\mathbf{x}) = \begin{cases} 1 & \text{if } |\mathbf{x}| \le d \\ 0 & \text{if } |\mathbf{x}| \ge 2d \end{cases}

By the chain rule, its derivative is bounded: |D\beta_d(\mathbf{x})| = |\zeta'(|\mathbf{x}|/d) \cdot \frac{\mathbf{x}}{d|\mathbf{x}|}| \le \frac{2}{d}.
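A minimal sketch of such a cutoff, using a piecewise-polynomial stand-in for the smooth profile \zeta (it is only C^2 at the break points, which already suffices for the C^1 construction that follows):

```python
import numpy as np

def zeta(s):
    # Quintic smoothstep profile: 1 on [0, 1], 0 on [2, inf), C^2 in between.
    # Its derivative peaks at 30 u^2 (1-u)^2 |_{u=1/2} = 15/8 < 2.
    u = np.clip(s - 1.0, 0.0, 1.0)
    return 1.0 - (10*u**3 - 15*u**4 + 6*u**5)

d = 0.25
xs = np.linspace(-3*d, 3*d, 100001)
beta = zeta(np.abs(xs) / d)

assert np.all(beta[np.abs(xs) <= d] == 1.0)     # identically 1 inside D_d
assert np.all(beta[np.abs(xs) >= 2*d] == 0.0)   # vanishes outside D_{2d}

# Check the chain-rule bound |D beta_d| <= 2/d on the grid.
slope = np.max(np.abs(np.gradient(beta, xs)))
print(slope <= 2.0 / d)
```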


This lemma is a key construction step. It demonstrates that we can create a globally defined map that is identical to the true time-one map \varphi_1 in a small neighborhood of the equilibrium, while globally being a small C^1 perturbation of the linear time-one map e^{\mathbf{A}}.

Lemma 3: Let the setup be as in Lemma 2. For any given tolerance \alpha > 0, there exists a radius d_2 > 0 and a function \mathbf{p} \in C^1(\mathbb{R}^n, \mathbb{R}^n) such that:

  1. \mathbf{p}(\mathbf{x}_0) = \mathbf{0} and the derivative at the fixed point is zero, D\mathbf{p}(\mathbf{x}_0) = \mathbf{0},
  2. the global C^1 norm of the perturbation is controlled by the tolerance: \|\mathbf{p}\|_{C^1(\mathbb{R}^n)} < \alpha,
  3. in the neighborhood \mathbf{D}_{d_2}(\mathbf{x}_0), the time-one map of the nonlinear flow is given by the linear map plus the perturbation:

\varphi_1(\mathbf{x}) = \varphi(1, \mathbf{x}) = e^{\mathbf{A}}(\mathbf{x} - \mathbf{x}_0) + \mathbf{p}(\mathbf{x}), \quad \forall \mathbf{x} \in \mathbf{D}_{d_2}(\mathbf{x}_0)

Proof: again, we assume without loss of generality that \mathbf{x}_0 = \mathbf{0}. We denote the linear time-one map by \mathbf{B} = e^{\mathbf{A}}.

The derivative of the flow with respect to the initial condition, \mathbf{L}(t, \mathbf{x}) = D_{\mathbf{x}}\varphi(t, \mathbf{x}), satisfies the first variational equation:

\begin{cases} \dot{\mathbf{L}}(t, \mathbf{x}) = Df(\varphi(t, \mathbf{x}))\mathbf{L}(t, \mathbf{x}) \\ \mathbf{L}(0, \mathbf{x}) = \mathbf{I} \end{cases}

At the fixed point \mathbf{x}=\mathbf{0}, we have \varphi(t, \mathbf{0})=\mathbf{0}.

The variational equation simplifies to:

\dot{\mathbf{L}}(t, \mathbf{0}) = Df(\mathbf{0})\mathbf{L}(t, \mathbf{0}) = \mathbf{A}\mathbf{L}(t, \mathbf{0})

with the initial condition \mathbf{L}(0, \mathbf{0}) = \mathbf{I}. The unique solution is \mathbf{L}(t, \mathbf{0}) = e^{t\mathbf{A}}.

Evaluating at t=1 gives the derivative of the time-one map at the origin:

D\varphi_1(\mathbf{0}) = \mathbf{L}(1, \mathbf{0}) = e^{\mathbf{A}} = \mathbf{B}
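This identity is easy to check numerically. The sketch below integrates the variational equation with a classical Runge-Kutta scheme for an illustrative diagonal hyperbolic matrix (chosen for convenience, not taken from the text) and compares \mathbf{L}(1, \mathbf{0}) with e^{\mathbf{A}}:

```python
import numpy as np

# Illustrative hyperbolic Jacobian, diagonal so that e^{tA} is known in closed form.
A = np.diag([-1.0, 2.0])

def variational_L(A, T=1.0, steps=1000):
    # Integrate L' = A L, L(0) = I with the classical Runge-Kutta scheme.
    L = np.eye(A.shape[0])
    h = T / steps
    for _ in range(steps):
        k1 = A @ L
        k2 = A @ (L + h/2 * k1)
        k3 = A @ (L + h/2 * k2)
        k4 = A @ (L + h * k3)
        L = L + h/6 * (k1 + 2*k2 + 2*k3 + k4)
    return L

L1 = variational_L(A)                   # L(1, 0), i.e. the derivative of phi_1 at 0
expA = np.diag(np.exp([-1.0, 2.0]))     # e^{A} for this diagonal A
err = np.max(np.abs(L1 - expA))         # RK4 discretization error, tiny
```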

Now, consider the function representing the nonlinear part of the time-one map, \mathbf{g}(\mathbf{x}) = \varphi_1(\mathbf{x}) - \mathbf{B}\mathbf{x}. From the properties of \varphi_1, we know that \mathbf{g}(\mathbf{0}) = \varphi_1(\mathbf{0}) - \mathbf{B}\mathbf{0} = \mathbf{0} - \mathbf{0} = \mathbf{0}. Furthermore, its derivative at the origin is D\mathbf{g}(\mathbf{0}) = D\varphi_1(\mathbf{0}) - \mathbf{B} = \mathbf{B} - \mathbf{B} = \mathbf{0}.
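These two properties of \mathbf{g} can be observed numerically. For a toy planar field with Df(\mathbf{0}) = \mathbf{A} (the quadratic terms below are illustrative, not from the text), the ratio |\mathbf{g}(\mathbf{z})|/|\mathbf{z}| shrinks as \mathbf{z} \to \mathbf{0}, as expected when \mathbf{g}(\mathbf{0}) = \mathbf{0} and D\mathbf{g}(\mathbf{0}) = \mathbf{0}:

```python
import numpy as np

A = np.diag([-1.0, 2.0])
B = np.diag(np.exp([-1.0, 2.0]))   # e^{A} for this diagonal A

def f(z):
    # Illustrative field with f(0) = 0 and Df(0) = A (quadratic remainder).
    x, y = z
    return np.array([-x + y**2, 2.0*y - x**2])

def phi1(z0, steps=2000):
    # Time-one map of z' = f(z), computed with classical RK4.
    z, h = np.array(z0, dtype=float), 1.0/steps
    for _ in range(steps):
        k1 = f(z); k2 = f(z + h/2*k1); k3 = f(z + h/2*k2); k4 = f(z + h*k3)
        z = z + h/6*(k1 + 2*k2 + 2*k3 + k4)
    return z

def g(z):
    # Nonlinear part of the time-one map: g = phi_1 - B.
    return phi1(z) - B @ np.asarray(z, dtype=float)

# |g(z)|/|z| shrinks roughly linearly in |z| for a quadratic nonlinearity.
ratios = []
for r in (1e-1, 1e-2, 1e-3):
    z = np.array([r, r])
    ratios.append(np.linalg.norm(g(z)) / np.linalg.norm(z))
```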

We construct the global perturbation \mathbf{p}(\mathbf{x}) using the cutoff function \beta_{d_2} from the preliminary definitions:

\mathbf{p}(\mathbf{x}) = \beta_{d_2}(\mathbf{x}) \mathbf{g}(\mathbf{x}) = \beta_{d_2}(\mathbf{x}) (\varphi_1(\mathbf{x}) - \mathbf{B}\mathbf{x})

By construction, for \mathbf{x} \in \mathbf{D}_{d_2}(\mathbf{0}), we have \beta_{d_2}(\mathbf{x})=1, so \mathbf{p}(\mathbf{x}) = \varphi_1(\mathbf{x}) - \mathbf{B}\mathbf{x}, which rearranges to the third claim of the lemma. The function \mathbf{p}(\mathbf{x}) is identically zero for |\mathbf{x}| \ge 2d_2.

The main task is to show that for any \alpha > 0, we can choose d_2 small enough so that \|\mathbf{p}\|_{C^1} < \alpha.

Since \mathbf{g}(\mathbf{0})=\mathbf{0} and D\mathbf{g}(\mathbf{0})=\mathbf{0} and both are continuous, for any \eta > 0, there exists a radius \delta_\eta > 0 such that for all \mathbf{x} with |\mathbf{x}| \le \delta_\eta, we have:

|\mathbf{g}(\mathbf{x})| \le \eta |\mathbf{x}| \quad \text{and} \quad \|D\mathbf{g}(\mathbf{x})\| \le \eta

Let us fix a target tolerance \alpha > 0. We choose \eta = \alpha/6. This determines the required radius \delta_\eta. We then choose our construction radius d_2 to be smaller than both \delta_\eta/2 and 1/2. With this choice, the support of \mathbf{p} (where it can be non-zero) is the ball \mathbf{D}_{2d_2}, which is contained within \mathbf{D}_{\delta_\eta}. Therefore, the bounds on \mathbf{g} and D\mathbf{g} hold throughout the support of \mathbf{p}.

We now estimate the C^1 norm of \mathbf{p}:

C^0 norm:

\|\mathbf{p}\|_{C^0} = \sup_{\mathbf{x} \in \mathbb{R}^n} |\mathbf{p}(\mathbf{x})| = \sup_{|\mathbf{x}| \le 2d_2} |\beta_{d_2}(\mathbf{x})\mathbf{g}(\mathbf{x})| \le \sup_{|\mathbf{x}| \le 2d_2} |\mathbf{g}(\mathbf{x})| \le \sup_{|\mathbf{x}| \le 2d_2} \eta|\mathbf{x}| = 2d_2\eta

C^0 norm of the derivative: the derivative of \mathbf{p} is found using the product rule:

D\mathbf{p}(\mathbf{x}) = \mathbf{g}(\mathbf{x})\,D\beta_{d_2}(\mathbf{x}) + \beta_{d_2}(\mathbf{x})D\mathbf{g}(\mathbf{x})

where the first term is the outer product of the column vector \mathbf{g}(\mathbf{x}) with the row vector D\beta_{d_2}(\mathbf{x}).

For simplicity, we use an operator norm bound:

\|D\mathbf{p}(\mathbf{x})\| \le \|D\beta_{d_2}(\mathbf{x})\| |\mathbf{g}(\mathbf{x})| + |\beta_{d_2}(\mathbf{x})| \|D\mathbf{g}(\mathbf{x})\|

Using our bounds, |D\beta_{d_2}| \le 2/d_2, |\mathbf{g}(\mathbf{x})| \le \eta|\mathbf{x}| \le 2d_2\eta, and \|D\mathbf{g}\| \le \eta:

\|D\mathbf{p}\|_{C^0} = \sup_{\mathbf{x} \in \mathbb{R}^n} \|D\mathbf{p}(\mathbf{x})\| \le \left(\frac{2}{d_2}\right)(2d_2\eta) + (1)(\eta) = 4\eta + \eta = 5\eta

Combining these estimates gives the C^1 norm:

\|\mathbf{p}\|_{C^1} = \|\mathbf{p}\|_{C^0} + \|D\mathbf{p}\|_{C^0} \le 2d_2\eta + 5\eta = (2d_2 + 5)\eta

Since we chose d_2 \le 1/2, we have 2d_2+5 \le 1+5 = 6. Therefore:

\|\mathbf{p}\|_{C^1} \le 6\eta = 6(\alpha/6) = \alpha

(To obtain the strict inequality \|\mathbf{p}\|_{C^1} < \alpha claimed in the lemma, it suffices to run the same argument with a slightly smaller choice such as \eta = \alpha/7.)

We have shown that for any given \alpha > 0, we can construct a function \mathbf{p} satisfying all the conditions of the lemma. This completes the proof.

Invertibility of perturbed linear maps

This lemma establishes that a linear map, when perturbed by a sufficiently small nonlinear function in the C^1 sense, retains its invertibility and, in fact, becomes a global diffeomorphism.

Lemma 4: Let \mathbf{B} be an invertible n \times n real matrix. Let \mathbf{p}: \mathbb{R}^n \to \mathbb{R}^n be a function of class C^1 such that \mathbf{p}(\mathbf{0}) = \mathbf{0}. Consider the map \mathbf{T}: \mathbb{R}^n \to \mathbb{R}^n defined by:

\mathbf{T}(\mathbf{x}) = \mathbf{B}\mathbf{x} + \mathbf{p}(\mathbf{x})

Suppose the C^1 norm of the perturbation \mathbf{p} is bounded by a constant \gamma that satisfies the condition

\|\mathbf{p}\|_{C^1(\mathbb{R}^n)} \le \gamma \quad \text{with} \quad \gamma < \frac{1}{\|\mathbf{B}^{-1}\|}

For the clarity of the proof, we will often use the specific, more restrictive choice \gamma = \frac{1}{2\|\mathbf{B}^{-1}\|}.

Under this condition, the map \mathbf{T} is a C^1-diffeomorphism of \mathbb{R}^n onto itself. Furthermore, its inverse, \mathbf{T}^{-1}, is globally Lipschitz continuous with a Lipschitz constant bounded by 2\|\mathbf{B}^{-1}\|.

Proof: the proof is organized into two main parts. First, we establish that \mathbf{T} is a homeomorphism by showing it is a bijection with a continuous inverse. This is accomplished using the Banach Fixed-Point Theorem. Second, we prove that the inverse is of class C^1 by invoking the Inverse Function Theorem, which requires showing that the derivative of \mathbf{T} is invertible everywhere.

Our immediate goal is to show that for any given \mathbf{y} \in \mathbb{R}^n, the equation \mathbf{T}(\mathbf{x}) = \mathbf{y} has a unique solution for \mathbf{x}. We can rearrange the equation as:

\mathbf{B}\mathbf{x} = \mathbf{y} - \mathbf{p}(\mathbf{x})

Since \mathbf{B} is invertible, we can write this as a fixed-point problem:

\mathbf{x} = \mathbf{B}^{-1}(\mathbf{y} - \mathbf{p}(\mathbf{x}))

For a fixed \mathbf{y}, let us define the operator \mathbf{g}_{\mathbf{y}}: \mathbb{R}^n \to \mathbb{R}^n by \mathbf{g}_{\mathbf{y}}(\mathbf{x}) = \mathbf{B}^{-1}(\mathbf{y} - \mathbf{p}(\mathbf{x})). A solution to our problem is a fixed point of \mathbf{g}_{\mathbf{y}}. We will now show that \mathbf{g}_{\mathbf{y}} is a contraction mapping on the complete metric space (\mathbb{R}^n, |\cdot|).

First, we must establish that the function \mathbf{p} is Lipschitz continuous. The condition \|\mathbf{p}\|_{C^1(\mathbb{R}^n)} \le \gamma implies that the norm of its derivative is uniformly bounded: \|D\mathbf{p}(\mathbf{z})\| \le \gamma for all \mathbf{z} \in \mathbb{R}^n. By the mean value theorem for vector-valued functions, for any two points \mathbf{x}_1, \mathbf{x}_2 \in \mathbb{R}^n, we have:

|\mathbf{p}(\mathbf{x}_1) - \mathbf{p}(\mathbf{x}_2)| \le \left(\sup_{s \in [0,1]} \|D\mathbf{p}(\mathbf{x}_2 + s(\mathbf{x}_1 - \mathbf{x}_2))\|\right) |\mathbf{x}_1 - \mathbf{x}_2| \le \gamma |\mathbf{x}_1 - \mathbf{x}_2|

Now we check the contraction property for \mathbf{g}_{\mathbf{y}}. Let \mathbf{x}_1, \mathbf{x}_2 \in \mathbb{R}^n:

\begin{aligned} |\mathbf{g}_{\mathbf{y}}(\mathbf{x}_1) - \mathbf{g}_{\mathbf{y}}(\mathbf{x}_2)| &= |\mathbf{B}^{-1}(\mathbf{y} - \mathbf{p}(\mathbf{x}_1)) - \mathbf{B}^{-1}(\mathbf{y} - \mathbf{p}(\mathbf{x}_2))| \\ &= |\mathbf{B}^{-1}(\mathbf{p}(\mathbf{x}_2) - \mathbf{p}(\mathbf{x}_1))| \\ &\le \|\mathbf{B}^{-1}\| |\mathbf{p}(\mathbf{x}_1) - \mathbf{p}(\mathbf{x}_2)| \\ &\le \|\mathbf{B}^{-1}\| \gamma |\mathbf{x}_1 - \mathbf{x}_2| \end{aligned}

Using our specific choice of \gamma = \frac{1}{2\|\mathbf{B}^{-1}\|}, the Lipschitz constant for \mathbf{g}_{\mathbf{y}} is \|\mathbf{B}^{-1}\|\gamma = 1/2. Since this constant is less than 1, \mathbf{g}_{\mathbf{y}} is a contraction mapping.

By the Banach fixed-point theorem, for every \mathbf{y} \in \mathbb{R}^n, the map \mathbf{g}_{\mathbf{y}} has a unique fixed point.

This establishes the existence of a unique solution \mathbf{x} for every \mathbf{y}, so the function \mathbf{T} is a bijection. We can define the inverse map \mathbf{T}^{-1}: \mathbb{R}^n \to \mathbb{R}^n that assigns to each \mathbf{y} its unique preimage \mathbf{x}.
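The fixed-point construction is directly implementable. The sketch below inverts \mathbf{T} by iterating \mathbf{g}_{\mathbf{y}}; the matrix \mathbf{B} and the perturbation \mathbf{p} are illustrative choices satisfying the smallness hypothesis, not data from the text:

```python
import numpy as np

B = np.diag([0.5, 3.0])          # illustrative invertible linear part
Binv = np.linalg.inv(B)          # ||B^{-1}|| = 2, so 1/||B^{-1}|| = 0.5

def p(x):
    # Small perturbation: p(0) = 0, Lipschitz constant 0.1 < 1/||B^{-1}||.
    return 0.1 * np.sin(x)

def T(x):
    return B @ x + p(x)

def T_inverse(y, iters=100):
    # Banach iteration x <- B^{-1}(y - p(x)); contraction factor ||B^{-1}|| * 0.1 = 0.2.
    x = np.zeros_like(y)
    for _ in range(iters):
        x = Binv @ (y - p(x))
    return x

y = np.array([1.0, -2.0])
x = T_inverse(y)
residual = np.linalg.norm(T(x) - y)   # ~ 0: x is the unique preimage of y
```

One can also check the Lipschitz bound proved below: preimages of nearby points satisfy |\mathbf{T}^{-1}(\mathbf{y}_1) - \mathbf{T}^{-1}(\mathbf{y}_2)| \le 2\|\mathbf{B}^{-1}\| |\mathbf{y}_1 - \mathbf{y}_2|, which is 4|\mathbf{y}_1 - \mathbf{y}_2| for this \mathbf{B}.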

To show that \mathbf{T} is a homeomorphism, we must also show that \mathbf{T}^{-1} is continuous. We prove a stronger result: \mathbf{T}^{-1} is globally Lipschitz continuous. Let \mathbf{y}_1, \mathbf{y}_2 \in \mathbb{R}^n, and let their preimages be \mathbf{x}_1 = \mathbf{T}^{-1}(\mathbf{y}_1) and \mathbf{x}_2 = \mathbf{T}^{-1}(\mathbf{y}_2). From the definition of \mathbf{T}, we have:

\mathbf{y}_1 - \mathbf{y}_2 = (\mathbf{B}\mathbf{x}_1 + \mathbf{p}(\mathbf{x}_1)) - (\mathbf{B}\mathbf{x}_2 + \mathbf{p}(\mathbf{x}_2)) = \mathbf{B}(\mathbf{x}_1 - \mathbf{x}_2) + (\mathbf{p}(\mathbf{x}_1) - \mathbf{p}(\mathbf{x}_2))

Rearranging and applying \mathbf{B}^{-1}:

\mathbf{x}_1 - \mathbf{x}_2 = \mathbf{B}^{-1}(\mathbf{y}_1 - \mathbf{y}_2) - \mathbf{B}^{-1}(\mathbf{p}(\mathbf{x}_1) - \mathbf{p}(\mathbf{x}_2))

Taking norms and using the triangle inequality:

\begin{aligned} |\mathbf{x}_1 - \mathbf{x}_2| &\le \|\mathbf{B}^{-1}\| |\mathbf{y}_1 - \mathbf{y}_2| + \|\mathbf{B}^{-1}\| |\mathbf{p}(\mathbf{x}_1) - \mathbf{p}(\mathbf{x}_2)| \\ &\le \|\mathbf{B}^{-1}\| |\mathbf{y}_1 - \mathbf{y}_2| + \|\mathbf{B}^{-1}\| \gamma |\mathbf{x}_1 - \mathbf{x}_2| \\ &\le \|\mathbf{B}^{-1}\| |\mathbf{y}_1 - \mathbf{y}_2| + \frac{1}{2} |\mathbf{x}_1 - \mathbf{x}_2| \end{aligned}

Subtracting \frac{1}{2}|\mathbf{x}_1 - \mathbf{x}_2| from both sides yields:

\frac{1}{2}|\mathbf{x}_1 - \mathbf{x}_2| \le \|\mathbf{B}^{-1}\| |\mathbf{y}_1 - \mathbf{y}_2|

Finally, we arrive at the Lipschitz condition for \mathbf{T}^{-1}:

|\mathbf{T}^{-1}(\mathbf{y}_1) - \mathbf{T}^{-1}(\mathbf{y}_2)| = |\mathbf{x}_1 - \mathbf{x}_2| \le 2\|\mathbf{B}^{-1}\| |\mathbf{y}_1 - \mathbf{y}_2|

This shows that \mathbf{T}^{-1} is globally Lipschitz continuous, which implies it is continuous. Since \mathbf{T} is a continuous bijection with a continuous inverse, it is a homeomorphism.

To complete the proof, we must show that \mathbf{T}^{-1} is of class C^1. The Inverse Function Theorem states that if the derivative \mathbf{DT}(\mathbf{x}) is invertible at a point \mathbf{x}, then \mathbf{T} is a local C^1-diffeomorphism in a neighborhood of \mathbf{x}. Our strategy is to show that \mathbf{DT}(\mathbf{x}) is invertible for all \mathbf{x} \in \mathbb{R}^n.

The derivative of \mathbf{T}(\mathbf{x}) = \mathbf{B}\mathbf{x} + \mathbf{p}(\mathbf{x}) is:

\mathbf{DT}(\mathbf{x}) = \mathbf{B} + D\mathbf{p}(\mathbf{x})

We can factor out the invertible matrix \mathbf{B}:

\mathbf{DT}(\mathbf{x}) = \mathbf{B}(\mathbf{I} + \mathbf{B}^{-1}D\mathbf{p}(\mathbf{x}))

Since \mathbf{B} is invertible, the invertibility of \mathbf{DT}(\mathbf{x}) is equivalent to the invertibility of the matrix factor (\mathbf{I} + \mathbf{B}^{-1}D\mathbf{p}(\mathbf{x})). Let us denote \mathbf{M}(\mathbf{x}) = \mathbf{B}^{-1}D\mathbf{p}(\mathbf{x}).

We can show that \mathbf{I} + \mathbf{M}(\mathbf{x}) is invertible by showing that the norm of \mathbf{M}(\mathbf{x}) is strictly less than 1:

\|\mathbf{M}(\mathbf{x})\| = \|\mathbf{B}^{-1}D\mathbf{p}(\mathbf{x})\| \le \|\mathbf{B}^{-1}\| \|D\mathbf{p}(\mathbf{x})\|

From the hypothesis, we have \|D\mathbf{p}(\mathbf{x})\| \le \|\mathbf{p}\|_{C^1} \le \gamma. Therefore,

\|\mathbf{M}(\mathbf{x})\| \le \|\mathbf{B}^{-1}\| \gamma = \|\mathbf{B}^{-1}\| \frac{1}{2\|\mathbf{B}^{-1}\|} = \frac{1}{2}

Since \|\mathbf{M}(\mathbf{x})\| \le 1/2 < 1, the matrix \mathbf{I} + \mathbf{M}(\mathbf{x}) is invertible. Its inverse can be expressed by the absolutely convergent Neumann series:

(\mathbf{I} + \mathbf{M}(\mathbf{x}))^{-1} = \sum_{k=0}^\infty (-\mathbf{M}(\mathbf{x}))^k

This confirms that \mathbf{DT}(\mathbf{x}) is invertible for every \mathbf{x} \in \mathbb{R}^n.
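The Neumann series is easy to test numerically; here \mathbf{M} is a random matrix rescaled so that \|\mathbf{M}\| < 1 (an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
M *= 0.4 / np.linalg.norm(M, 2)      # rescale so the spectral norm is 0.4 < 1

# Partial sums of the Neumann series (I + M)^{-1} = sum_{k>=0} (-M)^k.
S = np.eye(4)
P = np.eye(4)
for _ in range(60):
    P = P @ (-M)
    S = S + P

err = np.linalg.norm(S - np.linalg.inv(np.eye(4) + M))   # ~ 0
```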

Since \mathbf{T} is a homeomorphism (a global bijection) and its derivative is invertible everywhere, it is a global C^1-diffeomorphism.

This implies that its inverse, \mathbf{T}^{-1}, is also of class C^1. The derivative of the inverse is given by the formula \mathbf{D}\mathbf{T}^{-1}(\mathbf{y}) = [\mathbf{DT}(\mathbf{T}^{-1}(\mathbf{y}))]^{-1}. This completes the proof of the lemma.

Theorem statement and proof strategy

The theorem asserts that near a hyperbolic fixed point, the dynamics of a nonlinear system are qualitatively identical to those of its linearization.

Theorem: Let \mathbf{E} \subset \mathbb{R}^n be an open set, \mathbf{x}_0 \in \mathbf{E} a point, and f \in C^1(\mathbf{E}, \mathbb{R}^n) a vector field such that f(\mathbf{x}_0) = \mathbf{0}. Let \mathbf{A} = Df(\mathbf{x}_0) be the Jacobian matrix at the fixed point. If \mathbf{A} is hyperbolic, then there exist open neighborhoods \mathbf{U} of \mathbf{x}_0 and \mathbf{V} of the origin in \mathbb{R}^n, and a homeomorphism \mathbf{H}: \mathbf{U} \to \mathbf{V} that establishes a topological equivalence between the nonlinear flow \varphi(t, \mathbf{x}) and the linear flow e^{t\mathbf{A}}. This equivalence is expressed by the relation:

\mathbf{H}(\varphi(t, \mathbf{x})) = e^{t\mathbf{A}}\mathbf{H}(\mathbf{x})

This relation must hold for all \mathbf{x} \in \mathbf{U} and for all times t such that the trajectory \varphi(t, \mathbf{x}) remains within \mathbf{U}.

A significant consequence is that the local topological classification of hyperbolic fixed points depends only on the dimension of their stable and unstable manifolds.

Proof Strategy: the proof’s core idea is to first establish a topological conjugacy between the time-one maps of the nonlinear and linear systems, \varphi_1(\mathbf{x}) and e^{\mathbf{A}}, respectively. Once this conjugacy is constructed, it is then extended, via an averaging procedure, to a full flow equivalence for continuous time.

We will proceed under the assumption that the fixed point is at the origin, \mathbf{x}_0 = \mathbf{0}, and that the coordinate system has been chosen according to Lemma 1, so that \mathbf{A} is block-diagonal with its stable and unstable parts decoupled. We denote the linear time-one map as \mathbf{B} = e^{\mathbf{A}}. From Lemma 3, we have constructed a global map \mathbf{T}(\mathbf{x}) = \mathbf{B}\mathbf{x} + \mathbf{p}(\mathbf{x}) which coincides with the true nonlinear time-one map \varphi_1(\mathbf{x}) in a neighborhood of the origin.

Our central goal is to find a homeomorphism \mathbf{H}: \mathbb{R}^n \to \mathbb{R}^n that solves the conjugacy equation:

\mathbf{H} \circ \mathbf{T} = \mathbf{B} \circ \mathbf{H}

Construction of the conjugacy candidate

This lemma provides the core construction of the conjugacy map by solving the functional equation above.

Lemma 5: Let \mathbf{A} be a hyperbolic matrix and let \mathbf{B} = e^{\mathbf{A}}. Let \mathbf{p} \in C^1(\mathbb{R}^n, \mathbb{R}^n) be a perturbation satisfying \mathbf{p}(\mathbf{0})=\mathbf{0} and having a sufficiently small C^1 norm, \|\mathbf{p}\|_{C^1} < \gamma, where \gamma is small enough to satisfy the conditions of Lemma 4.

Let \mathbf{T}(\mathbf{x}) = \mathbf{B}\mathbf{x} + \mathbf{p}(\mathbf{x}). Then there exists a unique bounded and continuous function \mathbf{h}: \mathbb{R}^n \to \mathbb{R}^n such that the map \mathbf{H}(\mathbf{x}) = \mathbf{x} + \mathbf{h}(\mathbf{x}) satisfies the conjugacy equation:

\mathbf{H}(\mathbf{T}(\mathbf{x})) = \mathbf{B}\mathbf{H}(\mathbf{x}) \quad \forall \mathbf{x} \in \mathbb{R}^n

Proof: we seek a solution for the unknown function \mathbf{h} within the Banach space of all bounded continuous functions from \mathbb{R}^n to \mathbb{R}^n, which we denote by \mathbf{X} = C_b(\mathbb{R}^n, \mathbb{R}^n), equipped with the supremum norm. The proof leverages the Banach fixed-point theorem.

The first step is the derivation of the functional equation for \mathbf{h}.

We substitute the proposed form of the conjugacy, \mathbf{H}(\mathbf{x}) = \mathbf{x} + \mathbf{h}(\mathbf{x}), into the conjugacy equation \mathbf{H}(\mathbf{T}(\mathbf{x})) = \mathbf{B}\mathbf{H}(\mathbf{x}).

The left-hand side becomes:

\mathbf{H}(\mathbf{T}(\mathbf{x})) = \mathbf{T}(\mathbf{x}) + \mathbf{h}(\mathbf{T}(\mathbf{x})) = (\mathbf{B}\mathbf{x} + \mathbf{p}(\mathbf{x})) + \mathbf{h}(\mathbf{T}(\mathbf{x}))

The right-hand side becomes:

\mathbf{B}\mathbf{H}(\mathbf{x}) = \mathbf{B}(\mathbf{x} + \mathbf{h}(\mathbf{x})) = \mathbf{B}\mathbf{x} + \mathbf{B}\mathbf{h}(\mathbf{x})

Equating the two expressions and simplifying, we obtain a functional equation for \mathbf{h}:

\begin{aligned} \mathbf{B}\mathbf{x} + \mathbf{p}(\mathbf{x}) + \mathbf{h}(\mathbf{T}(\mathbf{x})) &= \mathbf{B}\mathbf{x} + \mathbf{B}\mathbf{h}(\mathbf{x}) \\ \mathbf{h}(\mathbf{T}(\mathbf{x})) &= \mathbf{B}\mathbf{h}(\mathbf{x}) - \mathbf{p}(\mathbf{x}) \end{aligned}

The second step is the formulation as a fixed-point problem.

To solve this equation using a contraction mapping argument, we must rearrange it into the form \mathbf{h} = \mathcal{G}(\mathbf{h}) where \mathcal{G} is a contraction operator. The challenge is that \mathbf{B} acts as a contraction on the stable subspace \mathbf{E}_s but as an expansion on the unstable subspace \mathbf{E}_u. We must treat these components separately.

Let \mathbf{h}(\mathbf{x}) = \mathbf{h}_s(\mathbf{x}) + \mathbf{h}_u(\mathbf{x}) be the decomposition of \mathbf{h} into its stable and unstable components. The functional equation splits into two coupled equations:

\begin{cases} \mathbf{h}_s(\mathbf{T}(\mathbf{x})) = \mathbf{B}_s \mathbf{h}_s(\mathbf{x}) - \mathbf{p}_s(\mathbf{x}) \\ \mathbf{h}_u(\mathbf{T}(\mathbf{x})) = \mathbf{B}_u \mathbf{h}_u(\mathbf{x}) - \mathbf{p}_u(\mathbf{x}) \end{cases}

where \mathbf{B}_s = e^{\mathbf{A}_s} and \mathbf{B}_u = e^{\mathbf{A}_u}.

To construct a contraction, we rearrange each equation so that the function \mathbf{h} is acted upon by a contracting linear operator.

For the stable component, \mathbf{B}_s is a contraction. We solve for \mathbf{h}_s(\mathbf{x}) by evaluating the equation at \mathbf{T}^{-1}(\mathbf{x}), which exists and is of class C^1 by Lemma 4:

\mathbf{h}_s(\mathbf{x}) = \mathbf{B}_s \mathbf{h}_s(\mathbf{T}^{-1}(\mathbf{x})) - \mathbf{p}_s(\mathbf{T}^{-1}(\mathbf{x}))

For the unstable component, \mathbf{B}_u is an expansion, but its inverse \mathbf{B}_u^{-1} is a contraction. We solve the second equation for \mathbf{h}_u(\mathbf{x}) directly:

\mathbf{h}_u(\mathbf{x}) = \mathbf{B}_u^{-1} \mathbf{h}_u(\mathbf{T}(\mathbf{x})) + \mathbf{B}_u^{-1} \mathbf{p}_u(\mathbf{x})

These two rearranged equations define an operator \mathcal{G} on the space \mathbf{X} of bounded continuous functions. For any \mathbf{h} \in \mathbf{X}, we define \mathcal{G}(\mathbf{h}) as the function whose stable and unstable components are:

\begin{aligned} (\mathcal{G}(\mathbf{h}))_s(\mathbf{x}) &:= \mathbf{B}_s \mathbf{h}_s(\mathbf{T}^{-1}(\mathbf{x})) - \mathbf{p}_s(\mathbf{T}^{-1}(\mathbf{x})) \\ (\mathcal{G}(\mathbf{h}))_u(\mathbf{x}) &:= \mathbf{B}_u^{-1} \mathbf{h}_u(\mathbf{T}(\mathbf{x})) + \mathbf{B}_u^{-1} \mathbf{p}_u(\mathbf{x}) \end{aligned}

A solution to our original problem is a fixed point of this operator \mathcal{G}.

The third step is the verification of the contraction property.

We define the norm on the Banach space \mathbf{X} as the supremum norm, \|\mathbf{h}\|_\infty = \sup_{\mathbf{x} \in \mathbb{R}^n} |\mathbf{h}(\mathbf{x})|. We need to show that \mathcal{G} is a strict contraction on \mathbf{X}. Let \mathbf{h}_1, \mathbf{h}_2 \in \mathbf{X}.

Let us examine the difference \mathcal{G}(\mathbf{h}_1) - \mathcal{G}(\mathbf{h}_2). The terms involving \mathbf{p} cancel, and we are left with:

\begin{aligned} (\mathcal{G}(\mathbf{h}_1) - \mathcal{G}(\mathbf{h}_2))_s(\mathbf{x}) &= \mathbf{B}_s (\mathbf{h}_{1s} - \mathbf{h}_{2s})(\mathbf{T}^{-1}(\mathbf{x})) \\ (\mathcal{G}(\mathbf{h}_1) - \mathcal{G}(\mathbf{h}_2))_u(\mathbf{x}) &= \mathbf{B}_u^{-1} (\mathbf{h}_{1u} - \mathbf{h}_{2u})(\mathbf{T}(\mathbf{x})) \end{aligned}

Now, we take norms. From Lemma 1, we have the bounds \|\mathbf{B}_s\| = \|e^{\mathbf{A}_s}\| \le e^{-\alpha} and \|\mathbf{B}_u^{-1}\| = \|e^{-\mathbf{A}_u}\| \le e^{-\alpha} for some \alpha > 0.

For the stable component:

\begin{aligned} |(\mathcal{G}(\mathbf{h}_1) - \mathcal{G}(\mathbf{h}_2))_s(\mathbf{x})| &\le \|\mathbf{B}_s\| |(\mathbf{h}_{1s} - \mathbf{h}_{2s})(\mathbf{T}^{-1}(\mathbf{x}))| \\ &\le e^{-\alpha} |(\mathbf{h}_{1s} - \mathbf{h}_{2s})(\mathbf{T}^{-1}(\mathbf{x}))| \end{aligned}

Taking the supremum over all \mathbf{x} \in \mathbb{R}^n on both sides, we get:

\|(\mathcal{G}(\mathbf{h}_1) - \mathcal{G}(\mathbf{h}_2))_s\|_\infty \le e^{-\alpha} \|\mathbf{h}_{1s} - \mathbf{h}_{2s}\|_\infty

Similarly, for the unstable component:

\|(\mathcal{G}(\mathbf{h}_1) - \mathcal{G}(\mathbf{h}_2))_u\|_\infty \le e^{-\alpha} \|\mathbf{h}_{1u} - \mathbf{h}_{2u}\|_\infty

Combining these, we can bound the norm of the difference:

\begin{aligned} \|\mathcal{G}(\mathbf{h}_1) - \mathcal{G}(\mathbf{h}_2)\|_\infty &= \sup_{\mathbf{x}} |\mathcal{G}(\mathbf{h}_1)(\mathbf{x}) - \mathcal{G}(\mathbf{h}_2)(\mathbf{x})| \\ &\le \sup_{\mathbf{x}} \left( |(\dots)_s(\mathbf{x})| + |(\dots)_u(\mathbf{x})| \right) \\ &\le e^{-\alpha} (\|\mathbf{h}_{1s} - \mathbf{h}_{2s}\|_\infty + \|\mathbf{h}_{1u} - \mathbf{h}_{2u}\|_\infty) \end{aligned}

Using a norm such as \|\mathbf{h}\| = \max(\|\mathbf{h}_s\|_\infty, \|\mathbf{h}_u\|_\infty) makes the final step exact:

\|\mathcal{G}(\mathbf{h}_1) - \mathcal{G}(\mathbf{h}_2)\| \le e^{-\alpha} \|\mathbf{h}_1 - \mathbf{h}_2\|

Since \mathbf{A} is hyperbolic, \alpha > 0, which implies e^{-\alpha} < 1. Therefore, the operator \mathcal{G} is a strict contraction on the Banach space \mathbf{X}.

By the Banach Fixed-Point Theorem, there exists a unique fixed point \mathbf{h} \in \mathbf{X} such that \mathcal{G}(\mathbf{h}) = \mathbf{h}. This unique function \mathbf{h} is bounded and continuous, and it satisfies the functional equation. Therefore, the map \mathbf{H}(\mathbf{x}) = \mathbf{x} + \mathbf{h}(\mathbf{x}) is a continuous map that satisfies the conjugacy equation, and it is unique within the class of maps of the form \mathbf{I} + \mathbf{h} where \mathbf{h} is bounded and continuous. This completes the proof of the lemma.
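As a sanity check, the contraction scheme can be run numerically in the simplest setting: one stable dimension, where only the first of the two component equations is present. All concrete choices below (B = e^{-1}, the perturbation p, the grid, the zero extension of \mathbf{h} outside it) are illustrative assumptions, not data from the text:

```python
import numpy as np

B = np.exp(-1.0)                           # 1-D stable linear time-one map
p = lambda x: 0.05 * x**2 * np.exp(-x**2)  # small C^1 perturbation, p(0) = 0

def T(x):
    return B * x + p(x)

def T_inv(y, iters=60):
    # Pointwise inversion of T by the contraction x <- (y - p(x)) / B (Lemma 4).
    x = y / B
    for _ in range(iters):
        x = (y - p(x)) / B
    return x

grid = np.linspace(-8.0, 8.0, 4001)
ti = T_inv(grid)
h = np.zeros_like(grid)                    # iterate on the bounded correction h
for _ in range(200):
    hv = np.interp(ti, grid, h, left=0.0, right=0.0)
    h = B * hv - p(ti)                     # h <- G(h): the stable-component update

# Verify the conjugacy H(T(x)) = B H(x) with H = id + h on interior points.
x = np.linspace(-2.0, 2.0, 101)
Hx = x + np.interp(x, grid, h)
HTx = T(x) + np.interp(T(x), grid, h)
res = np.max(np.abs(HTx - B * Hx))         # small residual (interpolation error)
```

The iteration converges geometrically with factor B = e^{-1} < 1, mirroring the abstract contraction estimate; the residual is limited only by the linear interpolation of h between grid points.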

Generalized conjugacy construction

This lemma generalizes the result of Lemma 5. Instead of finding a conjugacy between a perturbed map \mathbf{T} and a purely linear map \mathbf{B}, it seeks a conjugacy between two different perturbed maps, \mathbf{T} and \mathbf{S}. This added generality is precisely the tool needed to prove that the map \mathbf{H} from Lemma 5 is invertible.

Lemma 6: Let \mathbf{A} be a hyperbolic matrix, \mathbf{B} = e^{\mathbf{A}}, and let \alpha > 0 be the constant from Lemma 1. Let \mathbf{p}, \mathbf{q} \in C^1(\mathbb{R}^n, \mathbb{R}^n) be two perturbation functions. Let the associated maps be \mathbf{T}(\mathbf{x}) = \mathbf{Bx} + \mathbf{p}(\mathbf{x}) and \mathbf{S}(\mathbf{x}) = \mathbf{Bx} + \mathbf{q}(\mathbf{x}). Assume that the norms of the perturbations are sufficiently small to ensure that both \mathbf{T} and \mathbf{S} are diffeomorphisms (as per Lemma 4). Specifically, assume:

\begin{aligned} & \|\mathbf{p}\|_{C^1} < \gamma_p \\ & \|\mathbf{q}\|_{C^1} < \gamma_q \end{aligned}

where \gamma_p and \gamma_q are sufficiently small constants. Then there exists a unique bounded and continuous function \mathbf{g}: \mathbb{R}^n \to \mathbb{R}^n such that the map \mathbf{G}(\mathbf{x}) = \mathbf{x} + \mathbf{g}(\mathbf{x}) satisfies the generalized conjugacy equation:

\mathbf{T}(\mathbf{G}(\mathbf{x})) = \mathbf{G}(\mathbf{S}(\mathbf{x}))

Proof: the proof follows the same structure as that of Lemma 5.

We first derive the functional equation: substituting the form \mathbf{G}(\mathbf{x}) = \mathbf{x} + \mathbf{g}(\mathbf{x}) into the conjugacy equation gives:

\begin{aligned} \text{LHS: } \quad \mathbf{T}(\mathbf{G}(\mathbf{x})) &= \mathbf{B}(\mathbf{G}(\mathbf{x})) + \mathbf{p}(\mathbf{G}(\mathbf{x})) = \mathbf{B}(\mathbf{x} + \mathbf{g}(\mathbf{x})) + \mathbf{p}(\mathbf{x} + \mathbf{g}(\mathbf{x})) \\ \text{RHS: } \quad \mathbf{G}(\mathbf{S}(\mathbf{x})) &= \mathbf{S}(\mathbf{x}) + \mathbf{g}(\mathbf{S}(\mathbf{x})) = (\mathbf{B}\mathbf{x} + \mathbf{q}(\mathbf{x})) + \mathbf{g}(\mathbf{S}(\mathbf{x})) \end{aligned}

Equating the two sides and simplifying yields a functional equation for \mathbf{g}:

\mathbf{B}\mathbf{g}(\mathbf{x}) - \mathbf{g}(\mathbf{S}(\mathbf{x})) = \mathbf{q}(\mathbf{x}) - \mathbf{p}(\mathbf{x} + \mathbf{g}(\mathbf{x}))

We then formulate this as a fixed-point problem: as before, we rearrange the equation to define an operator on the Banach space \mathbf{X} = C_b(\mathbb{R}^n, \mathbb{R}^n).

The operator \mathcal{F} is defined by:

\begin{aligned} (\mathcal{F}(\mathbf{g}))_s(\mathbf{x}) &:= \mathbf{B}_s \mathbf{g}_s(\mathbf{S}^{-1}(\mathbf{x})) + \mathbf{p}_s(\mathbf{S}^{-1}(\mathbf{x}) + \mathbf{g}(\mathbf{S}^{-1}(\mathbf{x}))) - \mathbf{q}_s(\mathbf{S}^{-1}(\mathbf{x})) \\ (\mathcal{F}(\mathbf{g}))_u(\mathbf{x}) &:= \mathbf{B}_u^{-1} \mathbf{g}_u(\mathbf{S}(\mathbf{x})) - \mathbf{B}_u^{-1}\mathbf{p}_u(\mathbf{x} + \mathbf{g}(\mathbf{x})) + \mathbf{B}_u^{-1}\mathbf{q}_u(\mathbf{x}) \end{aligned}

(Both lines are obtained by solving the functional equation for \mathbf{g}_s(\mathbf{x}) and \mathbf{g}_u(\mathbf{x}) respectively; in the unstable line the arguments of \mathbf{p}_u and \mathbf{q}_u are \mathbf{x} itself, since that equation is solved at \mathbf{x} directly rather than at \mathbf{S}^{-1}(\mathbf{x}).)

We verify the contraction property: the operator \mathcal{F} is no longer linear in \mathbf{g} due to the term \mathbf{p}(\dots + \mathbf{g}(\dots)).

However, if the Lipschitz constant of \mathbf{p} (which is bounded by \|\mathbf{p}\|_{C^1}) is sufficiently small, \mathcal{F} is still a contraction.

The Lipschitz constant of \mathcal{F} will be bounded by a quantity like e^{-\alpha} + C\|\mathbf{p}\|_{C^1} for some constant C. By requiring the norm of \mathbf{p} to be small enough, this can be made less than 1.

The Banach fixed-point theorem again guarantees the existence of a unique bounded and continuous solution \mathbf{g}, which defines the desired map \mathbf{G}. This completes the proof of the lemma.

The conjugacy is a homeomorphism

Here we use the machinery of Lemma 6 to prove that the map \mathbf{H}, whose existence and uniqueness were established in Lemma 5, is invertible with a continuous inverse.

Lemma 7: Let \mathbf{A} be a hyperbolic matrix and let the conditions on the perturbation \mathbf{p} and the map \mathbf{T}(\mathbf{x}) = \mathbf{B}\mathbf{x} + \mathbf{p}(\mathbf{x}) be as in Lemma 5, with norm conditions sufficient for both Lemma 5 and Lemma 6 to hold. Then the unique continuous map \mathbf{H}(\mathbf{x}) = \mathbf{x} + \mathbf{h}(\mathbf{x}) satisfying the conjugacy equation

\mathbf{H}(\mathbf{T}(\mathbf{x})) = \mathbf{B}\mathbf{H}(\mathbf{x})

is a homeomorphism of \mathbb{R}^n.

Proof: the proof proceeds by explicitly constructing a two-sided continuous inverse for \mathbf{H}. Let this inverse candidate be denoted by \mathbf{K}.

The first step is the construction of the inverse candidate \mathbf{K}.

We seek a continuous map \mathbf{K} that “inverts” the conjugacy relation of \mathbf{H}. That is, we look for a map \mathbf{K} that intertwines \mathbf{T} and \mathbf{B} in the opposite direction:

\mathbf{T}(\mathbf{K}(\mathbf{y})) = \mathbf{K}(\mathbf{B}\mathbf{y})

This is an instance of the generalized conjugacy problem of Lemma 6, \mathbf{T} \circ \mathbf{G} = \mathbf{G} \circ \mathbf{S}. In this specific case, the maps are:

  • \mathbf{T}(\mathbf{x}) = \mathbf{B}\mathbf{x} + \mathbf{p}(\mathbf{x})
  • \mathbf{S}(\mathbf{y}) = \mathbf{B}\mathbf{y}. This is a map with perturbation \mathbf{q}(\mathbf{y}) = \mathbf{0}.

Since the perturbations \mathbf{p} and \mathbf{q}=\mathbf{0} satisfy the required norm conditions, Lemma 6 guarantees the existence of a unique continuous map \mathbf{K}(\mathbf{y}) = \mathbf{y} + \mathbf{k}(\mathbf{y}), where \mathbf{k} is bounded and continuous, that satisfies \mathbf{T}(\mathbf{K}(\mathbf{y})) = \mathbf{K}(\mathbf{B}\mathbf{y}).

The second step is to prove that \mathbf{H} \circ \mathbf{K} is the identity.

We now compose the two maps and show that the result is the identity map. Let \mathbf{G} = \mathbf{H} \circ \mathbf{K}. We will determine the functional equation that \mathbf{G} satisfies. For any \mathbf{y} \in \mathbb{R}^n:

\begin{aligned} \mathbf{G}(\mathbf{B}\mathbf{y}) &= (\mathbf{H} \circ \mathbf{K})(\mathbf{B}\mathbf{y}) = \mathbf{H}(\mathbf{K}(\mathbf{B}\mathbf{y})) \\ &= \mathbf{H}(\mathbf{T}(\mathbf{K}(\mathbf{y}))) \\ &= \mathbf{B}(\mathbf{H}(\mathbf{K}(\mathbf{y}))) \\ &= \mathbf{B}(\mathbf{G}(\mathbf{y})) \end{aligned}

So, the composite map \mathbf{G} satisfies the conjugacy equation \mathbf{G}(\mathbf{B}\mathbf{y}) = \mathbf{B}\mathbf{G}(\mathbf{y}). This equation can be written as \mathbf{G} \circ \mathbf{B} = \mathbf{B} \circ \mathbf{G}. This is an instance of the conjugacy problem from Lemma 5 where the map \mathbf{T} is simply the linear map \mathbf{B} (i.e., the perturbation \mathbf{p} is zero).

Lemma 5 guarantees a unique solution of the form “identity plus a bounded continuous function”. Let’s check the form of our candidate solution \mathbf{G}:

\mathbf{G}(\mathbf{y}) = \mathbf{H}(\mathbf{K}(\mathbf{y})) = \mathbf{H}(\mathbf{y} + \mathbf{k}(\mathbf{y})) = (\mathbf{y} + \mathbf{k}(\mathbf{y})) + \mathbf{h}(\mathbf{y} + \mathbf{k}(\mathbf{y})) = \mathbf{y} + [\mathbf{k}(\mathbf{y}) + \mathbf{h}(\mathbf{y} + \mathbf{k}(\mathbf{y}))]

Since \mathbf{h} and \mathbf{k} are bounded continuous functions, the term in the brackets is also a bounded continuous function. So, \mathbf{G} has the required form.

However, we can easily identify another solution to \mathbf{G} \circ \mathbf{B} = \mathbf{B} \circ \mathbf{G}: the identity map, \mathbf{I}(\mathbf{y}) = \mathbf{y}. The identity map can be written as \mathbf{I}(\mathbf{y}) = \mathbf{y} + \mathbf{0}, where the zero function is bounded and continuous.

By the uniqueness part of Lemma 5, since both \mathbf{G} and \mathbf{I} are solutions of the prescribed form, they must be identical. Therefore, \mathbf{G} = \mathbf{I}, which means \mathbf{H} \circ \mathbf{K} = \mathbf{I}.

The third step is to prove that \mathbf{K} \circ \mathbf{H} is the identity.

The argument is symmetric. Let \mathbf{G}^\prime = \mathbf{K} \circ \mathbf{H}. We determine the equation it satisfies:

\begin{aligned} \mathbf{G}^\prime(\mathbf{T}(\mathbf{x})) &= (\mathbf{K} \circ \mathbf{H})(\mathbf{T}(\mathbf{x})) = \mathbf{K}(\mathbf{H}(\mathbf{T}(\mathbf{x}))) \\ &= \mathbf{K}(\mathbf{B}\mathbf{H}(\mathbf{x})) \\ &= \mathbf{T}(\mathbf{K}(\mathbf{H}(\mathbf{x})))\\ &= \mathbf{T}(\mathbf{G}^\prime(\mathbf{x})) \end{aligned}

So, the composite map \mathbf{G}^\prime satisfies the conjugacy equation \mathbf{G}^\prime \circ \mathbf{T} = \mathbf{T} \circ \mathbf{G}^\prime. This is an instance of the generalized conjugacy from Lemma 6, where the two maps are identical: \mathbf{S} = \mathbf{T}, meaning \mathbf{q} = \mathbf{p}.

Again, Lemma 6 guarantees a unique solution of the form “identity plus bounded continuous”. Our candidate \mathbf{G}^\prime has this form. The identity map \mathbf{I} is also clearly a solution: \mathbf{I} \circ \mathbf{T} = \mathbf{T} \circ \mathbf{I}. By uniqueness, we must have \mathbf{G}^\prime = \mathbf{I}, which means \mathbf{K} \circ \mathbf{H} = \mathbf{I}.

We have constructed a map \mathbf{K} and have shown that it is a two-sided inverse for \mathbf{H}. Both \mathbf{H} and \mathbf{K} are continuous maps.

By definition, a continuous map with a continuous inverse is a homeomorphism. Therefore, \mathbf{H} is a homeomorphism of \mathbb{R}^n. This completes the proof.

From time-one conjugacy to flow equivalence

The previous lemmas have established the existence of a homeomorphism \mathbf{H} that conjugates the discrete dynamics of the time-one map \varphi_1 with its linearization \mathbf{B}=e^{\mathbf{A}}, at least in a neighborhood of the origin.

This final lemma extends this result from a discrete-time conjugacy to a continuous-time flow equivalence. This is achieved through an averaging construction.

Lemma 8: Let the radius d_2 > 0 be chosen sufficiently small in the construction of Lemma 3, such that the resulting perturbation \mathbf{p} satisfies the norm conditions required by Lemma 5 and Lemma 7. Let \mathbf{H} be the map constructed in Lemma 5 and shown to be a homeomorphism in Lemma 7, which satisfies the conjugacy equation for the time-one map \mathbf{T} = \varphi_1 inside the neighborhood \mathbf{D}_{d_2}(\mathbf{0}):

\mathbf{H}(\varphi_1(\mathbf{x})) = e^{\mathbf{A}}\mathbf{H}(\mathbf{x}) \quad \forall \mathbf{x} \in \mathbf{D}_{d_2}(\mathbf{0})

We define a new map \mathcal{H}, for \mathbf{x} in a suitable neighborhood of the origin, via the following integral formula:

\mathcal{H}(\mathbf{x}) = \int_0^1 e^{-s\mathbf{A}} \mathbf{H}(\varphi(s, \mathbf{x})) \, \mathrm{d}s

This map \mathcal{H} establishes a local topological flow equivalence. Specifically, for all \mathbf{x} in a neighborhood \mathbf{U} \subset \mathbf{D}_{d_2}(\mathbf{0}) and for all times t for which the trajectory remains in \mathbf{U}, the following relation holds:

\mathcal{H}(\varphi(t, \mathbf{x})) = e^{t\mathbf{A}}\mathcal{H}(\mathbf{x})

Moreover, this newly constructed map \mathcal{H} coincides with our original conjugacy: \mathcal{H}(\mathbf{x}) = \mathbf{H}(\mathbf{x}) for all \mathbf{x} \in \mathbf{U}. Consequently, \mathcal{H} is also a local homeomorphism.

Proof: the argument hinges on a direct calculation that verifies the flow equivalence property. We will use the notation \varphi_t(\mathbf{x}) = \varphi(t, \mathbf{x}).

The first part is the verification of the flow equivalence property.

We aim to prove that e^{-t\mathbf{A}}\mathcal{H}(\varphi_t(\mathbf{x})) = \mathcal{H}(\mathbf{x}).

Let us start with the left-hand side and apply the definition of \mathcal{H} to the point \varphi_t(\mathbf{x}). For simplicity, we consider t \in [0, 1]; general times follow by combining this case with integer iterates of the time-one conjugacy:

e^{-t\mathbf{A}}\mathcal{H}(\varphi_t(\mathbf{x})) = e^{-t\mathbf{A}} \int_0^1 e^{-s\mathbf{A}} \mathbf{H}(\varphi(s, \varphi_t(\mathbf{x}))) \, \mathrm{d}s

By the semigroup property of the flow, \varphi(s, \varphi_t(\mathbf{x})) = \varphi_{s+t}(\mathbf{x}). Combining the exponential terms, we get:

e^{-t\mathbf{A}}\mathcal{H}(\varphi_t(\mathbf{x})) = \int_0^1 e^{-(s+t)\mathbf{A}} \mathbf{H}(\varphi_{s+t}(\mathbf{x})) \, \mathrm{d}s

We perform a change of integration variable. Let \tau = s+t. Then \mathrm{d}\tau = \mathrm{d}s, and the limits of integration change from s \in [0, 1] to \tau \in [t, t+1]:

e^{-t\mathbf{A}}\mathcal{H}(\varphi_t(\mathbf{x})) = \int_t^{t+1} e^{-\tau\mathbf{A}} \mathbf{H}(\varphi_\tau(\mathbf{x})) \, \mathrm{d}\tau

The core trick is to relate this integral back to the original integral defining \mathcal{H}(\mathbf{x}). We use the time-one conjugacy property of \mathbf{H}. Let us write \tau = \sigma + 1. The flow can be decomposed as \varphi_{\sigma+1}(\mathbf{x}) = \varphi_1(\varphi_\sigma(\mathbf{x})).

The conjugacy relation is \mathbf{H}(\varphi_1(\mathbf{z})) = e^\mathbf{A}\mathbf{H}(\mathbf{z}), which can be rewritten as e^{-\mathbf{A}}\mathbf{H}(\varphi_1(\mathbf{z})) = \mathbf{H}(\mathbf{z}). Using this, we find:

\begin{aligned} e^{-(\sigma+1)\mathbf{A}} \mathbf{H}(\varphi_{\sigma+1}(\mathbf{x})) &= e^{-\sigma\mathbf{A}} e^{-\mathbf{A}} \mathbf{H}(\varphi_1(\varphi_\sigma(\mathbf{x}))) \\ &= e^{-\sigma\mathbf{A}} \mathbf{H}(\varphi_\sigma(\mathbf{x})) \end{aligned}

This shows that the integrand, g(\tau) = e^{-\tau\mathbf{A}} \mathbf{H}(\varphi_\tau(\mathbf{x})), is a 1-periodic function of \tau. Therefore, its integral over any interval of length 1 is the same:

\int_t^{t+1} g(\tau) \, \mathrm{d}\tau = \int_0^1 g(\tau) \, \mathrm{d}\tau

Let’s show this explicitly by splitting the integral:

\int_t^{t+1} e^{-\tau\mathbf{A}} \mathbf{H}(\varphi_\tau(\mathbf{x})) \, \mathrm{d}\tau = \int_t^1 e^{-\tau\mathbf{A}} \mathbf{H}(\varphi_\tau(\mathbf{x})) \, \mathrm{d}\tau + \int_1^{t+1} e^{-\tau\mathbf{A}} \mathbf{H}(\varphi_\tau(\mathbf{x})) \, \mathrm{d}\tau

In the second integral, let \tau = \sigma+1. The limits become \sigma \in [0, t]:

\int_t^{t+1} e^{-\tau\mathbf{A}} \mathbf{H}(\varphi_\tau(\mathbf{x})) \, \mathrm{d}\tau = \int_t^1 e^{-\tau\mathbf{A}} \mathbf{H}(\varphi_\tau(\mathbf{x})) \, \mathrm{d}\tau + \int_0^t e^{-(\sigma+1)\mathbf{A}} \mathbf{H}(\varphi_{\sigma+1}(\mathbf{x})) \, \mathrm{d}\sigma

Using the 1-periodicity of the integrand that we just proved:

\int_t^{t+1} e^{-\tau\mathbf{A}} \mathbf{H}(\varphi_\tau(\mathbf{x})) \, \mathrm{d}\tau = \int_t^1 e^{-\tau\mathbf{A}} \mathbf{H}(\varphi_\tau(\mathbf{x})) \, \mathrm{d}\tau + \int_0^t e^{-\sigma\mathbf{A}} \mathbf{H}(\varphi_\sigma(\mathbf{x})) \, \mathrm{d}\sigma

Combining these two integrals gives the integral over the full interval:

\int_t^{t+1} e^{-\tau\mathbf{A}} \mathbf{H}(\varphi_\tau(\mathbf{x})) \, \mathrm{d}\tau = \int_0^1 e^{-\tau\mathbf{A}} \mathbf{H}(\varphi_\tau(\mathbf{x})) \, \mathrm{d}\tau = \mathcal{H}(\mathbf{x})
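The shift-by-one argument can also be checked numerically. The sketch below uses a hypothetical 1-periodic integrand g (a stand-in for g(\tau) = e^{-\tau\mathbf{A}} \mathbf{H}(\varphi_\tau(\mathbf{x})), not derived from the flow) and verifies that its integral over any unit interval is the same.

```python
import numpy as np

# Numerical sanity check (not part of the proof): the integral of a
# 1-periodic function over ANY interval of length 1 has the same value.
# g below is a hypothetical 1-periodic integrand, standing in for
# g(tau) = e^{-tau A} H(phi_tau(x)).

def g(tau):
    return np.cos(2 * np.pi * tau) + 0.5 * np.sin(4 * np.pi * tau)

def unit_integral(t, n=10001):
    """Trapezoid-rule integral of g over [t, t+1]."""
    tau = np.linspace(t, t + 1.0, n)
    vals = g(tau)
    return np.sum((vals[:-1] + vals[1:]) * np.diff(tau)) / 2.0

print([unit_integral(t) for t in (0.0, 0.3, 1.7)])  # all (numerically) equal
```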

We have successfully shown that e^{-t\mathbf{A}}\mathcal{H}(\varphi_t(\mathbf{x})) = \mathcal{H}(\mathbf{x}). Multiplying by e^{t\mathbf{A}} gives the desired flow equivalence relation:

\mathcal{H}(\varphi(t, \mathbf{x})) = e^{t\mathbf{A}}\mathcal{H}(\mathbf{x})

The second part is to prove that this new map \mathcal{H} is, in fact, the same as the original map \mathbf{H} in the neighborhood of interest.

Let us evaluate the flow equivalence relation at time t=1. For \mathbf{x} \in \mathbf{D}_{d_2}(\mathbf{0}), the time-one flow satisfies \varphi_1(\mathbf{x}) = \mathbf{T}(\mathbf{x}). The relation becomes:

\mathcal{H}(\mathbf{T}(\mathbf{x})) = e^{\mathbf{A}}\mathcal{H}(\mathbf{x}) = \mathbf{B}\mathcal{H}(\mathbf{x})

This shows that the map \mathcal{H} satisfies the same conjugacy equation as \mathbf{H}. Let’s check if \mathcal{H} has the form “identity plus a bounded continuous function” required for the uniqueness conclusion of Lemma 5.

\begin{aligned} \mathcal{H}(\mathbf{x}) &= \int_0^1 e^{-s\mathbf{A}} \mathbf{H}(\varphi(s, \mathbf{x})) \, \mathrm{d}s \\ &= \int_0^1 e^{-s\mathbf{A}} (\varphi(s, \mathbf{x}) + \mathbf{h}(\varphi(s, \mathbf{x}))) \, \mathrm{d}s \end{aligned}

The flow can be written as \varphi(s, \mathbf{x}) = \mathbf{x} + \int_0^s f(\varphi(\sigma, \mathbf{x}))\mathrm{d}\sigma. Substituting this and rearranging, one can show that \mathcal{H}(\mathbf{x}) = \mathbf{x} + \tilde{\mathbf{h}}(\mathbf{x}), where \tilde{\mathbf{h}} is a bounded continuous function.
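In more detail, write the localized vector field from Lemma 3 as f(\mathbf{x}) = \mathbf{A}\mathbf{x} + \mathbf{F}(\mathbf{x}), where \mathbf{F} is bounded and continuous (the symbol \mathbf{F} is introduced here only for this sketch). By variation of constants,

\varphi(s, \mathbf{x}) = e^{s\mathbf{A}}\mathbf{x} + \int_0^s e^{(s-\sigma)\mathbf{A}} \mathbf{F}(\varphi(\sigma, \mathbf{x})) \, \mathrm{d}\sigma \quad \Longrightarrow \quad e^{-s\mathbf{A}}\varphi(s, \mathbf{x}) = \mathbf{x} + \int_0^s e^{-\sigma\mathbf{A}} \mathbf{F}(\varphi(\sigma, \mathbf{x})) \, \mathrm{d}\sigma

Substituting into the expression for \mathcal{H} gives

\mathcal{H}(\mathbf{x}) = \int_0^1 \left( \mathbf{x} + \int_0^s e^{-\sigma\mathbf{A}} \mathbf{F}(\varphi(\sigma, \mathbf{x})) \, \mathrm{d}\sigma + e^{-s\mathbf{A}} \mathbf{h}(\varphi(s, \mathbf{x})) \right) \mathrm{d}s = \mathbf{x} + \tilde{\mathbf{h}}(\mathbf{x})

where \tilde{\mathbf{h}} collects the two integral terms; it is bounded and continuous because \mathbf{F} and \mathbf{h} are.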

Since both \mathbf{H} and \mathcal{H} satisfy the same conjugacy equation and are of the required form, the uniqueness part of Lemma 5 compels us to conclude that they are identical:

\mathcal{H}(\mathbf{x}) = \mathbf{H}(\mathbf{x})

for all \mathbf{x} in the neighborhood where \mathcal{H} is defined and the conjugacy holds.

We have constructed a map \mathcal{H} and proved that it satisfies the flow equivalence property. We have also shown that this map is identical to the map \mathbf{H} from Lemma 5. Since Lemma 7 established that \mathbf{H} is a homeomorphism, it follows that \mathcal{H} is a local homeomorphism.
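The whole lemma can be checked numerically on a toy example. The sketch below assumes the 1-D system \dot{x} = -x + x^2 with \mathbf{A} = -1, together with its closed-form flow and the explicit conjugacy H(x) = x/(1-x); none of these formulas come from the lemmas above, but they satisfy the time-one conjugacy exactly, so the averaging construction can be tested directly.

```python
import numpy as np

# Toy 1-D hyperbolic system x' = -x + x^2 (so A = -1), equilibrium at 0.
# Closed-form flow and exact conjugacy H(x) = x/(1-x), which satisfies
# H(phi_t(x)) = e^{-t} H(x); both formulas are assumptions of this sketch.

def phi(t, x):
    """Flow of x' = -x + x^2, valid near the origin."""
    return x * np.exp(-t) / (1.0 - x + x * np.exp(-t))

def H(x):
    """Exact conjugacy to the linearization x' = -x."""
    return x / (1.0 - x)

def H_avg(x, n=2001):
    """Averaged map  int_0^1 e^{-sA} H(phi(s, x)) ds  with A = -1."""
    s = np.linspace(0.0, 1.0, n)
    vals = np.exp(s) * H(phi(s, x))  # e^{-sA} = e^{s} since A = -1
    return np.sum((vals[:-1] + vals[1:]) * np.diff(s)) / 2.0  # trapezoid rule

x = 0.1
print(abs(H(phi(1.0, x)) - np.exp(-1.0) * H(x)))      # time-one conjugacy, ~0
print(abs(H_avg(x) - H(x)))                           # H_avg reproduces H, ~0
t = 0.37
print(abs(H_avg(phi(t, x)) - np.exp(-t) * H_avg(x)))  # flow equivalence, ~0
```

Here H happens to be an exact flow conjugacy already, so the averaged map reproduces it, which is precisely the identity \mathcal{H} = \mathbf{H} asserted by the lemma.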

This completes the construction of the local topological flow equivalence and, with it, the proof of the Hartman-Grobman theorem.
