Solutions "General relativity - the theoretical minimum" (part I)

Exercise list

Lecture 1

Exercise 1.1

Exercise 1.2

Lecture 2

Exercise 2.1

Exercise page 68

Exercise page 78

Lecture 3

Exercise page 93

Exercise page 102

Exercise 3.1

Exercise 3.2

Exercise 3.3

Exercise page 116

Exercise 1.1

If we are falling freely in a uniform gravitational field, prove that we feel no gravity and that things float around us as in the International Space Station.

Let’s consider an inertial reference frame S. Within this frame, a uniform gravitational field is described by the acceleration vector \mathbf g. For simplicity, let this field be oriented along the negative vertical axis, such that \mathbf g = -g \mathbf k, where g is the constant magnitude of gravitational acceleration.

The equation of motion for a particle of mass m subject to this field is given by Newton’s second law:

m \frac{\mathrm d^2 \mathbf r}{\mathrm d t^2} = m \mathbf g

The mass m cancels, indicating that the acceleration of any object in this field is independent of its mass:

\frac{\mathrm d^2 \mathbf r}{\mathrm d t^2} = \mathbf g

Now, let us introduce a non-inertial reference frame S^\prime that is in a state of free fall. The origin of this frame accelerates with respect to S at a rate equal to the gravitational acceleration, \mathbf A = \mathbf g.

The position vector of a particle in this new frame is \mathbf r^\prime. The relation between accelerations in the two frames is given by:

\mathbf a^\prime = \mathbf a - \mathbf A

Here, \mathbf a^\prime is the acceleration measured in the falling frame S^\prime, and \mathbf a is the acceleration of the particle as measured in the inertial frame S. For any particle subject only to gravity, its acceleration \mathbf a is equal to \mathbf g. Substituting these values into the transformation equation yields the acceleration within the freely falling frame:

\mathbf a^\prime = \mathbf g - \mathbf g = \mathbf 0

An object within the frame S^\prime experiences zero acceleration relative to the frame.

The physical sensation of weight arises not directly from the gravitational force, but from the contact forces, such as the normal force \mathbf N, that oppose it.

For a person standing on the ground in the inertial frame S, the net force is zero, so \mathbf N + m \mathbf g = \mathbf 0, which means \mathbf N = -m \mathbf g. This normal force creates internal stresses within the body that we perceive as weight. In the freely falling frame S^\prime, no such supporting force is present:

\mathbf N = \mathbf 0

Since all parts of an observer’s body, and any surrounding objects, accelerate identically, no internal stresses or relative contact forces are generated. This absence of opposing forces results in the sensation of weightlessness.
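As a quick numerical illustration (my own, not part of the book's exercise), we can integrate two test particles in a uniform field and confirm that their separation never changes, i.e. they "float" relative to each other:

```python
import numpy as np

# Two particles in a uniform field g: both obey a = g regardless of mass,
# so their relative acceleration is zero and their separation is constant.
g = np.array([0.0, 0.0, -9.81])   # uniform field along -z
dt, steps = 0.01, 1000

r1, v1 = np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 0.0])
r2, v2 = np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 0.0])  # offset by 1 m

for _ in range(steps):
    # identical velocity updates for both particles
    v1 = v1 + g * dt
    v2 = v2 + g * dt
    r1 = r1 + v1 * dt
    r2 = r2 + v2 * dt

separation = np.linalg.norm(r2 - r1)
print(separation)   # stays exactly 1.0: zero relative acceleration
```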

The environment inside the International Space Station (ISS) is a direct application of this principle. A common misconception is that the ISS is in a “zero-gravity” environment.

We begin by calculating the magnitude of Earth’s gravitational acceleration, which we will denote as g^\prime, at the typical orbital altitude of the ISS. The necessary physical constants are:

  • Gravitational constant: G \approx 6.674 \times 10^{-11} \, \text{N} \cdot \text{m}^2/\text{kg}^2
  • Mass of the Earth: M_{\oplus} \approx 5.972 \times 10^{24} \, \text{kg}
  • Mean radius of the Earth: R_{\oplus} \approx 6.371 \times 10^{6} \, \text{m}
  • Typical ISS altitude: h \approx 400 \, \text{km} = 4.0 \times 10^{5} \, \text{m}

The orbital radius r is the sum of the Earth’s radius and the altitude.

r = R_{\oplus} + h = 6.371 \times 10^{6} \, \text{m} + 0.4 \times 10^{6} \, \text{m} = 6.771 \times 10^{6} \, \text{m}

The gravitational acceleration g^\prime at this radius is:

g^\prime = G \frac{M_{\oplus}}{r^2}

Substituting the values:

g^\prime = (6.674 \times 10^{-11}) \frac{5.972 \times 10^{24}}{(6.771 \times 10^{6})^2} \approx 8.69 \, \text{m/s}^2

This value is approximately 88.6\% of the gravitational acceleration at the surface (g \approx 9.81 \, \text{m/s}^2), confirming that the station is well within Earth’s gravitational influence.
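A short script (my own check, using the constants listed above) reproduces this value:

```python
# Numeric check of g' = G M / r^2 at ISS altitude (values from the text)
G = 6.674e-11        # N m^2 / kg^2
M = 5.972e24         # kg
R = 6.371e6          # m
h = 4.0e5            # m

r = R + h            # 6.771e6 m
g_prime = G * M / r**2
print(round(g_prime, 2))         # 8.69 m/s^2
print(round(g_prime / 9.81, 3))  # 0.886 of surface gravity
```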

To formalize the sensation of weightlessness, we can analyze the situation from a non-inertial reference frame S^\prime that co-rotates with the ISS.

In this frame, one must introduce a fictitious centrifugal force, \mathbf F_{cf}, which acts radially outward. For an object of mass m inside the station, the net force \mathbf F^\prime_{net} in this rotating frame is the vector sum of the real gravitational force and this fictitious force:

\mathbf F^\prime_{net} = \mathbf F_{grav} + \mathbf F_{cf}

With the outward radial direction defined by the unit vector \mathbf u_r, the forces are:

\begin{aligned} & \mathbf F_{grav} = -G \frac{M_{\oplus} m}{r^2} \mathbf u_r = -m g^\prime \mathbf u_r \\ & \mathbf F_{cf} = +m \frac{v^2}{r} \mathbf u_r = +m a_c \mathbf u_r \end{aligned}

The net force in the co-rotating frame is then:

\mathbf F^\prime_{net} = (-m g^\prime + m a_c) \mathbf u_r

Now, we calculate the centripetal acceleration a_c required to maintain a stable circular orbit at this radius. Its magnitude depends on the orbital velocity v; for the ISS, the typical orbital velocity is v \approx 7.66 \, \text{km/s} = 7.66 \times 10^3 \, \text{m/s}. The required magnitude is:

a_c = \frac{v^2}{r}

Substituting the values for the ISS:

a_c = \frac{(7.66 \times 10^3)^2}{6.771 \times 10^{6}} \approx 8.66 \, \text{m/s}^2

Comparing the two results, we find that g^\prime \approx a_c. The gravitational acceleration provided by the Earth at that altitude is almost identical to the centripetal acceleration required to keep the station in its orbit. This equality demonstrates that gravity is the sole force responsible for continuously altering the station’s velocity vector, pulling it into a circular path. This is the definition of free fall.
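The same comparison can be done numerically (an illustration using the ISS figures quoted above):

```python
# Centripetal acceleration for the ISS orbit, compared with g' at that radius
G, M = 6.674e-11, 5.972e24
r = 6.771e6                      # orbital radius (m)
v = 7.66e3                       # orbital speed (m/s)

a_c = v**2 / r                   # required centripetal acceleration
g_prime = G * M / r**2           # gravitational acceleration at r

# gravity supplies almost exactly the needed centripetal acceleration
print(abs(g_prime - a_c) / g_prime)   # relative difference of ~0.3%
```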

Since our calculation has shown that g^\prime \approx a_c:

\mathbf F^\prime_{net} = (-m g^\prime + m a_c) \mathbf u_r \approx \mathbf 0

The apparent force on any object within the co-moving frame of the ISS is zero. There is no net force to press an astronaut against the walls of the station, leading to the condition of persistent weightlessness experienced on board.

Exercise 1.2

Is it possible to find a curved surface and a lattice of rods arranged on it that cannot be flattened out, but can change shape?

Yes, such a surface exists. The inability to be “flattened out” implies the surface has non-zero Gaussian curvature, K \neq 0, and, according to Gauss’s theorema egregium, the Gaussian curvature K of a regular surface S \subset \mathbb{R}^3 is an intrinsic quantity.

For a lattice of rods, this corresponds to a polyhedral surface where the sum of the angles at one or more vertices is not 2\pi.

The ability to “change shape” means the structure is flexible, capable of undergoing an isometric deformation—a continuous change in its embedding in \mathbb{R}^3 that preserves all rod lengths.

The Connelly sphere is an example of a non-convex, closed polyhedron. It is flexible, yet because its total curvature is non-zero (it is topologically a sphere), it cannot be flattened onto a plane.

Exercise 2.1

Prove that, in an orthonormal basis, equation (5) is equivalent to equation (6).

Hint: Do it in two dimensions. Then - it is slightly involved - we encourage you to try to do it in any dimension.

Equation (5) is the expression of the dot product of two vectors \mathbf V and \mathbf W:

\mathbf V \cdot \mathbf W = |\mathbf V||\mathbf W|\cos(\theta)

Equation (6) is the orthonormal expression of the dot product:

\mathbf V \cdot \mathbf W = V^1 W^1 + V^2 W^2 + \dots + V^N W^N

Let’s define the basis vectors as:

\mathbf e_i, \quad i = 1,\dots,N

In an orthonormal basis:

\mathbf e_i \cdot \mathbf e_j = |\mathbf e_i||\mathbf e_j|\cos(\theta_{ij}) = \delta_{ij}

For two dimensions (N=2), we have that:

\begin{aligned} & \mathbf V = V^1 \mathbf e_1 + V^2 \mathbf e_2 \\ & \mathbf W = W^1 \mathbf e_1 + W^2 \mathbf e_2 \end{aligned}

Let \alpha be the angle the vector \mathbf V makes with \mathbf e_1 and \beta the angle the vector \mathbf W makes with \mathbf e_1. The angle \theta between \mathbf V and \mathbf W is the difference of these two angles: \theta = \alpha - \beta.

The components of \mathbf V and \mathbf W can be expressed in terms of their magnitudes and these angles:

\begin{aligned} V^1 &= |\mathbf V| \cos(\alpha) \\ V^2 &= |\mathbf V| \sin(\alpha) \\ W^1 &= |\mathbf W| \cos(\beta) \\ W^2 &= |\mathbf W| \sin(\beta) \end{aligned}

Substituting these trigonometric expressions:

\begin{aligned} V^1 W^1 + V^2 W^2 & = (|\mathbf V| \cos(\alpha)) (|\mathbf W| \cos(\beta)) + (|\mathbf V| \sin(\alpha)) (|\mathbf W| \sin(\beta)) \\ & = |\mathbf V||\mathbf W| (\cos(\alpha) \cos(\beta) + \sin(\alpha) \sin(\beta)) \end{aligned}

The expression in the parentheses is the trigonometric identity for the cosine of the difference of two angles:

\cos(\alpha - \beta) = \cos(\alpha) \cos(\beta) + \sin(\alpha) \sin(\beta)

By substituting this identity back, we obtain:

V^1 W^1 + V^2 W^2 = |\mathbf V||\mathbf W| \cos(\alpha - \beta) = |\mathbf V||\mathbf W| \cos(\theta)

So the two definitions are equivalent.
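A quick numerical sanity check of the 2D result (an illustration, with arbitrarily chosen magnitudes and angles):

```python
import math

# Build V and W from magnitudes and angles alpha, beta, then compare
# the component-sum formula with |V||W|cos(alpha - beta).
Vmag, alpha = 3.0, 0.7
Wmag, beta  = 2.0, 0.2

V = (Vmag * math.cos(alpha), Vmag * math.sin(alpha))
W = (Wmag * math.cos(beta),  Wmag * math.sin(beta))

algebraic = V[0] * W[0] + V[1] * W[1]              # V^1 W^1 + V^2 W^2
geometric = Vmag * Wmag * math.cos(alpha - beta)   # |V||W| cos(theta)
print(abs(algebraic - geometric))                  # ~ 0 up to rounding
```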

We will extend the proof of equivalence between the geometric and algebraic definitions of the dot product to an arbitrary N-dimensional space, \mathbb{R}^N.

The core of the argument rests on the observation that any two vectors inhabit a two-dimensional subspace, a plane.

By choosing a coordinate system aligned with this plane, the N-dimensional problem simplifies to the two-dimensional case we have already proven.

Let us consider two non-zero vectors \mathbf V and \mathbf W in \mathbb{R}^N. If the vectors are collinear, meaning \mathbf W = k\mathbf V for some scalar k, the angle \theta is either 0 or \pi.

The geometric definition becomes |\mathbf V||\mathbf W|\cos(\theta) = \pm|\mathbf V||k||\mathbf V| = k|\mathbf V|^2. The algebraic definition is \sum V^i (k V^i) = k \sum (V^i)^2 = k|\mathbf V|^2. The equivalence holds in this simple case.

Assuming the vectors are not collinear, they are linearly independent and define a unique plane within the N-dimensional space.

The strategy is to construct a new orthonormal basis for \mathbb{R}^N, let’s call it \{\mathbf u_1, \mathbf u_2, \dots, \mathbf u_N\}, that is particularly suited to this plane.

We can orient this new basis as follows. First, we define the first basis vector \mathbf u_1 to be a unit vector pointing in the same direction as \mathbf V.

\mathbf u_1 = \frac{\mathbf V}{|\mathbf V|}

Next, we construct a second basis vector, \mathbf u_2, that lies within the plane spanned by \mathbf V and \mathbf W and is orthogonal to \mathbf u_1. This can be achieved using the Gram-Schmidt process. We take the component of \mathbf W that is orthogonal to \mathbf u_1 and normalize it.

\mathbf u_2 = \frac{\mathbf W - (\mathbf W \cdot \mathbf u_1)\mathbf u_1}{|\mathbf W - (\mathbf W \cdot \mathbf u_1)\mathbf u_1|}

The vectors \mathbf u_1 and \mathbf u_2 form an orthonormal basis for the plane containing \mathbf V and \mathbf W. The remaining N-2 basis vectors, \{\mathbf u_3, \dots, \mathbf u_N\}, can be chosen to be mutually orthogonal and orthogonal to the plane itself, completing the orthonormal basis for the entire space \mathbb{R}^N.

A property of the dot product is its invariance under a change of orthonormal basis.

This means the value of \mathbf V \cdot \mathbf W is the same regardless of the orthonormal coordinate system used for the calculation:

\sum_{i=1}^N V^i W^i = \sum_{j=1}^N V^{\prime j} W^{\prime j}

where V^i and W^i are the components in the original basis \{\mathbf e_i\}, and V^{\prime j} and W^{\prime j} are the components in our new, convenient basis \{\mathbf u_j\}. We will now calculate the dot product in this new basis.

Let’s find the components of \mathbf V and \mathbf W in the \{\mathbf u_j\} basis.

For vector \mathbf V, since \mathbf V is aligned with \mathbf u_1 by construction, its representation is simple:

\mathbf V = |\mathbf V| \mathbf u_1

Its components V^{\prime j} are (|\mathbf V|, 0, 0, \dots, 0).

For vector \mathbf W, since \mathbf W lies entirely in the plane spanned by \mathbf u_1 and \mathbf u_2, its components W^{\prime j} will be zero for all j > 2. The non-zero components are found by projecting \mathbf W onto the basis vectors \mathbf u_1 and \mathbf u_2:

\begin{aligned} W^{\prime 1} &= \mathbf W \cdot \mathbf u_1 = |\mathbf W||\mathbf u_1|\cos(\theta) = |\mathbf W|\cos(\theta) \\ W^{\prime 2} &= \mathbf W \cdot \mathbf u_2 = |\mathbf W||\mathbf u_2|\cos(\frac{\pi}{2}-\theta) = |\mathbf W|\sin(\theta) \end{aligned}

The angle between \mathbf W and \mathbf u_1 is \theta, and the angle between \mathbf W and \mathbf u_2 is \frac{\pi}{2}-\theta because \mathbf u_1 and \mathbf u_2 are orthogonal.

Now we compute the sum of the products of these new components:

\mathbf V \cdot \mathbf W = \sum_{j=1}^N V^{\prime j} W^{\prime j} = V^{\prime 1} W^{\prime 1} + V^{\prime 2} W^{\prime 2} + \sum_{j=3}^N V^{\prime j} W^{\prime j}

Substituting the component values we found:

\mathbf V \cdot \mathbf W = (|\mathbf V|) (|\mathbf W|\cos(\theta)) + (0)(|\mathbf W|\sin(\theta)) + \sum_{j=3}^N (0)(0)

The expression simplifies considerably:

\mathbf V \cdot \mathbf W = |\mathbf V||\mathbf W|\cos(\theta)

We have shown that the algebraic component-sum formula, when calculated in a specially chosen basis, yields the geometric formula.

Because the dot product is invariant to the choice of orthonormal basis, this result holds for any orthonormal basis.

The algebraic definition \sum V^i W^i is therefore equivalent to the geometric definition |\mathbf V||\mathbf W|\cos(\theta) in any number of dimensions.
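The construction above can be checked numerically; the sketch below (my own illustration) builds \mathbf u_1 and \mathbf u_2 by Gram-Schmidt for random vectors in \mathbb{R}^5 and compares the two formulas:

```python
import numpy as np

# Adapted basis for the plane of V and W: u1 along V, u2 the normalized
# component of W orthogonal to u1 (Gram-Schmidt), as in the text.
rng = np.random.default_rng(0)
N = 5
V = rng.normal(size=N)
W = rng.normal(size=N)

u1 = V / np.linalg.norm(V)
w_perp = W - (W @ u1) * u1
u2 = w_perp / np.linalg.norm(w_perp)

# components of W in the adapted basis: (|W|cos(theta), |W|sin(theta), 0, ...)
W1, W2 = W @ u1, W @ u2
algebraic = V @ W                        # component-sum formula in the e-basis
geometric = np.linalg.norm(V) * W1       # |V| * |W|cos(theta)
print(abs(algebraic - geometric))        # ~ 0
```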

Exercise page 68

The relation is

V_n = g_{mn} V^m

We leave it to the reader to prove this.

The contravariant components of a vector are defined as the coefficients of its expansion as a linear combination of the basis vectors:

\mathbf V \equiv V^n \mathbf e_n

The covariant components of a vector are defined as:

V_n \equiv \mathbf V \cdot \mathbf e_n

Replacing the expansion in contravariant components (and renaming the dummy index n \to m):

V_n = \mathbf V \cdot \mathbf e_n = V^m \mathbf e_m \cdot \mathbf e_n = V^m \left(\mathbf e_m \cdot \mathbf e_n\right)

By the definition of the metric tensor:

g_{mn} \equiv \mathbf e_m \cdot \mathbf e_n

we get:

V_n = V^m \left(\mathbf e_m \cdot \mathbf e_n\right) = g_{mn} V^m
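A small numeric check (my own example, with a deliberately non-orthonormal basis) confirms that both routes to the covariant components agree:

```python
import numpy as np

# Check V_n = g_{mn} V^m in a non-orthonormal 2D basis.
e1 = np.array([1.0, 0.0])
e2 = np.array([1.0, 1.0])          # not orthogonal to e1
E = np.column_stack([e1, e2])      # columns are the basis vectors

g = E.T @ E                        # g_{mn} = e_m . e_n
V_contra = np.array([2.0, 3.0])    # contravariant components V^m
V_vec = E @ V_contra               # the actual vector V = V^m e_m

V_cov_def = np.array([V_vec @ e1, V_vec @ e2])   # definition: V_n = V . e_n
V_cov_metric = g @ V_contra                      # lowering: V_n = g_{mn} V^m
print(np.allclose(V_cov_def, V_cov_metric))      # True
```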

Exercise page 78

It is easy to prove, and the reader is encouraged to do it, that if you take any tensor with a bunch of indices, any number of indices upstairs and downstairs,

{T^{nmr}}_{pqs} \tag{25}

and you contract a pair of them (one contravariant and one co-variant), say r and q, you get

{T^{nmr}}_{prs} \tag{26}

where the expression implicitly means a sum of components over r, and this is a new tensor.

Notice that the tensor of expression (25) has six indices, whereas the tensor of expression (26) has only four.

Let’s consider the tensor:

{T^{nmr}}_{pqs}

This tensor transforms as:

\left({T^{nmr}}_{pqs} \right)^\prime = \frac{\partial Y^n}{\partial X^a} \frac{\partial Y^m}{\partial X^b} \frac{\partial Y^r}{\partial X^c} \frac{\partial X^d}{\partial Y^p} \frac{\partial X^e}{\partial Y^q} \frac{\partial X^f}{\partial Y^s} {T^{abc}}_{def}

If we now contract r and q we have:

\begin{aligned} \left({T^{nmr}}_{prs} \right)^\prime & = \frac{\partial Y^n}{\partial X^a} \frac{\partial Y^m}{\partial X^b} \frac{\partial Y^r}{\partial X^c} \frac{\partial X^d}{\partial Y^p} \frac{\partial X^e}{\partial Y^r} \frac{\partial X^f}{\partial Y^s} {T^{abc}}_{def} \\ & = \frac{\partial Y^n}{\partial X^a} \frac{\partial Y^m}{\partial X^b} \left(\frac{\partial Y^r}{\partial X^c} \frac{\partial X^e}{\partial Y^r} \right) \frac{\partial X^d}{\partial Y^p} \frac{\partial X^f}{\partial Y^s} {T^{abc}}_{def} \\ & = \frac{\partial Y^n}{\partial X^a} \frac{\partial Y^m}{\partial X^b} \left({\delta^e}_c\right) \frac{\partial X^d}{\partial Y^p} \frac{\partial X^f}{\partial Y^s} {T^{abc}}_{def} \\ & = \frac{\partial Y^n}{\partial X^a} \frac{\partial Y^m}{\partial X^b} \frac{\partial X^d}{\partial Y^p} \frac{\partial X^f}{\partial Y^s} {T^{abc}}_{dcf} \\ & = \frac{\partial Y^n}{\partial X^a} \frac{\partial Y^m}{\partial X^b} \frac{\partial X^d}{\partial Y^p} \frac{\partial X^f}{\partial Y^s} {T^{abr}}_{drf} \end{aligned}

After applying the Kronecker delta, c becomes a dummy summation index, which we can rename c \to r. The original tensor with six indices is therefore left with only four free indices after the contraction, and the result transforms as a rank-4 tensor.
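This can be verified numerically with random data (an illustration; `J` plays the role of the Jacobian matrix \partial Y / \partial X): contracting after transforming gives the same rank-(2,2) object as transforming the already-contracted tensor.

```python
import numpy as np

# Random T^{nmr}_{pqs} and a random invertible Jacobian.
# J[i, a] stands for dY^i/dX^a; Jinv[a, i] for dX^a/dY^i.
rng = np.random.default_rng(1)
d = 3
T = rng.normal(size=(d, d, d, d, d, d))
J = rng.normal(size=(d, d))
Jinv = np.linalg.inv(J)

# transform all six indices, then contract r with q
T_prime = np.einsum('na,mb,rc,dp,eq,fs,abcdef->nmrpqs',
                    J, J, J, Jinv, Jinv, Jinv, T)
contract_after = np.einsum('nmrprs->nmps', T_prime)

# contract first (c with e), then transform the remaining four indices
S = np.einsum('abcdcf->abdf', T)
contract_before = np.einsum('na,mb,dp,fs,abdf->nmps', J, J, Jinv, Jinv, S)

print(np.allclose(contract_after, contract_before))   # True
```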

Exercise page 93

It is easy to check that we are in the case where the 40 equations with 40 unknowns do lead to an existing and unique solution. The reader is invited to verify it in two dimensions.

We want to find a coordinate system X such that at a specific point, which we can place at the origin X_0 = 0, the metric tensor is the Kronecker delta and its first derivatives vanish. This special coordinate system is known as a Gaussian normal coordinate system.

Let us start with a general coordinate system Y and a metric g_{nr}(Y). We seek a transformation to a new system X(Y) where the new metric g^\prime_{mn}(X) satisfies:

g^\prime_{mn}(0) = \delta_{mn}

and:

\frac{\partial g^\prime_{mn}}{\partial X^k}\bigg|_{X=0} = 0

We can express the transformation as a Taylor expansion around the origin. Assuming both coordinate systems share the same origin, X(0)= Y(0) = 0, the expansion begins at the linear term.

We can further simplify by aligning the axes at the origin, leading to the following form for the transformation up to the second order:

X^m = Y^m + {C^m}_{nr} Y^n Y^r + \dots

The coefficients {C^m}_{nr} are constants that we need to determine. Due to the symmetry in the product Y^n Y^r, these coefficients are symmetric in their lower indices, {C^m}_{nr} = {C^m}_{rn}. For an N-dimensional space, there are N choices for the index m. The pair (n, r) has \frac{N(N+1)}{2} independent components. This gives a total of \frac{N^2(N+1)}{2} unknown coefficients to be determined.

Similarly, the first derivatives of the metric tensor, \frac{\partial g_{nr}}{\partial Y^k}, have \frac{N(N+1)}{2} independent components in the symmetric pair (n, r) for each of the N values of the derivative index k, giving \frac{N^2(N+1)}{2} conditions to impose.

Since the number of unknowns is the same as the number of conditions, and the metric tensor is invertible, this system of equations has a unique solution, allowing us to solve for the coefficients.
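A quick count (illustration) confirms the agreement, including the 40-equations-with-40-unknowns case mentioned in the book for N = 4:

```python
# Unknown coefficients C^m_{nr} vs. metric-derivative conditions d_k g_{nr}.
for N in (2, 3, 4):
    unknowns = N * (N * (N + 1) // 2)    # m free, (n, r) a symmetric pair
    conditions = N * (N * (N + 1) // 2)  # k free, (n, r) a symmetric pair
    print(N, unknowns, conditions)       # 2 6 6 / 3 18 18 / 4 40 40
```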

Let’s see how this works in a two-dimensional space.

We are given the tensor equation that relates a general coordinate system Y to a specific one, the Gaussian normal coordinate system X:

X^m = Y^m + {C^m}_{nr} Y^nY^r

This equation describes the relationship between the coordinates in a small neighborhood around a point, which we’ll set as the origin for both systems (X=0, Y=0).

Y^m are the coordinates in a general, possibly curved, metric space. We want to understand the properties of this space near the origin.

X^m are the “Gaussian normal coordinates”. In this special coordinate system, space is locally flat at the origin.

{C^m}_{nr} are a set of coefficients that capture the second-order difference between the two coordinate systems.

The goal is to derive the explicit form of the 6 component equations in a 2D space (m, n, r = 1, 2) that arise from this relationship. These equations will express the six first-order derivatives of the metric tensor components g_{mn} in the Y system as functions of the six unique coefficients {C^m}_{nr}.

The given equation is a second-order approximation of the transformation from Y coordinates to X coordinates.

Let’s see this from a Taylor series expansion of the function X^m(Y^1, Y^2) around the origin Y=0:

X^m(Y) = X^m(0) + \frac{\partial X^m}{\partial Y^n}\bigg|_0 Y^n + \frac{1}{2} \frac{\partial^2 X^m}{\partial Y^n \partial Y^r}\bigg|_0 Y^n Y^r + \dots

We can make some simplifications for our setup:

  1. the origins of the two systems coincide: X^m(0) = 0;
  2. the axes of the two systems are aligned at the origin. This means the transformation matrix at the origin is the identity matrix: \frac{\partial X^m}{\partial Y^n}\big|_0 = {\delta^m}_n.

Applying these to the Taylor expansion gives:

X^m(Y) \approx {\delta^m}_n Y^n + \frac{1}{2} \frac{\partial^2 X^m}{\partial Y^n \partial Y^r}\bigg|_0 Y^n Y^r = Y^m + \frac{1}{2} \frac{\partial^2 X^m}{\partial Y^n \partial Y^r}\bigg|_0 Y^n Y^r

Comparing this to the given equation, X^m = Y^m + {C^m}_{nr} Y^nY^r, we can identify the coefficients {C^m}_{nr}:

{C^m}_{nr} = \frac{1}{2} \frac{\partial^2 X^m}{\partial Y^n \partial Y^r}\bigg|_0

This shows that the coefficients {C^m}_{nr} are half of the second partial derivatives of the transformation function, evaluated at the origin. Note that since partial derivatives commute (\partial_n \partial_r = \partial_r \partial_n), these coefficients are symmetric in their lower indices: {C^m}_{nr} = {C^m}_{rn}.

Now, let’s use the fundamental transformation law for a metric tensor. The metric g_{nr} in the Y system is related to the metric g^\prime_{mp} in the X system by:

g_{nr}(Y) = \frac{\partial X^m}{\partial Y^n} \frac{\partial X^p}{\partial Y^r} g^\prime_{mp}(X(Y))

Let’s first evaluate this at the origin Y=0:

g_{nr}(0) = \frac{\partial X^m}{\partial Y^n}\bigg|_0 \frac{\partial X^p}{\partial Y^r}\bigg|_0 g^\prime_{mp}(0) = {\delta^m}_n {\delta^p}_r \delta_{mp} = \delta_{nr}

This confirms that at the origin of the Y system, the metric is also the standard Euclidean metric.

Next, we differentiate the metric transformation law with respect to a coordinate Y^k. We apply the product rule for derivatives:

\begin{aligned} \frac{\partial g_{nr}}{\partial Y^k} = & \frac{\partial}{\partial Y^k} \left( \frac{\partial X^m}{\partial Y^n} \frac{\partial X^p}{\partial Y^r} g^\prime_{mp} \right) \\ \frac{\partial g_{nr}}{\partial Y^k} =& \left(\frac{\partial^2 X^m}{\partial Y^k \partial Y^n}\right) \frac{\partial X^p}{\partial Y^r} g^\prime_{mp} + \frac{\partial X^m}{\partial Y^n} \left(\frac{\partial^2 X^p}{\partial Y^k \partial Y^r}\right) g^\prime_{mp} \\ & + \frac{\partial X^m}{\partial Y^n} \frac{\partial X^p}{\partial Y^r} \left(\frac{\partial g^\prime_{mp}}{\partial Y^k}\right) \end{aligned}

To evaluate the last term, we use the chain rule:

\frac{\partial g^\prime_{mp}}{\partial Y^k} = \frac{\partial g^\prime_{mp}}{\partial X^c} \frac{\partial X^c}{\partial Y^k}

The full expression is:

\frac{\partial g_{nr}}{\partial Y^k} = \frac{\partial^2 X^m}{\partial Y^k \partial Y^n} \frac{\partial X^p}{\partial Y^r} g^\prime_{mp} + \frac{\partial X^m}{\partial Y^n} \frac{\partial^2 X^p}{\partial Y^k \partial Y^r} g^\prime_{mp} + \frac{\partial X^m}{\partial Y^n} \frac{\partial X^p}{\partial Y^r} \frac{\partial g^\prime_{mp}}{\partial X^c} \frac{\partial X^c}{\partial Y^k}

Now, we evaluate this expression at the origin Y=0, using our known conditions:

\begin{aligned} & \frac{\partial X^a}{\partial Y^b}\big|_0 = {\delta^a}_b \\ & g^\prime_{ab}(0) = \delta_{ab} \\ & \frac{\partial g^\prime_{ab}}{\partial X^c}\big|_0 = 0 \\ & \frac{\partial^2 X^m}{\partial Y^n \partial Y^r}\big|_0 = 2{C^m}_{nr} \end{aligned}

Substituting these values in:

\frac{\partial g_{nr}}{\partial Y^k}\bigg|_0 = (2{C^m}_{kn}) ({\delta^p}_r) (\delta_{mp}) + ({\delta^m}_n) (2{C^p}_{kr}) (\delta_{mp}) + ({\delta^m}_n) ({\delta^p}_r) (0) ({\delta^c}_k)

The last term vanishes completely. Let’s simplify the first two terms by carrying out the sums implied by the Kronecker deltas.

In the first term, \delta_{mp} changes the index m to p. The expression becomes (2{C^p}_{kn})({\delta^p}_r). Then, {\delta^p}_r changes the index p to r. The result is 2{C^r}_{kn}.

In the second term, \delta_{mp} changes the index m to p. The expression becomes ({\delta^p}_n)(2{C^p}_{kr}). Then, {\delta^p}_n changes the index p to n. The result is 2{C^n}_{kr}.

So, we arrive at the relationship:

\frac{\partial g_{nr}}{\partial Y^k}\bigg|_0 = 2{C^r}_{kn} + 2{C^n}_{kr}

Using the symmetry of the metric (g_{nr} = g_{rn}) and of the coefficients ({C^m}_{nr} = {C^m}_{rn}), we can rewrite this by relabeling indices into a more convenient form:

\frac{\partial g_{mn}}{\partial Y^r}\bigg|_0 = 2({C^n}_{mr} + {C^m}_{nr})

We are now ready to write the six explicit equations for a 2D space where indices can be 1 or 2.

The six known quantities are the first partial derivatives of the three unique metric components:

\begin{aligned} & \partial_1 g_{11}, \quad \partial_1 g_{12}, \quad \partial_1 g_{22} \\ & \partial_2 g_{11}, \quad \partial_2 g_{12}, \quad \partial_2 g_{22} \end{aligned}

The six unknowns are the coefficients from the coordinate transformation:

\begin{aligned} & {C^1}_{11}, \quad {C^1}_{12} (={C^1}_{21}), \quad {C^1}_{22} \\ & {C^2}_{11}, \quad {C^2}_{12} (={C^2}_{21}), \quad {C^2}_{22} \end{aligned}

We start with the system of six linear equations where the metric derivatives are known quantities and the {C^m}_{nr} coefficients are the unknowns we must find.

The system is:

\begin{aligned} & \frac{\partial g_{11}}{\partial Y^1} = 4{C^1}_{11} \\ & \frac{\partial g_{12}}{\partial Y^1} = 2({C^1}_{12} + {C^2}_{11}) \\ & \frac{\partial g_{22}}{\partial Y^1} = 4{C^2}_{12} \\ & \frac{\partial g_{11}}{\partial Y^2} = 4{C^1}_{12} \\ & \frac{\partial g_{12}}{\partial Y^2} = 2({C^1}_{22} + {C^2}_{12}) \\ & \frac{\partial g_{22}}{\partial Y^2} = 4{C^2}_{22} \end{aligned}

We can solve this system by simple algebraic manipulation.

From the first, third, fourth, and sixth equations, we can directly isolate four of the coefficients:

\begin{aligned} & {C^1}_{11} = \frac{1}{4} \frac{\partial g_{11}}{\partial Y^1} \\ & {C^2}_{22} = \frac{1}{4} \frac{\partial g_{22}}{\partial Y^2} \\ & {C^1}_{12} = \frac{1}{4} \frac{\partial g_{11}}{\partial Y^2} \\ & {C^2}_{12} = \frac{1}{4} \frac{\partial g_{22}}{\partial Y^1} \end{aligned}

Now we use these results to find the remaining two coefficients from the second and fifth equations.

To find {C^2}_{11}, we rearrange the second equation:

{C^2}_{11} = \frac{1}{2} \frac{\partial g_{12}}{\partial Y^1} - {C^1}_{12}

Substituting the expression for {C^1}_{12}:

{C^2}_{11} = \frac{1}{2} \frac{\partial g_{12}}{\partial Y^1} - \frac{1}{4} \frac{\partial g_{11}}{\partial Y^2}

To find {C^1}_{22}, we rearrange the fifth equation:

{C^1}_{22} = \frac{1}{2} \frac{\partial g_{12}}{\partial Y^2} - {C^2}_{12}

Substituting the expression for {C^2}_{12}:

{C^1}_{22} = \frac{1}{2} \frac{\partial g_{12}}{\partial Y^2} - \frac{1}{4} \frac{\partial g_{22}}{\partial Y^1}

The six coefficients {C^m}_{nr} are uniquely determined by the first partial derivatives of the metric tensor components in the original coordinate system, evaluated at the origin.

\begin{aligned} & {C^1}_{11} = \frac{1}{4} \frac{\partial g_{11}}{\partial Y^1} \\ & {C^1}_{12} = \frac{1}{4} \frac{\partial g_{11}}{\partial Y^2} \\ & {C^1}_{22} = \frac{1}{2} \frac{\partial g_{12}}{\partial Y^2} - \frac{1}{4} \frac{\partial g_{22}}{\partial Y^1} \\ & {C^2}_{11} = \frac{1}{2} \frac{\partial g_{12}}{\partial Y^1} - \frac{1}{4} \frac{\partial g_{11}}{\partial Y^2} \\ & {C^2}_{12} = \frac{1}{4} \frac{\partial g_{22}}{\partial Y^1} \\ & {C^2}_{22} = \frac{1}{4} \frac{\partial g_{22}}{\partial Y^2} \end{aligned}
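As a check, the linear system can be solved symbolically; the sketch below (my own verification, with the symbol `dg{k}{m}{n}` standing for the known derivative \partial g_{mn} / \partial Y^k) reproduces the coefficients above:

```python
import sympy as sp

# Unknown coefficients C^m_{nr} (symmetric pair n <= r) and known derivatives.
C = {(m, n, r): sp.Symbol(f'C{m}{n}{r}') for m in (1, 2)
     for n in (1, 2) for r in (1, 2) if n <= r}
dg = {(k, m, n): sp.Symbol(f'dg{k}{m}{n}') for k in (1, 2)
      for m in (1, 2) for n in (1, 2) if m <= n}

# The six equations of the system from the text.
eqs = [
    sp.Eq(dg[(1, 1, 1)], 4 * C[(1, 1, 1)]),
    sp.Eq(dg[(1, 1, 2)], 2 * (C[(1, 1, 2)] + C[(2, 1, 1)])),
    sp.Eq(dg[(1, 2, 2)], 4 * C[(2, 1, 2)]),
    sp.Eq(dg[(2, 1, 1)], 4 * C[(1, 1, 2)]),
    sp.Eq(dg[(2, 1, 2)], 2 * (C[(1, 2, 2)] + C[(2, 1, 2)])),
    sp.Eq(dg[(2, 2, 2)], 4 * C[(2, 2, 2)]),
]
sol = sp.solve(eqs, list(C.values()))
print(sol[C[(1, 2, 2)]])   # equals dg212/2 - dg122/4, matching the text
```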

Exercise page 102

It follows from the definition of the covariant differentiation - namely, to differentiate a vector V at a point P, go to a set Gaussian normal coordinates at P, differentiate the vector in the ordinary manner, treat the object you obtain as a tensor with two indices, change coordinates, etc. - that the Christoffel symbols have a symmetry:

{\Gamma^t}_{rm} = {\Gamma^t}_{mr}

The covariant derivative of a covector, D_r V_m, transforms as a tensor of rank (0,2).

Let X^a be the coordinates in the “old” system (unprimed) and Y^r be the coordinates in the “new” system (primed). The transformation rule is:

\left(D_r V_m\right)^\prime = \frac{\partial X^a}{\partial Y^r}\frac{\partial X^b}{\partial Y^m} D_a V_b

We will now expand the left hand side and the right hand side of this equation separately and then set them equal.

The left hand side is the definition of the covariant derivative in the new (primed) coordinate system:

\left(D_r\right)^\prime \left(V_m\right)^\prime = \frac{\partial \left(V_m\right)^\prime}{\partial Y^r} - \left({\Gamma^t}_{rm} \right)^\prime \left(V_t\right)^\prime

We must express the primed vector components \left(V_m\right)^\prime and \left(V_t\right)^\prime in terms of the unprimed components V_b and V_c. The transformation rule for a covector is:

\begin{aligned} & \left(V_m\right)^\prime = \frac{\partial X^b}{\partial Y^m} V_b\\ & \left(V_t\right)^\prime = \frac{\partial X^c}{\partial Y^t} V_c \end{aligned}

Substitute these into the left hand side expression:

\frac{\partial}{\partial Y^r} \left( \frac{\partial X^b}{\partial Y^m} V_b \right) - \left({\Gamma^t}_{rm}\right)^\prime \left( \frac{\partial X^c}{\partial Y^t} V_c \right)

Now, apply the product rule to the first term:

\left( \frac{\partial^2 X^b}{\partial Y^r \partial Y^m} \right) V_b + \frac{\partial X^b}{\partial Y^m} \frac{\partial V_b}{\partial Y^r} - \left({\Gamma^t}_{rm}\right)^\prime \frac{\partial X^c}{\partial Y^t} V_c

We apply the chain rule on the \frac{\partial V_b}{\partial Y^r} term to get it in terms of the old coordinates:

\frac{\partial V_b}{\partial Y^r} = \frac{\partial V_b}{\partial X^a} \frac{\partial X^a}{\partial Y^r} = (\partial_a V_b) \frac{\partial X^a}{\partial Y^r}

Substituting:

\left( \frac{\partial^2 X^b}{\partial Y^r \partial Y^m} \right) V_b + \frac{\partial X^b}{\partial Y^m} \frac{\partial X^a}{\partial Y^r} (\partial_a V_b) - \left({\Gamma^t}_{rm}\right)^\prime \frac{\partial X^c}{\partial Y^t} V_c

The right hand side is the standard tensor transformation rule:

\frac{\partial X^a}{\partial Y^r}\frac{\partial X^b}{\partial Y^m} D_a V_b

Now, we expand the covariant derivative D_a V_b using its definition in the old (unprimed) system, (D_a V_b = \partial_a V_b - {\Gamma^c}_{ab} V_c):

\frac{\partial X^a}{\partial Y^r}\frac{\partial X^b}{\partial Y^m} \left( \partial_a V_b - {\Gamma^c}_{ab} V_c \right) = \frac{\partial X^a}{\partial Y^r}\frac{\partial X^b}{\partial Y^m} (\partial_a V_b) - \frac{\partial X^a}{\partial Y^r}\frac{\partial X^b}{\partial Y^m} {\Gamma^c}_{ab} V_c

Now we set the final expressions equal to each other:

\left( \frac{\partial^2 X^b}{\partial Y^r \partial Y^m} \right) V_b + \frac{\partial X^a}{\partial Y^r} \frac{\partial X^b}{\partial Y^m} (\partial_a V_b) - \left({\Gamma^t}_{rm}\right)^\prime \frac{\partial X^c}{\partial Y^t} V_c = \frac{\partial X^a}{\partial Y^r}\frac{\partial X^b}{\partial Y^m} (\partial_a V_b) - \frac{\partial X^a}{\partial Y^r}\frac{\partial X^b}{\partial Y^m} {\Gamma^c}_{ab} V_c

The term \frac{\partial X^a}{\partial Y^r} \frac{\partial X^b}{\partial Y^m} (\partial_a V_b) appears on both sides and cancels out:

\left( \frac{\partial^2 X^b}{\partial Y^r \partial Y^m} \right) V_b - \left({\Gamma^t}_{rm}\right)^\prime \frac{\partial X^c}{\partial Y^t} V_c = - \frac{\partial X^a}{\partial Y^r}\frac{\partial X^b}{\partial Y^m} {\Gamma^c}_{ab} V_c

To simplify, let’s make the free index on the vector component the same in all terms. We can rename the dummy index b to c in the first term:

\left( \frac{\partial^2 X^c}{\partial Y^r \partial Y^m} \right) V_c - \left({\Gamma^t}_{rm}\right)^\prime \frac{\partial X^c}{\partial Y^t} V_c = - \frac{\partial X^a}{\partial Y^r}\frac{\partial X^b}{\partial Y^m} {\Gamma^c}_{ab} V_c

Since this equation must hold for any covector field V_c, the coefficients of V_c on both sides must be equal:

\frac{\partial^2 X^c}{\partial Y^r \partial Y^m} - \left({\Gamma^t}_{rm}\right)^\prime \frac{\partial X^c}{\partial Y^t} = - \frac{\partial X^a}{\partial Y^r}\frac{\partial X^b}{\partial Y^m} {\Gamma^c}_{ab}

Rearranging to isolate the term with the new Christoffel symbol \left({\Gamma^t}_{rm}\right)^\prime:

\left({\Gamma^t}_{rm}\right)^\prime \frac{\partial X^c}{\partial Y^t} = \frac{\partial X^a}{\partial Y^r}\frac{\partial X^b}{\partial Y^m} {\Gamma^c}_{ab} + \frac{\partial^2 X^c}{\partial Y^r \partial Y^m}

Finally, to solve for \left({\Gamma^t}_{rm}\right)^\prime, we multiply both sides by the inverse Jacobian matrix element \frac{\partial Y^k}{\partial X^c} and use \frac{\partial X^c}{\partial Y^t}\frac{\partial Y^k}{\partial X^c} = {\delta^k}_t; relabeling the resulting free index k back to t gives:

\left({\Gamma^t}_{rm}\right)^\prime = \left( \frac{\partial X^a}{\partial Y^r}\frac{\partial X^b}{\partial Y^m} {\Gamma^c}_{ab} + \frac{\partial^2 X^c}{\partial Y^r \partial Y^m} \right) \frac{\partial Y^t}{\partial X^c}
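As a sanity check (my own sketch, not from the book), the transformation law can be verified numerically for the map between Cartesian coordinates X = (x, y) and polar coordinates Y = (r, \theta) on the plane. Since the Cartesian Christoffel symbols vanish, the law reduces to the second-derivative term, and it should reproduce the familiar polar-coordinate symbols. All function names below are hypothetical helpers:

```python
import math

# Numeric sanity check (my own sketch, not from the book). For the map
# between Cartesian X = (x, y) and polar Y = (r, theta) coordinates the
# Cartesian Christoffel symbols vanish, so the transformation law reduces to
#   Gamma'^t_rm = (d^2 X^c / dY^r dY^m) (dY^t / dX^c).

def X(y):
    r, th = y
    return [r * math.cos(th), r * math.sin(th)]

def jacobian(y, h=1e-6):
    """J[c][j] = dX^c/dY^j by central differences."""
    J = [[0.0] * 2 for _ in range(2)]
    for j in range(2):
        yp = list(y); yp[j] += h
        ym = list(y); ym[j] -= h
        for c in range(2):
            J[c][j] = (X(yp)[c] - X(ym)[c]) / (2 * h)
    return J

def second_deriv(y, c, r, m, h=1e-4):
    """d^2 X^c / dY^r dY^m by central differences."""
    def f(q):
        return X(q)[c]
    pp = list(y); pp[r] += h; pp[m] += h
    pm = list(y); pm[r] += h; pm[m] -= h
    mp = list(y); mp[r] -= h; mp[m] += h
    mm = list(y); mm[r] -= h; mm[m] -= h
    return (f(pp) - f(pm) - f(mp) + f(mm)) / (4 * h * h)

y0 = [2.0, 0.7]                               # r = 2, theta = 0.7
J = jacobian(y0)
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
Jinv = [[ J[1][1] / det, -J[0][1] / det],     # Jinv[t][c] = dY^t/dX^c
        [-J[1][0] / det,  J[0][0] / det]]

def gamma_prime(t, r, m):
    return sum(second_deriv(y0, c, r, m) * Jinv[t][c] for c in range(2))

print(round(gamma_prime(0, 1, 1), 4))   # Gamma'^r_{theta theta} ≈ -r = -2.0
print(round(gamma_prime(1, 0, 1), 4))   # Gamma'^theta_{r theta} ≈ 1/r = 0.5
```

Because the mixed second derivatives are symmetric in r and m, the same sketch also illustrates the symmetry of the transformed symbols discussed below.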

This is the transformation law for the Christoffel symbols. When we swap the indices r and m in our final expression:

\left({\Gamma^t}_{mr}\right)^\prime = \left( \frac{\partial X^a}{\partial Y^m}\frac{\partial X^b}{\partial Y^r} {\Gamma^c}_{ab} + \frac{\partial^2 X^c}{\partial Y^m \partial Y^r} \right) \frac{\partial Y^t}{\partial X^c}

Let’s analyze the two terms inside the parenthesis.

For the second derivative term, by the equality of mixed partials, the order of differentiation does not matter:

\frac{\partial^2 X^c}{\partial Y^m \partial Y^r} = \frac{\partial^2 X^c}{\partial Y^r \partial Y^m}

So this part is symmetric in r and m.

For the original Christoffel symbol term, let’s look at the first term from the expression for \left({\Gamma^t}_{rm}\right)^\prime:

\frac{\partial X^a}{\partial Y^m}\frac{\partial X^b}{\partial Y^r} {\Gamma^c}_{ab}

Since a and b are dummy indices (they are summed over), we can swap their labels:

\frac{\partial X^b}{\partial Y^m}\frac{\partial X^a}{\partial Y^r} {\Gamma^c}_{ba}

Now, with the assumption that the Christoffel symbols are symmetric in the old coordinate system (i.e., {\Gamma^c}_{ba} = {\Gamma^c}_{ab}), this becomes:

\frac{\partial X^b}{\partial Y^m}\frac{\partial X^a}{\partial Y^r} {\Gamma^c}_{ab}

This is identical to the first term in the expression for \left({\Gamma^t}_{rm}\right)^\prime.

Because both parts of the expression for \left({\Gamma^t}_{rm}\right)^\prime are symmetric in the indices r and m (provided {\Gamma^c}_{ab} is symmetric), it follows that the result must also be symmetric:

\left({\Gamma^t}_{rm}\right)^\prime = \left({\Gamma^t}_{mr}\right)^\prime

Exercise 3.1

Explain why the space can be flat and nevertheless the Christoffel symbols not zero.

The Christoffel symbols are defined as:

{\Gamma^t}_{mn} = \frac{1}{2}g^{rt}\left( \partial_n g_{rm} + \partial_m g_{rn} - \partial_r g_{mn} \right)

They express the rate of change of the basis vectors as one moves through a coordinate system. For example, in a two-dimensional Euclidean space, we can use different coordinate systems to describe the same flat geometry.

A space is considered “flat” if its Riemann curvature tensor is zero everywhere. The Riemann tensor is constructed from the Christoffel symbols and their derivatives, and it is a true tensor. This means if it is zero in one coordinate system, it is zero in all coordinate systems.

However, the Christoffel symbols themselves are not tensors. This means their values depend on the coordinate system used.

Let’s consider, for example, a two-dimensional flat Euclidean space.

If we represent this space in Cartesian coordinates, the metric tensor components are constant (g_{xx} = 1, g_{yy} = 1, g_{xy} = 0). Since the derivatives of the metric are zero, all the Christoffel symbols are zero.

We can also describe the same flat plane using polar coordinates. The metric tensor in polar coordinates is not constant (g_{rr} = 1, g_{\theta\theta} = r^2, g_{r\theta} = 0). Because the component g_{\theta\theta} depends on r, its partial derivatives with respect to r are non-zero. This results in some of the Christoffel symbols being non-zero. For instance, {\Gamma^r}_{\theta\theta} = -r and {\Gamma^\theta}_{r\theta} = 1/r.

Flat space means that a coordinate system exists where the Christoffel symbols are zero (like Cartesian coordinates). But if we choose a different, curvilinear coordinate system (like polar coordinates) to describe that same flat space, the Christoffel symbols can be non-zero because the basis vectors of that coordinate system change from point to point. These non-zero Christoffel symbols reflect the “curvature” of the coordinate lines, not an intrinsic curvature of the space itself.
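To make this concrete, here is a small numeric sketch (my own illustration, not the book's) that plugs the polar metric g = \mathrm{diag}(1, r^2) into the Christoffel formula, with the metric derivatives taken by finite differences; it recovers the non-zero symbols quoted above:

```python
# Numeric sketch (my own, not the book's): the Christoffel symbols of the
# flat plane in polar coordinates Y = (r, theta), obtained by
# finite-differencing the metric g = diag(1, r^2) in the formula
#   Gamma^t_mn = (1/2) g^{rt} (d_n g_rm + d_m g_rn - d_r g_mn).

def metric(y):
    r, theta = y
    return [[1.0, 0.0], [0.0, r * r]]

def d_metric(y, k, h=1e-6):
    """Partial derivative of every metric component along coordinate k."""
    yp = list(y); yp[k] += h
    ym = list(y); ym[k] -= h
    gp, gm = metric(yp), metric(ym)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(y):
    g = metric(y)
    ginv = [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]   # diagonal metric
    dg = [d_metric(y, k) for k in range(2)]
    gam = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for t in range(2):
        for m in range(2):
            for n in range(2):
                gam[t][m][n] = 0.5 * sum(
                    ginv[r][t] * (dg[n][r][m] + dg[m][r][n] - dg[r][m][n])
                    for r in range(2))
    return gam

gam = christoffel([2.0, 0.7])    # evaluate at r = 2 (theta is irrelevant)
print(round(gam[0][1][1], 6))    # Gamma^r_{theta theta} ≈ -r = -2.0
print(round(gam[1][0][1], 6))    # Gamma^theta_{r theta} ≈ 1/r = 0.5
```

The symbols are non-zero even though the plane is flat, exactly as the argument above says.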

Exercise 3.2

Explain why the covariant derivative of the metric tensor is always zero.

The formula for the covariant derivative of a rank-two tensor, T_{mn}, is:

D_r T_{mn} \equiv \partial_r T_{mn} - {\Gamma^t}_{rm} T_{tn} - {\Gamma^t}_{rn} T_{mt}

To show that the covariant derivative of the metric tensor g_{mn} is always zero, we can evaluate this expression in any coordinate system.

At any arbitrary point P, we can establish a locally inertial reference frame using Gaussian normal coordinates. The defining property of this coordinate system at the point P is that all the Christoffel symbols vanish:

{\Gamma^t}_{mn} = 0

Furthermore, at this point P in these coordinates, the metric tensor is equivalent to the Kronecker delta, and its partial derivatives are zero:

\begin{aligned} & g_{mn} = \delta_{mn} \\ & \frac{\partial g_{mn}}{\partial X^r} = 0 \end{aligned}

Substituting these conditions into the general formula for the covariant derivative at point P, we get:

D_r g_{mn} = 0

This shows that the covariant derivative of the metric tensor is zero at point P in Gaussian normal coordinates. However, the covariant derivative of a tensor is itself a tensor, and if all of its components are zero in one coordinate system, they are zero in every coordinate system.

Since the choice of point P was arbitrary, this result holds for all points. Therefore, the covariant derivative of the metric tensor is always zero.
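The same statement can be spot-checked numerically in a coordinate system where neither the metric derivatives nor the Christoffel symbols vanish, for example polar coordinates on the flat plane. This is my own sketch (not the book's), with the exact polar symbols hard-coded:

```python
# Numeric spot check (mine, not from the book) in polar coordinates on the
# flat plane, where neither the metric derivatives nor the Christoffel
# symbols vanish. Index 0 = r, index 1 = theta; g = diag(1, r^2),
# Gamma^r_{theta theta} = -r, Gamma^theta_{r theta} = 1/r.

def metric(r):
    return [[1.0, 0.0], [0.0, r * r]]

def d_metric(r, k, h=1e-6):
    """d g_mn / d Y^k; the metric depends only on r (coordinate k = 0)."""
    if k == 1:
        return [[0.0, 0.0], [0.0, 0.0]]
    gp, gm = metric(r + h), metric(r - h)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(r):
    gam = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    gam[0][1][1] = -r
    gam[1][0][1] = gam[1][1][0] = 1.0 / r
    return gam

def cov_deriv_metric(r, k, m, n):
    """D_k g_mn = d_k g_mn - Gamma^t_km g_tn - Gamma^t_kn g_mt."""
    g, dg, gam = metric(r), d_metric(r, k), christoffel(r)
    return dg[m][n] - sum(gam[t][k][m] * g[t][n] + gam[t][k][n] * g[m][t]
                          for t in range(2))

vals = [abs(cov_deriv_metric(2.0, k, m, n))
        for k in range(2) for m in range(2) for n in range(2)]
print(max(vals) < 1e-9)    # True: every component of D g vanishes
```

The partial derivatives and the Christoffel terms are separately non-zero, but they cancel component by component, as the tensor argument predicts.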

Exercise 3.3

On Earth, with the polar coordinates \theta for latitude and \phi for longitude, find

  1. the metric tensor g_{mn}
  2. its inverse g^{mn}
  3. the Christoffel symbols at point (\theta, \phi)

A point P in space is expressed in Cartesian coordinates in terms of spherical coordinates (r, \theta, \phi), where \theta is the latitude (measured from the equator rather than from the pole):

\begin{aligned} x &= r \cos(\theta) \cos(\phi) \\ y &= r \cos(\theta) \sin(\phi) \\ z &= r \sin(\theta) \end{aligned}

The Jacobian matrix \mathbf{J} transforms an infinitesimal displacement in spherical coordinates, \mathrm d\mathbf{Y} = [\mathrm dr, \mathrm d\theta, \mathrm d\phi]^T, to Cartesian coordinates, \mathrm d\mathbf{X} = [\mathrm dx, \mathrm dy, \mathrm dz]^T.

The components of the Jacobian are J^i_{\ j} = \dfrac{\partial X^i}{\partial Y^j}. Calculating the partial derivatives:

\mathbf{J} = \begin{bmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} & \dfrac{\partial x}{\partial \phi} \\[10px] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} & \dfrac{\partial y}{\partial \phi} \\[10px] \dfrac{\partial z}{\partial r} & \dfrac{\partial z}{\partial \theta} & \dfrac{\partial z}{\partial \phi} \end{bmatrix} = \begin{bmatrix} \cos(\theta) \cos(\phi) & -r \sin(\theta) \cos(\phi) & -r \cos(\theta) \sin(\phi) \\ \cos(\theta) \sin(\phi) & -r \sin(\theta) \sin(\phi) & r \cos(\theta) \cos(\phi) \\ \sin(\theta) & r \cos(\theta) & 0 \end{bmatrix}

The infinitesimal displacements are given by the relation \mathrm d\mathbf{X} = \mathbf{J} \, \mathrm d\mathbf{Y}, which expands to:

\begin{aligned} \mathrm dx &= \left(\cos(\theta) \cos(\phi)\right) \mathrm dr - \left(r \sin(\theta) \cos(\phi)\right) \mathrm d\theta - \left(r \cos(\theta) \sin(\phi)\right) \mathrm d\phi \\ \mathrm dy &= \left(\cos(\theta) \sin(\phi)\right) \mathrm dr - \left(r \sin(\theta) \sin(\phi)\right) \mathrm d\theta + \left(r \cos(\theta) \cos(\phi)\right) \mathrm d\phi \\ \mathrm dz &= \left(\sin(\theta)\right) \mathrm dr + \left(r \cos(\theta)\right) \mathrm d\theta \end{aligned}

To find the line element on the surface of the Earth, we impose the constraint that the radius is constant. We set r=R and \mathrm dr=0.

The expressions for the displacements simplify to:

\begin{aligned}\mathrm dx &= -R \sin(\theta) \cos(\phi) \, \mathrm d\theta - R \cos(\theta) \sin(\phi) \,\mathrm d\phi \\ \mathrm dy &= -R \sin(\theta) \sin(\phi) \, \mathrm d\theta + R \cos(\theta) \cos(\phi) \, \mathrm d\phi \\ \mathrm dz &= R \cos(\theta) \, \mathrm d\theta \end{aligned}

The square of the infinitesimal line element, \mathrm ds^2, is \mathrm dx^2 + \mathrm dy^2 + \mathrm dz^2. Performing the calculation:

\begin{aligned} \mathrm ds^2 &= \left( -R \sin(\theta) \cos(\phi) \, \mathrm d\theta - R \cos(\theta) \sin(\phi) \, \mathrm d\phi \right)^2 + \left( -R \sin(\theta) \sin(\phi) \, \mathrm d\theta + R \cos(\theta) \cos(\phi) \, \mathrm d\phi \right)^2 + \left( R \cos(\theta) \, \mathrm d\theta \right)^2 \\ &= R^2 \mathrm d\theta^2 (\sin^2 (\theta)\cos^2 (\phi) + \sin^2 (\theta)\sin^2\phi + \cos^2 (\theta)) + R^2 \mathrm d\phi^2(\cos^2 (\theta)\sin^2\phi + \cos^2 (\theta)\cos^2 (\phi)) \\ &= R^2 \mathrm d\theta^2 (\sin^2 (\theta) + \cos^2 (\theta)) + R^2 \mathrm d\phi^2(\cos^2 (\theta)) \\ &= R^2 \mathrm \, d\theta^2 + R^2 \cos^2(\theta) \, \mathrm d\phi^2 \end{aligned}

The final expression for the line element is:

\mathrm ds^2 = R^2 \, \mathrm d\theta^2 + R^2 \cos^2(\theta) \, \mathrm d\phi^2

Metric tensor

The metric tensor g_{mn} is defined by the line element \mathrm ds^2 according to the formula:

\mathrm ds^2 = g_{mn} \mathrm dY^m \mathrm dY^n

where the coordinates are Y^1 = \theta and Y^2 = \phi.

By comparing the formula to the given line element:

\mathrm ds^2 = (R^2) \,\mathrm d\theta^2 + (R^2 \cos^2(\theta)) \,\mathrm d\phi^2

we can identify the components of the metric tensor by matching the coefficients.

The coefficient of \mathrm d\theta^2 is g_{11}:

g_{11} = R^2

The coefficient of \mathrm d\phi^2 is g_{22}:

g_{22} = R^2 \cos^2(\theta)

The coefficient of the cross-term \mathrm d\theta \mathrm d\phi is g_{12} + g_{21}. Since this term is absent, the off-diagonal components are zero:

g_{12} = g_{21} = 0

Therefore, the metric tensor g_{mn} in matrix form is:

g_{mn} = \begin{bmatrix} R^2 & 0 \\ 0 & R^2 \cos^2(\theta) \end{bmatrix}
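As an independent check (my own, not part of the book's derivation), the metric components are the dot products of the tangent vectors of the embedding, g_{mn} = \frac{\partial \mathbf X}{\partial Y^m} \cdot \frac{\partial \mathbf X}{\partial Y^n}, which can be evaluated numerically; the radius R = 2 below is an arbitrary test value, since the R^2 scale factors out:

```python
import math

# Independent numeric check (my own): the metric components are the dot
# products of the tangent vectors of the embedding,
#   g_mn = (dX/dY^m) . (dX/dY^n),  Y = (theta, phi), theta = latitude.

R = 2.0   # arbitrary test radius; the R^2 scale is divided out below

def X(theta, phi):
    return [R * math.cos(theta) * math.cos(phi),
            R * math.cos(theta) * math.sin(phi),
            R * math.sin(theta)]

def tangent(theta, phi, k, h=1e-7):
    """dX/dY^k by central differences."""
    p = [theta, phi]; p[k] += h
    m = [theta, phi]; m[k] -= h
    Xp, Xm = X(*p), X(*m)
    return [(Xp[i] - Xm[i]) / (2 * h) for i in range(3)]

def g(theta, phi, m, n):
    tm, tn = tangent(theta, phi, m), tangent(theta, phi, n)
    return sum(a * b for a, b in zip(tm, tn))

th, ph = 0.8, 1.3
print(round(g(th, ph, 0, 0) / R**2, 6))                      # ≈ 1 (g_11 = R^2)
print(round(g(th, ph, 1, 1) / (R**2 * math.cos(th)**2), 6))  # ≈ 1 (g_22 = R^2 cos^2)
print(round(g(th, ph, 0, 1) / R**2, 6))                      # ≈ 0 (g_12 = 0)
```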

Inverse metric tensor

The inverse metric tensor g^{mn} is defined as the matrix inverse of the metric tensor g_{mn}. It satisfies the relation:

g^{mk} g_{kn} = {\delta^m}_n

We start with the computed metric tensor:

g_{mn} = \begin{bmatrix} R^2 & 0 \\ 0 & R^2 \cos^2(\theta) \end{bmatrix}

Since this is a diagonal matrix, its inverse is found by taking the reciprocal of each element on the main diagonal:

g^{mn} = \begin{bmatrix} \dfrac{1}{R^2} & 0 \\ 0 & \dfrac{1}{R^2 \cos^2(\theta)} \end{bmatrix}

Christoffel symbols

The Christoffel symbols are given by the formula:

{\Gamma^t}_{mn} = \frac{1}{2} g^{rt} \left[ \partial_n g_{rm} + \partial_m g_{rn} - \partial_r g_{mn} \right]

Here, the indices t, m, n, r range over our coordinates, \{\theta, \phi\}.

\Gamma^1_{22}

t=1, m=2, n=2. The summation index is r:

{\Gamma^1}_{22} = \frac{1}{2} g^{r1} \left[ \partial_\phi g_{r2} + \partial_\phi g_{r2} - \partial_r g_{22} \right]

Since g^{r1} is non-zero only for r=1, the sum collapses to a single term:

\begin{aligned} {\Gamma^1}_{22} & = \frac{1}{2} g^{11} \left[ \partial_\phi g_{12} + \partial_\phi g_{12} - \partial_\theta g_{22} \right] \\ & = \frac{1}{2} g^{11} \left[ - \partial_\theta g_{22} \right] = \frac{1}{2} \left( \frac{1}{R^2} \right) \left( -(-2R^2 \sin(\theta) \cos(\theta)) \right) \\ & = \sin(\theta) \cos(\theta) \end{aligned}

as g_{12} = \partial_\phi g_{12} = 0.

\Gamma^2_{12}

t=2, m=1, n=2. The summation index is r:

{\Gamma^2}_{12} = \frac{1}{2} g^{r2} \left[ \partial_\phi g_{r1} + \partial_\theta g_{r2} - \partial_r g_{12} \right]

Since g^{r2} is non-zero only for r=2, the sum collapses to a single term:

\begin{aligned} {\Gamma^2}_{12} & = \frac{1}{2} g^{22} \left[ \partial_\phi g_{21} + \partial_\theta g_{22} - \partial_\phi g_{12} \right] \\ & = \frac{1}{2} g^{22} \left[ \partial_\theta g_{22} \right] \\ & = \frac{1}{2} \left( \frac{1}{R^2 \cos^2(\theta)} \right) \left( -2R^2 \sin(\theta) \cos(\theta) \right) \\ & = -\frac{\sin(\theta)}{\cos(\theta)} = -\tan\theta \end{aligned}

Here g_{21} = g_{12} = 0 everywhere on the sphere, so \partial_\phi g_{21} = \partial_\phi g_{12} = 0 as well. By the symmetry of the lower indices in the formula, {\Gamma^2}_{21} = {\Gamma^2}_{12}.

Summary

The non-zero Christoffel symbols at a point (\theta, \phi) on the sphere are:

\begin{aligned} & {\Gamma^1}_{22} = \sin(\theta) \cos(\theta) \\ & {\Gamma^2}_{12} = {\Gamma^2}_{21} = -\tan\theta \end{aligned}

All other symbols are zero:

{\Gamma^1}_{11} = {\Gamma^1}_{12} = {\Gamma^1}_{21} = {\Gamma^2}_{11} = {\Gamma^2}_{22} = 0
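These closed-form results can be double-checked numerically. The sketch below (my own, not the book's) feeds the sphere metric into the general Christoffel formula with finite-difference derivatives and compares against the answers above:

```python
import math

# Numeric double check (my own sketch): feed g = diag(R^2, R^2 cos^2 theta)
# into the Christoffel formula with finite-difference derivatives and
# compare with the closed-form symbols. Index 0 = theta, 1 = phi.

R = 2.0   # arbitrary test radius; the symbols are independent of R

def metric(theta):
    return [[R * R, 0.0], [0.0, R * R * math.cos(theta) ** 2]]

def d_metric(theta, k, h=1e-6):
    """d g_mn / d Y^k; the metric depends only on theta (k = 0)."""
    if k == 1:
        return [[0.0, 0.0], [0.0, 0.0]]
    gp, gm = metric(theta + h), metric(theta - h)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(theta):
    g = metric(theta)
    ginv = [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]
    dg = [d_metric(theta, k) for k in range(2)]
    gam = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for t in range(2):
        for m in range(2):
            for n in range(2):
                gam[t][m][n] = 0.5 * sum(
                    ginv[r][t] * (dg[n][r][m] + dg[m][r][n] - dg[r][m][n])
                    for r in range(2))
    return gam

th = 0.8
gam = christoffel(th)
print(round(gam[0][1][1] - math.sin(th) * math.cos(th), 6))  # ≈ 0
print(round(gam[1][0][1] + math.tan(th), 6))                 # ≈ 0
```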

Exercise page 116

Let’s replace the covariant derivative of V_n, with respect to r, by its expression given in equation (12). We get

D_s D_r V_n = D_s \left[ \partial_r V_n - {\Gamma^t}_{rn} V_t \right]

Notice that \left[ \partial_r V_n - {\Gamma^t}_{rn} V_t \right] is a tensor. We know how to differentiate it: use equation (13). Continue to crank mechanically the calculations.

Equation (13) gives the formula for the covariant derivative of a rank-two tensor:

D_r T_{mn} = \partial_r T_{mn} - {\Gamma^t}_{rm} T_{tn} - {\Gamma^t}_{rn} T_{mt}

To align with the required expression, we rename r to s, m to r, and t to p:

D_s T_{rn} = \partial_s T_{rn} - {\Gamma^p}_{sr} T_{pn} - {\Gamma^p}_{sn} T_{rp}

Let’s apply this formula to \left[ \partial_r V_n - {\Gamma^t}_{rn} V_t \right]:

\begin{aligned} D_s D_r V_n = & D_s \left[ \partial_r V_n - {\Gamma^t}_{rn} V_t \right] \\ = & \partial_s \left[ \partial_r V_n - {\Gamma^t}_{rn} V_t \right] - {\Gamma^p}_{sr} \left[ \partial_p V_n - {\Gamma^t}_{pn} V_t \right] - {\Gamma^p}_{sn} \left[ \partial_r V_p - {\Gamma^t}_{rp} V_t \right] \\ = & \partial_s \partial_r V_n - \left(\partial_s {\Gamma^t}_{rn} \right) V_t - {\Gamma^t}_{rn} \partial_s V_t - {\Gamma^p}_{sr} \left[ \partial_p V_n - {\Gamma^t}_{pn} V_t \right] - {\Gamma^p}_{sn} \left[ \partial_r V_p - {\Gamma^t}_{rp} V_t \right] \end{aligned}

We now repeat the computation in the opposite order, for D_s V_n = \left[ \partial_s V_n - {\Gamma^t}_{sn} V_t \right].

Starting again from equation (13), we rename m to s and t to p:

D_r T_{sn} = \partial_r T_{sn} - {\Gamma^p}_{rs} T_{pn} - {\Gamma^p}_{rn} T_{sp}

Applying this formula:

\begin{aligned} D_r D_s V_n = & D_r \left[ \partial_s V_n - {\Gamma^t}_{sn} V_t \right] \\ = & \partial_r \left[ \partial_s V_n - {\Gamma^t}_{sn} V_t \right] - {\Gamma^p}_{rs} \left[ \partial_p V_n - {\Gamma^t}_{pn} V_t \right] - {\Gamma^p}_{rn} \left[ \partial_s V_p - {\Gamma^t}_{sp} V_t \right] \\ = & \partial_r \partial_s V_n - \left(\partial_r{\Gamma^t}_{sn}\right)V_t - {\Gamma^t}_{sn}\partial_r V_t - {\Gamma^p}_{rs} \left[ \partial_p V_n - {\Gamma^t}_{pn} V_t \right] - {\Gamma^p}_{rn} \left[ \partial_s V_p - {\Gamma^t}_{sp} V_t \right] \end{aligned}

We can now compute the commutator:

\begin{aligned} [D_s, D_r]V_n = & D_s D_r V_n - D_r D_s V_n \\ = &D_s \left[ \partial_r V_n - {\Gamma^t}_{rn} V_t \right] - \left\{ D_r \left[ \partial_s V_n - {\Gamma^t}_{sn} V_t \right] \right\}\\ = & \left\{ \partial_s \partial_r V_n - \left(\partial_s {\Gamma^t}_{rn} \right) V_t - {\Gamma^t}_{rn} \partial_s V_t - {\Gamma^p}_{sr} \left[ \partial_p V_n - {\Gamma^t}_{pn} V_t \right] - {\Gamma^p}_{sn} \left[ \partial_r V_p - {\Gamma^t}_{rp} V_t \right] \right\} \\ & - \left\{ \partial_r \partial_s V_n - \left(\partial_r{\Gamma^t}_{sn}\right)V_t - {\Gamma^t}_{sn}\partial_r V_t - {\Gamma^p}_{rs} \left[ \partial_p V_n - {\Gamma^t}_{pn} V_t \right] - {\Gamma^p}_{rn} \left[ \partial_s V_p - {\Gamma^t}_{sp} V_t \right]\right\} \\ = & \left\{ \boxed{\partial_s \partial_r V_n} - \left(\partial_s {\Gamma^t}_{rn} \right) V_t - {\Gamma^t}_{rn} \partial_s V_t - {\Gamma^p}_{sr} \left[ \partial_p V_n - {\Gamma^t}_{pn} V_t \right] - {\Gamma^p}_{sn} \left[ \partial_r V_p - {\Gamma^t}_{rp} V_t \right] \right\} \\ & - \left\{ \boxed{\partial_r \partial_s V_n} - \left(\partial_r{\Gamma^t}_{sn}\right)V_t - {\Gamma^t}_{sn}\partial_r V_t - {\Gamma^p}_{rs} \left[ \partial_p V_n - {\Gamma^t}_{pn} V_t \right] - {\Gamma^p}_{rn} \left[ \partial_s V_p - {\Gamma^t}_{sp} V_t \right]\right\} \\ = & \left\{ - \left(\partial_s {\Gamma^t}_{rn} \right) V_t - {\Gamma^t}_{rn} \partial_s V_t \boxed{- {\Gamma^p}_{sr} \left[ \partial_p V_n - {\Gamma^t}_{pn} V_t \right]} - {\Gamma^p}_{sn} \left[ \partial_r V_p - {\Gamma^t}_{rp} V_t \right] \right\} \\ & - \left\{ - \left(\partial_r{\Gamma^t}_{sn}\right)V_t - {\Gamma^t}_{sn}\partial_r V_t \boxed{- {\Gamma^p}_{rs} \left[ \partial_p V_n - {\Gamma^t}_{pn} V_t \right] } - {\Gamma^p}_{rn} \left[ \partial_s V_p - {\Gamma^t}_{sp} V_t \right]\right\} \\ = & \left\{ - \left(\partial_s {\Gamma^t}_{rn} \right) V_t - {\Gamma^t}_{rn} \partial_s V_t - {\Gamma^p}_{sn} \partial_r V_p + {\Gamma^p}_{sn}{\Gamma^t}_{rp} V_t \right\} \\ & - \left\{ - \left(\partial_r{\Gamma^t}_{sn}\right)V_t - {\Gamma^t}_{sn}\partial_r V_t - {\Gamma^p}_{rn} \partial_s V_p + {\Gamma^p}_{rn} {\Gamma^t}_{sp} V_t \right\} \\ = & \left\{ \left(\partial_r{\Gamma^t}_{sn}\right)V_t - \left(\partial_s {\Gamma^t}_{rn} \right) V_t + {\Gamma^p}_{sn}{\Gamma^t}_{rp} V_t - {\Gamma^p}_{rn} {\Gamma^t}_{sp} V_t \right\} \\ & + \left\{ - {\Gamma^t}_{rn} \partial_s V_t - {\Gamma^p}_{sn} \partial_r V_p + {\Gamma^t}_{sn}\partial_r V_t + {\Gamma^p}_{rn} \partial_s V_p \right\} \end{aligned}

In the last curly brackets the indices are dummy and therefore we can rename p to t:

- {\Gamma^t}_{rn} \partial_s V_t - {\Gamma^p}_{sn} \partial_r V_p + {\Gamma^t}_{sn}\partial_r V_t + {\Gamma^p}_{rn} \partial_s V_p = - {\Gamma^t}_{rn} \partial_s V_t - {\Gamma^t}_{sn} \partial_r V_t + {\Gamma^t}_{sn}\partial_r V_t + {\Gamma^t}_{rn} \partial_s V_t = 0

which leaves, factoring V_t and using the symmetry of \Gamma:

[D_s, D_r]V_n = \left[\partial_r{\Gamma^t}_{sn} - \partial_s {\Gamma^t}_{rn} + {\Gamma^p}_{sn}{\Gamma^t}_{pr} - {\Gamma^p}_{rn} {\Gamma^t}_{ps}\right]V_t

The tensor:

{R^t}_{srn} \equiv \partial_r{\Gamma^t}_{sn} - \partial_s {\Gamma^t}_{rn} + {\Gamma^p}_{sn}{\Gamma^t}_{pr} - {\Gamma^p}_{rn} {\Gamma^t}_{ps}

is the curvature tensor.
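As a closing numeric illustration (my own, not the book's), we can finite-difference the closed-form Christoffel symbols in this expression. For the sphere of Exercise 3.3 (index 0 = \theta, 1 = \phi) one finds {R^0}_{101} = \cos^2\theta, while for the flat plane in polar coordinates (index 0 = r, 1 = \theta) every component vanishes, tying together Exercise 3.1 and this result:

```python
import math

# Numeric illustration (my own): finite-difference the closed-form
# Christoffel symbols in
#   R^t_srn = d_r Gam^t_sn - d_s Gam^t_rn + Gam^p_sn Gam^t_pr - Gam^p_rn Gam^t_ps.
# Sphere (0 = theta, 1 = phi): R^0_101 = cos^2(theta).
# Flat plane in polar coordinates (0 = r, 1 = theta): all components vanish.

def sphere_gamma(y):
    th = y[0]
    gam = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    gam[0][1][1] = math.sin(th) * math.cos(th)
    gam[1][0][1] = gam[1][1][0] = -math.tan(th)
    return gam

def polar_gamma(y):
    r = y[0]
    gam = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    gam[0][1][1] = -r
    gam[1][0][1] = gam[1][1][0] = 1.0 / r
    return gam

def riemann(gamma, y, t, s, r, n, h=1e-6):
    def d(k, a, b, c):          # d Gam^a_bc / d Y^k by central differences
        yp = list(y); yp[k] += h
        ym = list(y); ym[k] -= h
        return (gamma(yp)[a][b][c] - gamma(ym)[a][b][c]) / (2 * h)
    gam = gamma(y)
    return (d(r, t, s, n) - d(s, t, r, n)
            + sum(gam[p][s][n] * gam[t][p][r] - gam[p][r][n] * gam[t][p][s]
                  for p in range(2)))

print(round(riemann(sphere_gamma, [0.8, 1.3], 0, 1, 0, 1)
            - math.cos(0.8) ** 2, 5))                      # ≈ 0
flat = max(abs(riemann(polar_gamma, [2.0, 1.3], t, s, r, n))
           for t in range(2) for s in range(2)
           for r in range(2) for n in range(2))
print(flat < 1e-8)                                         # True
```

The sphere has genuinely non-zero curvature, while the polar-coordinate Christoffel symbols of the plane conspire to cancel exactly, as they must for a flat space.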
