Given the metric g_{\mu\nu}(X), show that the Euler-Lagrange equation (16) (we drop the “s”), to minimize the action along a trajectory in space-time,
\frac{d}{dt} \frac{\partial \mathcal L}{\partial \dot X^m} = \frac{\partial \mathcal L}{\partial X^m}
where the Lagrangian \mathcal L is
\mathcal L = -m \sqrt{- g_{\mu\nu}(X) \frac{dX^\mu}{dt} \frac{dX^\nu}{dt}}
is equivalent to the definition of a geodesic given by equation (6), which says that the tangent vector to the trajectory in space-time stays constant:
\frac{d^2 X^\mu}{d\tau^2} = - {\Gamma^\mu}_{\sigma\rho}\frac{\mathrm dX^\sigma}{\mathrm d\tau}\frac{dX^\rho}{d\tau}
The Lagrangian is given (we simplify writing g_{\mu\nu}(X) as g_{\mu\nu}):
\mathcal L = -m \sqrt{- g_{\mu\nu} \frac{\mathrm dX^\mu}{\mathrm dt} \frac{\mathrm dX^\nu}{\mathrm dt}} \equiv -m \sqrt{F}
The partial derivative with respect to velocity \dot{X}^m is:
\begin{aligned} \frac{\partial \mathcal L}{\partial \dot X^m} &=\frac{\partial}{\partial \dot X^m}\left(-m\sqrt F \right) = \left(\frac{-m}{2 \sqrt F}\right)\frac{\partial F}{\partial \dot X^m} \\ & = -\frac{m}{2 \sqrt F}\frac{\partial }{\partial \dot X^m} \left(- g_{\mu\nu} \frac{\mathrm dX^\mu}{\mathrm dt} \frac{\mathrm dX^\nu}{\mathrm dt}\right) \\ & = -\frac{m}{2 \sqrt F}\frac{\partial }{\partial \dot X^m} \left(- g_{\mu\nu} \dot X^\mu \dot X^\nu\right) \\ & = -\frac{m}{2 \sqrt F} (- g_{\mu\nu}) \left(\frac{\partial \dot X^\mu}{\partial \dot X^m} \dot X^\nu + \frac{\partial \dot X^\nu}{\partial \dot X^m} \dot X^\mu \right) \\ & = \frac{m}{2 \sqrt F}g_{\mu\nu} \left({\delta^\mu}_m \dot X^\nu + {\delta^\nu}_m \dot X^\mu \right) \\ & = \frac{m}{2 \sqrt F}\left(g_{m\nu} \dot X^\nu + g_{\mu m} \dot X^\mu \right) \\ & = \frac{m}{2 \sqrt F}\left(2 g_{m\nu} \dot X^\nu \right) \\ & = \frac{m g_{m\nu}}{\sqrt F} \frac{\mathrm dX^\nu}{\mathrm dt} \end{aligned}
From:
\mathrm d\tau^2 = - g_{\mu\nu} \mathrm dX^\mu \mathrm dX^\nu
We have:
\frac{\mathrm d\tau}{\mathrm dt} = \sqrt {\frac{\mathrm d\tau^2}{\mathrm dt^2}} = \sqrt{- g_{\mu\nu} \frac{\mathrm dX^\mu}{\mathrm dt} \frac{\mathrm dX^\nu}{\mathrm dt}} = \sqrt F
Substituting:
\frac{\partial \mathcal L}{\partial \dot X^m} = \frac{m g_{m\nu}}{\sqrt F} \frac{\mathrm dX^\nu}{\mathrm dt} = m g_{m\nu}\frac{\frac{\mathrm dX^\nu}{\mathrm dt}}{\frac{\mathrm d\tau}{\mathrm dt}} = m g_{m\nu}\frac{\mathrm dX^\nu}{\mathrm d\tau}
Now we can compute the full left-hand side of the Euler-Lagrange equation by taking the total derivative with respect to t:
\begin{aligned} \frac{\mathrm d}{\mathrm dt} \left( \frac{\partial \mathcal L}{\partial \dot X^m} \right) &= m \frac{\mathrm d}{\mathrm dt} \left( g_{m\nu} \frac{\mathrm dX^\nu}{\mathrm d\tau} \right) \\ &= m \left[ \left( \frac{\mathrm d g_{m\nu}}{\mathrm dt} \right) \frac{\mathrm dX^\nu}{\mathrm d\tau} + g_{m\nu} \frac{\mathrm d}{\mathrm dt} \left( \frac{\mathrm dX^\nu}{\mathrm d\tau} \right) \right] \end{aligned}
We can expand these total derivatives using the chain rule, noting that g_{m\nu} depends on position X^\sigma, which depends on t:
\frac{\mathrm d}{\mathrm dt} \left( \frac{\partial \mathcal L}{\partial \dot X^m} \right) = m \left(\partial_\sigma g_{m\nu} \frac{\mathrm dX^\sigma}{\mathrm dt} \frac{\mathrm dX^\nu}{\mathrm d\tau} + g_{m\nu} \frac{\mathrm d^2 X^\nu}{\mathrm d\tau^2} \frac{\mathrm d\tau}{\mathrm dt} \right)
Next, we compute the right-hand side, \frac{\partial \mathcal L}{\partial X^m}. The dependence on X^m is entirely within the metric tensor:
\begin{aligned} \frac{\partial \mathcal L}{\partial X^m} &= \frac{\partial}{\partial X^m}\left(-m\sqrt F \right) = \left(\frac{-m}{2 \sqrt F}\right)\frac{\partial F}{\partial X^m}\\ & = \left(\frac{-m}{2 \sqrt F}\right)\left(-\partial_m g_{\mu\nu} \dot{X}^\mu \dot{X}^\nu\right) \\ & = \left(\frac{m}{2 \sqrt F}\right)\partial_m g_{\mu\nu} \dot{X}^\mu \dot{X}^\nu \end{aligned}
Again, we substitute \sqrt F = \frac{\mathrm d\tau}{\mathrm dt} for the denominator and also re-express the velocities \dot{X}^\mu in terms of proper time (\dot{X}^\mu = \frac{\mathrm dX^\mu}{\mathrm d\tau} \frac{\mathrm d\tau}{\mathrm dt}):
\begin{aligned} \frac{\partial \mathcal L}{\partial X^m} &= \frac{m}{2 \frac{\mathrm d\tau}{\mathrm dt}} \partial_m g_{\mu\nu} \left( \frac{\mathrm dX^\mu}{\mathrm d\tau} \frac{\mathrm d\tau}{\mathrm dt} \right) \left( \frac{\mathrm dX^\nu}{\mathrm d\tau} \frac{\mathrm d\tau}{\mathrm dt} \right) \\ &= \frac{m}{2} \partial_m g_{\mu\nu} \frac{\mathrm dX^\mu}{\mathrm d\tau} \frac{\mathrm dX^\nu}{\mathrm d\tau} \frac{\mathrm d\tau}{\mathrm dt} \end{aligned}
Now we equate the left and right sides of the Euler-Lagrange equation and cancel the common factor of m:
\partial_\sigma g_{m\nu} \frac{\mathrm dX^\sigma}{\mathrm dt} \frac{\mathrm dX^\nu}{\mathrm d\tau} + g_{m\nu} \frac{\mathrm d^2 X^\nu}{\mathrm d\tau^2} \frac{\mathrm d\tau}{\mathrm dt} = \frac{1}{2} \partial_m g_{\mu\nu} \frac{\mathrm dX^\mu}{\mathrm d\tau} \frac{\mathrm dX^\nu}{\mathrm d\tau} \frac{\mathrm d\tau}{\mathrm dt}
To simplify, we multiply the entire equation by \frac{\mathrm dt}{\mathrm d\tau} and use \frac{\mathrm dX^\sigma}{\mathrm dt} = \frac{\mathrm dX^\sigma}{\mathrm d\tau} \frac{\mathrm d\tau}{\mathrm dt}.
\partial_\sigma g_{m\nu} \frac{\mathrm dX^\sigma}{\mathrm d\tau} \frac{\mathrm dX^\nu}{\mathrm d\tau} + g_{m\nu} \frac{\mathrm d^2 X^\nu}{\mathrm d\tau^2} = \frac{1}{2} \partial_m g_{\mu\nu} \frac{\mathrm dX^\mu}{\mathrm d\tau} \frac{\mathrm dX^\nu}{\mathrm d\tau}
Isolating the second derivative term gives:
g_{m\nu} \frac{\mathrm d^2 X^\nu}{\mathrm d\tau^2} = \frac{1}{2} \partial_m g_{\mu\nu} \frac{\mathrm dX^\mu}{\mathrm d\tau} \frac{\mathrm dX^\nu}{\mathrm d\tau} - \partial_\sigma g_{m\nu} \frac{\mathrm dX^\sigma}{\mathrm d\tau} \frac{\mathrm dX^\nu}{\mathrm d\tau}
We can manipulate the indices:
\begin{aligned} g_{m\nu} \frac{\mathrm d^2 X^\nu}{\mathrm d\tau^2} & = \frac{1}{2} \partial_m g_{\mu\nu} \frac{\mathrm dX^\mu}{\mathrm d\tau} \frac{\mathrm dX^\nu}{\mathrm d\tau} - \partial_\sigma g_{m\nu} \frac{\mathrm dX^\sigma}{\mathrm d\tau} \frac{\mathrm dX^\nu}{\mathrm d\tau}\\ & = \frac{1}{2} \partial_m g_{\sigma\nu} \frac{\mathrm dX^\sigma}{\mathrm d\tau} \frac{\mathrm dX^\nu}{\mathrm d\tau} - \partial_\sigma g_{m\rho} \frac{\mathrm dX^\sigma}{\mathrm d\tau} \frac{\mathrm dX^\rho}{\mathrm d\tau}\\ & = \frac{1}{2} \partial_m g_{\sigma\rho} \frac{\mathrm dX^\sigma}{\mathrm d\tau} \frac{\mathrm dX^\rho}{\mathrm d\tau} - \partial_\sigma g_{m\rho} \frac{\mathrm dX^\sigma}{\mathrm d\tau} \frac{\mathrm dX^\rho}{\mathrm d\tau}\\ & = \left( \frac{1}{2} \partial_m g_{\sigma\rho} - \partial_\sigma g_{m\rho} \right) \frac{\mathrm dX^\sigma}{\mathrm d\tau} \frac{\mathrm dX^\rho}{\mathrm d\tau} \\ &= \frac{1}{2} \left( \partial_m g_{\sigma\rho} - \partial_\sigma g_{m\rho} - \partial_\sigma g_{m\rho} \right) \frac{\mathrm dX^\sigma}{\mathrm d\tau} \frac{\mathrm dX^\rho}{\mathrm d\tau} \end{aligned}
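We have split -\partial_\sigma g_{m\rho} into two equal halves, -\partial_\sigma g_{m\rho} = \frac{1}{2}\left(- \partial_\sigma g_{m\rho} - \partial_\sigma g_{m\rho} \right). Since \sigma and \rho are dummy indices contracted with the symmetric product of the two velocities, we can swap them in one of the two halves: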
\partial_\sigma g_{m\rho} \frac{\mathrm dX^\sigma}{\mathrm d\tau} \frac{\mathrm dX^\rho}{\mathrm d\tau} = \partial_a g_{m b} \frac{\mathrm dX^a}{\mathrm d\tau} \frac{\mathrm dX^b}{\mathrm d\tau} = \partial_\rho g_{m \sigma} \frac{\mathrm dX^\rho}{\mathrm d\tau} \frac{\mathrm dX^\sigma}{\mathrm d\tau} = \partial_\rho g_{m \sigma} \frac{\mathrm dX^\sigma}{\mathrm d\tau} \frac{\mathrm dX^\rho}{\mathrm d\tau}
So:
g_{m\nu} \frac{\mathrm d^2 X^\nu}{\mathrm d\tau^2} = \frac{1}{2} \left(\partial_m g_{\sigma\rho} - \partial_\sigma g_{m\rho} -\partial_\rho g_{m\sigma} \right) \frac{\mathrm dX^\sigma}{\mathrm d\tau} \frac{\mathrm dX^\rho}{\mathrm d\tau}
Finally, we multiply by the inverse metric g^{\mu m}:
\begin{aligned} & g^{\mu m} g_{m\nu} \frac{\mathrm d^2 X^\nu}{\mathrm d\tau^2} = {\delta^\mu}_\nu\frac{\mathrm d^2 X^\nu}{\mathrm d\tau^2} = \frac{\mathrm d^2 X^\mu}{\mathrm d\tau^2} \\ & = \frac{1}{2} g^{\mu m} \left(\partial_m g_{\sigma\rho} - \partial_\sigma g_{m\rho} -\partial_\rho g_{m\sigma} \right) \frac{\mathrm dX^\sigma}{\mathrm d\tau} \frac{\mathrm dX^\rho}{\mathrm d\tau} \end{aligned}
The term on the right involving the metric derivatives is precisely the definition of the Christoffel symbol {\Gamma^\mu}_{\sigma\rho}, with a negative sign:
{\Gamma^\mu}_{\sigma\rho} = \frac{1}{2} g^{\mu m} \left(\partial_\sigma g_{m\rho} + \partial_\rho g_{m\sigma} - \partial_m g_{\sigma\rho} \right)
This leaves us with the final expression:
\frac{\mathrm d^2 X^\mu}{\mathrm d\tau^2} = - {\Gamma^\mu}_{\sigma\rho}\frac{\mathrm dX^\sigma}{\mathrm d\tau}\frac{\mathrm dX^\rho}{\mathrm d\tau}
This result confirms that the path of stationary action for a free particle in curved spacetime is a geodesic.
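The Christoffel formula used in the last step can be sanity-checked numerically. The sketch below is an illustration only: it assumes the flat plane in polar coordinates (a metric that does not appear in the text, chosen because its symbols are known in closed form, \Gamma^r_{\phi\phi} = -r and \Gamma^\phi_{r\phi} = 1/r), and the helper names `metric`, `d_metric` and `christoffel` are hypothetical.

```python
def metric(x):
    """Assumed example metric: flat plane in polar coordinates x = (r, phi)."""
    r, _ = x
    return [[1.0, 0.0], [0.0, r * r]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def d_metric(x, s, h=1e-5):
    """Central finite difference of every metric component along coordinate s."""
    x_plus = list(x)
    x_minus = list(x)
    x_plus[s] += h
    x_minus[s] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    n = len(x)
    return [[(g_plus[a][b] - g_minus[a][b]) / (2 * h) for b in range(n)]
            for a in range(n)]


def christoffel(x):
    """gamma[mu][sigma][rho] = 1/2 g^{mu m} (d_sigma g_{m rho} + d_rho g_{m sigma} - d_m g_{sigma rho})."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [d_metric(x, s) for s in range(n)]  # dg[s][a][b] = d_s g_{ab}
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for mu in range(n):
        for sigma in range(n):
            for rho in range(n):
                gamma[mu][sigma][rho] = 0.5 * sum(
                    g_inv[mu][m] * (dg[sigma][m][rho] + dg[rho][m][sigma] - dg[m][sigma][rho])
                    for m in range(n)
                )
    return gamma


# Index 0 is r, index 1 is phi; evaluate at r = 2.
gamma = christoffel([2.0, 0.3])
print(gamma[0][1][1], gamma[1][0][1])
```

At r = 2 the two printed values should be close to -2 and 0.5, up to finite-difference error. The same function works for any 2x2 metric passed through `metric`; a larger metric would need a general matrix inverse.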
Finally, when we use this Lagrangian, the Euler-Lagrange equations will of course simply produce Newton’s equation for a particle in a gravitational field U(X), just as it did when we carried out exactly the same calculations in volume 1.
We arrive at
m \ddot X = -m \frac{\partial U}{\partial X}
The approximate action is:
A = \int \left(-mc^2 - mU(X) + \frac{m}{2}\dot X^2\right) \mathrm dt
We recognize the Lagrangian as kinetic energy minus potential energy, plus the constant rest-energy term, which does not contribute to the equations of motion:
\mathcal L = T - V, \qquad T = \frac{m}{2}\dot X^2, \qquad V = m U(X)
We can apply the Euler-Lagrange equation:
\frac{\mathrm d}{\mathrm dt} \left(\frac{\partial \mathcal L}{\partial \dot X^m}\right) = \frac{\partial \mathcal L}{\partial X^m}
For the left-hand side, only one term depends on \dot X:
\frac{\mathrm d}{\mathrm dt} \left(\frac{\partial \mathcal L}{\partial \dot X^m}\right) = \frac{\mathrm d}{\mathrm dt} \left(m \dot X\right) = m \ddot X
For the right-hand side, only one term depends on X:
\frac{\partial \mathcal L}{\partial X^m} = -m \frac{\partial U(X)}{\partial X}
Putting it all together, we obtain the equation of motion:
m \ddot X = -m \frac{\partial U(X)}{\partial X}
This is the standard Newton equation for a particle in a gravitational field U(X).
Since the mass appears on both sides, it cancels out:
\ddot X = -\frac{\partial U(X)}{\partial X}
so the motion does not depend on the mass of the object.
In our case it becomes simply \mathcal H = p \dot r - \mathcal L. The calculations are left to the reader. The result is
\mathcal H = \frac{m (1 - 2MG/r)}{\sqrt{(1 - 2MG/r) -\dot r/(1- 2MG/r)}} \tag{36}
[…]
Then equation (36), giving the energy, enables us to express \dot r as a function of that energy E. With some algebra we get
\dot r^2 = \left(1 - \frac{2MG}{r}\right)^2 - \frac{\left(1 - \frac{2MG}{r}\right)^3}{E^2} \tag{37}
Note: there is a typo in equation (36). \dot r should be \dot r^2.
The Lagrangian \mathcal L for this system is:
\mathcal L = -m \sqrt{\left(1 - \frac{2MG}{r}\right) - \left(\frac{1}{1 - \frac{2MG}{r}}\right) \dot r^2} \tag{35}
For ease of writing we set:
\mathcal F(r) \equiv 1 - \frac{2MG}{r}
So the Lagrangian is:
\mathcal L = -m\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2}
The Hamiltonian \mathcal H is defined in terms of the generalized coordinates q_i, their velocities \dot q_i, the conjugate momenta p_i, and the Lagrangian as:
\mathcal H = \sum_i p_i \dot q_i - \mathcal L
In our case the only generalized coordinate is r, with velocity \dot r, so we have:
\mathcal H = p \dot r - \mathcal L
To compute it, the first step is to calculate the generalized conjugate momentum of r:
p = \frac{\partial \mathcal L}{\partial \dot r}
Using the definition of \mathcal L:
\begin{aligned} p & = \frac{\partial \mathcal L}{\partial \dot r} \\ & = \frac{\partial }{\partial \dot r} \left[-m \sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2} \right] \\ & = -m\frac{1}{2}\frac{ -\mathcal F^{-1}(r)2\dot r}{\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2}} \\ & = \frac{m\dot r}{\mathcal F(r)\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2}} \end{aligned}
We can now compute the Hamiltonian:
\begin{aligned} \mathcal H & = \frac{m\dot r^2}{\mathcal F(r)\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2}} + m\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2} \\ & = \frac{m\dot r^2 + m \mathcal F(r) \left(\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2\right)}{\mathcal F(r)\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2}} \\ & = \frac{m\dot r^2 + m \mathcal F(r)^2 - m\dot r^2}{\mathcal F(r)\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2}} \\ & = \frac{m \mathcal F(r)^2}{\mathcal F(r)\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2}} \\ & = \frac{m \mathcal F(r)}{\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2}} \\ & = \frac{m \left(1 - \frac{2MG}{r}\right)}{\sqrt{\left(1 - \frac{2MG}{r}\right) - \frac{1}{\left(1 - \frac{2MG}{r}\right)} \dot r^2}} \end{aligned}
The Hamiltonian \mathcal H represents the total conserved energy of the particle. In the physics of gravitation, it is standard to work with quantities expressed per unit mass. This simplifies the equations by removing the test mass m from the dynamics. We define the specific energy E as the Hamiltonian per unit mass:
E = \frac{\mathcal H}{m}
We can now isolate \dot r^2:
\begin{aligned} & E^2 = \frac{\mathcal F(r)^2}{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2} \\ & \mathcal F(r) - \mathcal F^{-1}(r) \dot r^2 = \frac{\mathcal F(r)^2}{E^2} \\ & \mathcal F^{-1}(r) \dot r^2 = \mathcal F(r) - \frac{\mathcal F(r)^2}{E^2} \\ & \dot r^2 = \mathcal F(r)^2 - \frac{\mathcal F(r)^3}{E^2} \\ & \dot r^2 = \left(1 - \frac{2MG}{r}\right)^2 - \frac{\left(1 - \frac{2MG}{r}\right)^3}{E^2} \end{aligned}
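The closed form of \mathcal H and the inversion leading to equation (37) can be checked numerically. The sketch below assumes units with MG = 1 and m = 1 and an arbitrary sample point (r, \dot r) = (6, -0.2); the function names are hypothetical. The momentum is obtained by a finite difference of the Lagrangian, so the check does not reuse the algebra above.

```python
import math

MG = 1.0  # assumed units: MG = 1, so the horizon sits at r = 2
m = 1.0   # assumed test mass


def F(r):
    """F(r) = 1 - 2MG/r."""
    return 1.0 - 2.0 * MG / r


def lagrangian(r, rdot):
    """Radial Lagrangian, equation (35)."""
    return -m * math.sqrt(F(r) - rdot**2 / F(r))


def momentum(r, rdot, h=1e-6):
    """p = dL/d(rdot), estimated with a central finite difference."""
    return (lagrangian(r, rdot + h) - lagrangian(r, rdot - h)) / (2 * h)


def hamiltonian_numeric(r, rdot):
    """H = p * rdot - L, using the numerical momentum."""
    return momentum(r, rdot) * rdot - lagrangian(r, rdot)


def hamiltonian_closed(r, rdot):
    """Closed form derived in the text (equation (36) with rdot squared)."""
    return m * F(r) / math.sqrt(F(r) - rdot**2 / F(r))


r, rdot = 6.0, -0.2
H_num = hamiltonian_numeric(r, rdot)
H_closed = hamiltonian_closed(r, rdot)

# Equation (37): recover rdot^2 from the specific energy E = H / m.
E = H_closed / m
rdot_sq_from_37 = F(r)**2 - F(r)**3 / E**2

print(H_num, H_closed, rdot_sq_from_37, rdot**2)
```

The two values of the Hamiltonian should agree up to finite-difference error, and equation (37) should return the \dot r^2 we started from.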
Show that from equation (36) for the energy, and equation (37) for \dot r^2, it follows that
\dot r \approx \sqrt{\frac{r - 2MG}{2MG}} \quad \text{as } r \to 2MG \tag{38}
Note: there is a typo in equation (38). There should be no square root: \dot r \approx \frac{r - 2MG}{2MG}.
We Taylor-expand 1-\frac{A}{x} about x = A. Let x = A + \varepsilon with \varepsilon \to 0:
1 - \frac{A}{x} = 1 - \frac{A}{A + \varepsilon} = 1 - \frac{A}{A (1 + \frac{\varepsilon}{A})} = 1 - \frac{1}{1 + \frac{\varepsilon}{A}}
We expand 1 / (1 + z) for small z:
\frac{1}{1 + z} \approx 1 - z + z^2 - \cdots
So,
1 - \frac{1}{1 + \frac{\varepsilon}{A}} \approx 1 - [1 - \frac{\varepsilon}{A} + \left(\frac{\varepsilon}{A}\right)^2 - \cdots] = \frac{\varepsilon}{A} - \left(\frac{\varepsilon}{A}\right)^2 + \cdots
Since x = A + \varepsilon,
\varepsilon = x - A
We have:
1 - \frac{A}{x} = \frac{x - A}{A} + \mathcal O((x-A)^2)
In our case then:
1 - \frac{2MG}{r} = \frac{r-2MG}{2MG} + \mathcal O\left((r - 2MG)^2\right)
Substituting:
\begin{aligned} & \dot r^2 = \left(1 - \frac{2MG}{r}\right)^2 - \frac{\left(1 - \frac{2MG}{r}\right)^3}{E^2} \\ & \dot r^2 = \left(\frac{r-2MG}{2MG}\right)^2 - \frac{\left(\frac{r-2MG}{2MG}\right)^3}{E^2} + \mathcal O\left((r - 2MG)^3\right) \\ & \dot r^2 = \left(\frac{r-2MG}{2MG}\right)^2 + \mathcal O\left((r - 2MG)^3\right) \\ & \dot r^2 \approx \left(\frac{r-2MG}{2MG}\right)^2 \\ & \dot r \approx \frac{r-2MG}{2MG} \end{aligned}
So we can plot it in the \dot r - r plane.
The magnitude of the coordinate radial velocity approaches zero linearly in r - 2MG as the object nears the event horizon.
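A short numerical check makes the linear behaviour visible. The sketch below assumes units with MG = 1 and a specific energy E = 1 (an arbitrary choice); it compares the magnitude of \dot r from equation (37) with the linear approximation at a few distances from the horizon.

```python
MG = 1.0  # assumed units: horizon at r = 2
E = 1.0   # assumed specific energy


def rdot_exact(r):
    """Magnitude of rdot from equation (37)."""
    F = 1.0 - 2.0 * MG / r
    return (F**2 - F**3 / E**2) ** 0.5


def rdot_linear(r):
    """Near-horizon approximation (r - 2MG) / (2MG)."""
    return (r - 2.0 * MG) / (2.0 * MG)


# Ratio exact / approximate at decreasing distances eps from the horizon.
distances = (1e-1, 1e-2, 1e-3, 1e-4)
ratios = [rdot_exact(2.0 * MG + eps) / rdot_linear(2.0 * MG + eps)
          for eps in distances]

for eps, ratio in zip(distances, ratios):
    print(eps, ratio)
```

The ratio should tend to 1 as the distance from the horizon shrinks, which is what the approximation claims.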
If we calculated the curvature tensor to find out how curved the geometry is at the center, we would find that all of its components become infinite.
The exercise is analyzed in detail on a separate page.
To compare results with the standard literature, we use the formulation with \sin^2(\theta) in place of \cos^2(\theta), that is, we use the colatitude in place of the latitude used in the book.
In the book, the curvature tensor is presented as:
{R^t}_{srn} = \partial_r {\Gamma^t}_{sn} - \partial_s {\Gamma^t}_{rn} + {\Gamma^p}_{sn}{\Gamma^t}_{pr} - {\Gamma^p}_{rn}{\Gamma^t}_{ps}
Note that the book places the two derivative indices s and r first among the lower indices, while the standard literature puts them last.
The tensor should then be written (to follow the standard literature) as:
{R^t}_{nsr} = \partial_r{\Gamma^t}_{sn} - \partial_s {\Gamma^t}_{rn} + {\Gamma^p}_{sn}{\Gamma^t}_{pr} - {\Gamma^p}_{rn} {\Gamma^t}_{ps}
We take the mapping t\to\mu, n\to\nu, s\to\rho, r\to\sigma, p\to\lambda:
{R^\mu}_{ \nu\rho\sigma} = \partial_\sigma{\Gamma^\mu}_{ \nu\rho} -\partial_\rho{\Gamma^\mu}_{ \nu\sigma} +{\Gamma^\mu}_{ \lambda\sigma}{\Gamma^\lambda}_{ \nu\rho} -{\Gamma^\mu}_{ \lambda\rho}{\Gamma^\lambda}_{ \nu\sigma}
Using {\Gamma^\alpha}_{\beta\gamma}={\Gamma^\alpha}_{ \gamma\beta} and commutativity of multiplication, this is equal to:
{R^\mu}_{ \nu\rho\sigma} = -\left[ \partial_\rho{\Gamma^\mu}_{ \nu\sigma} -\partial_\sigma{\Gamma^\mu}_{ \nu\rho} +{\Gamma^\mu}_{ \lambda\rho}{\Gamma^\lambda}_{ \nu\sigma} -{\Gamma^\mu}_{ \lambda\sigma}{\Gamma^\lambda}_{ \nu\rho} \right]
In the computation, we will use the opposite convention:
{R^\mu}_{\nu\rho\sigma} =\partial_\rho{\Gamma^\mu}_{\nu\sigma}-\partial_\sigma{\Gamma^\mu}_{\nu\rho} +{\Gamma^\mu}_{\lambda\rho}{\Gamma^\lambda}_{\nu\sigma} -{\Gamma^\mu}_{\lambda\sigma}{\Gamma^\lambda}_{\nu\rho}
These conventions differ by an overall sign because the book adopts the alternative sign convention, common in mathematics texts, where the curvature is defined through:
[\nabla_\rho,\nabla_\sigma] V^\mu = -\,{R^\mu}_{\nu\rho\sigma} V^\nu
while in physics literature the more standard choice is:
[\nabla_\rho,\nabla_\sigma] V^\mu = {R^\mu}_{\nu\rho\sigma} V^\nu
Every independent component of the Riemann tensor differs by an overall minus sign between the two conventions.
In our case, there are only two coordinates, r and \phi. From equation (13), we can calculate \mathcal H.
From:
\mathcal H = \sum_i p_i \dot q_i - \mathcal L \tag{13}
And using the generalized coordinates r and \phi:
\begin{aligned} & p_r = P_r = \frac{\partial \mathcal L}{\partial \dot r} = \frac{m \mathcal F^{-1}(r) \dot r}{\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2 - r^2 \dot \phi^2}}\\ & p_\phi = L = \frac{\partial \mathcal L}{\partial \dot \phi} = \frac{mr^2 \dot \phi}{\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2 - r^2 \dot \phi^2}} \end{aligned}
We can compute the Hamiltonian:
\begin{aligned} \mathcal H & = \sum_i p_i \dot q_i - \mathcal L \\ & = P_r \dot r + L \dot \phi - \mathcal L \\ & = \frac{m \mathcal F^{-1}(r)\dot r^2}{\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2 - r^2 \dot \phi^2}} \\ & \quad + \frac{mr^2 \dot \phi^2}{\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2 - r^2 \dot \phi^2}} \\ & \quad + m \sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2 - r^2 \dot \phi^2} \\ & = \frac{m \mathcal F^{-1}(r)\dot r^2 + mr^2 \dot \phi^2 + m\mathcal F(r) - m\mathcal F^{-1}(r) \dot r^2 - mr^2 \dot \phi^2}{\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2 - r^2 \dot \phi^2}} \\ & = \frac{m\mathcal F(r)}{\sqrt{\mathcal F(r) - \mathcal F^{-1}(r) \dot r^2 - r^2 \dot \phi^2}} \end{aligned}
First of all we solve equation (16) for \dot \phi. This will give us the angular velocity as a function of the angular momentum. It is very easy and it is left to the reader. Then we plug it back into equation (15) for the energy. The result will be an expression of the energy as function of r and the angular momentum.
Equation (15) is:
E = \frac{m\mathcal F(r)}{\sqrt{\mathcal F(r) - r^2 \dot \phi^2}} \tag{15}
Equation (16) is:
L = \frac{mr^2 \dot \phi}{\sqrt{\mathcal F(r) - r^2 \dot \phi^2}} \tag{16}
First we solve equation (16) for \dot \phi:
\begin{aligned} & L^2 = \frac{m^2r^4 \dot \phi^2}{\mathcal F(r) - r^2 \dot \phi^2} \\ & L^2 \left(\mathcal F(r) - r^2 \dot \phi^2\right) = m^2r^4 \dot \phi^2 \\ & L^2 \mathcal F(r) - L^2 r^2 \dot \phi^2 - m^2r^4 \dot \phi^2 = 0 \\ & L^2 r^2 \dot \phi^2 + m^2r^4 \dot \phi^2 = L^2 \mathcal F(r) \\ & r^2 \dot \phi^2 + \frac{m^2r^4}{L^2} \dot \phi^2 = \mathcal F(r) \\ & \dot \phi^2 = \frac{\mathcal F(r)}{r^2 + \frac{m^2r^4}{L^2}} = \frac{\left(\frac{L}{m}\right)^2 \mathcal F(r)}{r^2\left(\frac{L}{m}\right)^2 + r^4} \end{aligned}
Then we can replace this back in the energy equation:
\begin{aligned} E & = \frac{m\mathcal F(r)}{\sqrt{\mathcal F(r) - r^2 \dot \phi^2}} \\ & = \frac{m\mathcal F(r)}{\sqrt{\mathcal F(r) - r^2 \frac{\left(\frac{L}{m}\right)^2 \mathcal F(r)}{r^2\left(\frac{L}{m}\right)^2 + r^4}}} \\ & = \frac{m\mathcal F(r)}{\sqrt{\mathcal F(r)\left( 1 - r^2 \frac{\left(\frac{L}{m}\right)^2}{r^2\left(\frac{L}{m}\right)^2 + r^4}\right)}} \\ & = m\frac{\sqrt{\mathcal F(r)}}{\sqrt{1 - \frac{r^2\left(\frac{L}{m}\right)^2}{r^2\left(\frac{L}{m}\right)^2 + r^4}}} \\ & = m\frac{\sqrt{\mathcal F(r)}}{\sqrt{\frac{r^2\left(\frac{L}{m}\right)^2 + r^4 - r^2\left(\frac{L}{m}\right)^2}{r^2\left(\frac{L}{m}\right)^2 + r^4}}} \\ & = m\frac{\sqrt{\mathcal F(r)}}{\sqrt{\frac{r^4}{r^2\left(\frac{L}{m}\right)^2 + r^4}}} \\ & = m\frac{\sqrt{\mathcal F(r)}\sqrt{r^2\left(\frac{L}{m}\right)^2 + r^4}}{\sqrt{r^4}} \\ & = m\frac{\sqrt{\mathcal F(r)}\sqrt{r^2\left(\frac{L}{m}\right)^2 + r^4}}{r^2} \end{aligned}
So the energy is:
E = m \frac{\sqrt{\mathcal F} \sqrt{r^2\left(\frac{L}{m}\right)^2 + r^4}}{r^2}
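The elimination of \dot\phi can be verified numerically. The sketch below assumes units with MG = 1 and m = 1 and an arbitrary sample point r = 8, L = 3; the function names are hypothetical. It computes \dot\phi from L, feeds it back into equations (15) and (16), and compares with the closed form.

```python
import math

MG = 1.0  # assumed units
m = 1.0   # assumed test mass


def F(r):
    """F(r) = 1 - 2MG/r."""
    return 1.0 - 2.0 * MG / r


def phidot_from_L(r, L):
    """Angular velocity obtained by solving equation (16) for phidot."""
    ell = L / m
    return math.sqrt(ell**2 * F(r) / (r**2 * ell**2 + r**4))


def energy_15(r, phidot):
    """Equation (15): energy as a function of r and phidot (with rdot = 0)."""
    return m * F(r) / math.sqrt(F(r) - r**2 * phidot**2)


def angular_momentum_16(r, phidot):
    """Equation (16): angular momentum as a function of r and phidot."""
    return m * r**2 * phidot / math.sqrt(F(r) - r**2 * phidot**2)


def energy_closed(r, L):
    """Closed form: energy as a function of r and L."""
    ell = L / m
    return m * math.sqrt(F(r)) * math.sqrt(r**2 * ell**2 + r**4) / r**2


r, L = 8.0, 3.0
phidot = phidot_from_L(r, L)
print(angular_momentum_16(r, phidot), L)
print(energy_15(r, phidot), energy_closed(r, L))
```

Both printed pairs should agree to rounding error.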
Elementary calculations left to the reader show that the point where E is stationary is at
r = 3MG
The equation of the energy for a photon is:
E = L\frac{\sqrt{1 - \frac{2MG}{r}}}{r}
We can find a stationary point as:
\begin{aligned} \frac{\mathrm dE}{\mathrm dr} & = \frac{\mathrm d}{\mathrm dr} \left(L\frac{\sqrt{1 - \frac{2MG}{r}}}{r}\right) \\ & = -L\frac{\sqrt{1 - \frac{2MG}{r}}}{r^2} + L\frac{1}{2}\frac{2MG}{\sqrt{1 - \frac{2MG}{r}}r^3} \\ & = -L\frac{\sqrt{1 - \frac{2MG}{r}}}{r^2} + L\frac{MG}{\sqrt{1 - \frac{2MG}{r}}r^3} \\ & = L\frac{-\sqrt{1 - \frac{2MG}{r}} \sqrt{1 - \frac{2MG}{r}}r + MG}{\sqrt{1 - \frac{2MG}{r}}r^3} \\ & = L\frac{- r + 2MG + MG}{\sqrt{1 - \frac{2MG}{r}}r^3} \\ & = L\frac{- 1 + \frac{3MG}{r}}{\sqrt{1 - \frac{2MG}{r}}r^2} = 0 \end{aligned}
Since we are outside the Schwarzschild radius, the denominator is finite and nonzero, so the numerator must vanish:
\begin{aligned} & L \left(- 1 + \frac{3MG}{r}\right) = 0 \\ & r = 3MG \end{aligned}
This radius does not depend on the angular momentum.
To verify if it is a minimum or a maximum (so if it is stable or unstable equilibrium) we compute the second derivative and evaluate it at the stationary point:
\frac{\mathrm d^2E}{\mathrm dr^2} = L\frac{15(MG)^2 - 12 MG r + 2r^2}{r^{7/2}(r-2MG)^{3/2}}
At r=3MG:
\begin{aligned} \left.\frac{\mathrm d^2E}{\mathrm dr^2}\right|_{r=3MG} & = L\frac{15(MG)^2 - 36 (MG)^2 + 18 (MG)^2}{(3MG)^{7/2}(3MG-2MG)^{3/2}} \\ & = \frac{-3L(MG)^2}{(3MG)^{7/2}(3MG-2MG)^{3/2}} \\ & = \frac{-3L(MG)^2}{(3MG)^{7/2}(MG)^{3/2}} \\ & = \frac{-3L(MG)^2}{3^{7/2}(MG)^5} \\ & = \frac{-3L}{3^3\sqrt{3}M^3G^3} \\ & = \frac{-L}{9\sqrt{3}M^3G^3} \\ & = -\frac{\sqrt{3}L}{27G^3M^3}<0 \end{aligned}
A negative curvature at the stationary point implies a local maximum of E(r), so the circular photon orbit at r=3MG is an unstable equilibrium.
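Both the location of the stationary point and the value of the second derivative can be cross-checked with finite differences. The sketch below assumes units with MG = 1 and L = 1; the helper names `d1` and `d2` are hypothetical.

```python
import math

MG = 1.0  # assumed units
L = 1.0   # assumed photon angular momentum


def energy(r):
    """Photon energy E(r) = L * sqrt(1 - 2MG/r) / r."""
    return L * math.sqrt(1.0 - 2.0 * MG / r) / r


def d1(f, x, h=1e-5):
    """Central finite-difference first derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)


def d2(f, x, h=1e-4):
    """Central finite-difference second derivative."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2


r_ph = 3.0 * MG
slope = d1(energy, r_ph)
curvature = d2(energy, r_ph)
curvature_closed = -math.sqrt(3) * L / (27 * MG**3)

print(slope)                        # should be close to zero
print(curvature, curvature_closed)  # both negative and close to each other
```

The slope at r = 3MG should vanish and the curvature should match the closed-form value, up to finite-difference error.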
We recommend that the reader compare the calculations we did with the corresponding Newtonian calculations. You will see that all the pieces are the same. But the outcome is quite different.
In Newtonian mechanics, the motion of a particle of mass m in a central gravitational potential created by a mass M is described, in polar coordinates (r,\phi), by the Lagrangian:
\mathcal L = \frac{1}{2} m\left(\dot r^2 + r^2 \dot \phi^2\right) + \frac{GMm}{r}
The canonical momentum conjugate to \phi is:
p_\phi = \frac{\partial \mathcal L}{\partial \dot \phi} = m r^2 \dot \phi
This is the angular momentum L and since \phi is a cyclic coordinate, L is conserved.
The radial momentum is:
p_r = \frac{\partial \mathcal L}{\partial \dot r} = m \dot r
Unlike L, the radial momentum is not conserved, because the Lagrangian depends explicitly on r.
The corresponding Hamiltonian is the total energy of the system:
\mathcal H = \sum_i p_i \dot q_i - \mathcal L = \frac{1}{2} m \dot r^2 + \frac{1}{2} m r^2 \dot \phi^2 - \frac{GMm}{r}
which is the sum of radial kinetic energy, angular kinetic energy, and gravitational potential energy.
We can rewrite the Hamiltonian using L:
E = \frac{1}{2} m \dot r^2 + \frac{L^2}{2 m r^2} - \frac{GMm}{r}
The term:
V_{\text{eff}}(r) = \frac{L^2}{2 m r^2} - \frac{GMm}{r}
is called the effective potential. It contains the centrifugal barrier (L^2/2mr^2) and the gravitational attraction (-GMm/r).
Therefore, the motion can be seen as one-dimensional in r under the effective potential:
E = \frac{1}{2} m \dot r^2 + V_{\text{eff}}(r)
Circular orbits correspond to stationary points of the effective potential. Setting \dot r=0, the condition for equilibrium is:
\frac{dV_{\text{eff}}}{dr} = -\frac{L^2}{m r^3} + \frac{GMm}{r^2} = 0
which gives the orbital radius:
r_c = \frac{L^2}{GM m^2}
At this radius the angular velocity is:
\dot \phi = \frac{L}{m r_c^2} = \sqrt{\frac{GM}{r_c^3}}
This reproduces Kepler’s third law.
The energy of the circular orbit is:
E_c = V_{\text{eff}}(r_c) = -\frac{GMm}{2 r_c}
a negative quantity, confirming that circular orbits are bound states.
To check the stability, we compute the second derivative of the effective potential:
\frac{d^2 V_{\text{eff}}}{dr^2} = \frac{3L^2}{m r^4} - \frac{2GMm}{r^3}
Substituting the circular orbit radius r_c:
\left.\frac{d^2 V_{\text{eff}}}{dr^2}\right|_{r_c} = \frac{GMm}{r_c^3} > 0
Therefore, all Newtonian circular orbits are stable, in contrast to general relativity where unstable photon orbits exist at r=3GM.
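The Newtonian circular-orbit formulas can be checked with a few lines of arithmetic. The sketch below assumes arbitrary sample values GM = 1, m = 2, L = 3 (so r_c = 9/4) and evaluates each expression directly.

```python
import math

GM = 1.0  # assumed value of G*M
m = 2.0   # assumed particle mass
L = 3.0   # assumed angular momentum


def v_eff(r):
    """Effective potential L^2 / (2 m r^2) - G M m / r."""
    return L**2 / (2 * m * r**2) - GM * m / r


# Radius of the circular orbit.
r_c = L**2 / (GM * m**2)

# First and second derivative of the effective potential at r_c.
dV = -L**2 / (m * r_c**3) + GM * m / r_c**2
d2V = 3 * L**2 / (m * r_c**4) - 2 * GM * m / r_c**3

# Angular velocity and energy on the circular orbit.
omega = L / (m * r_c**2)
E_c = v_eff(r_c)

print(r_c, dV, d2V)
print(omega, math.sqrt(GM / r_c**3))
print(E_c, -GM * m / (2 * r_c))
```

The first derivative should vanish, the second derivative should equal GMm/r_c^3 > 0, and the last two lines should each print a pair of equal numbers.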
It is possible to find the full trajectory by eliminating time. Let’s define the reciprocal variable:
u = \frac{1}{r}
The orbit equation takes the form:
\frac{\mathrm d^2 u}{\mathrm d\phi^2} + u = \frac{GM m^2}{L^2}
The general solution is:
u(\phi) = \frac{GM m^2}{L^2}\left[1 + e \cos(\phi - \phi_0)\right]
so that:
r(\phi) = \frac{p}{1 + e \cos(\phi - \phi_0)},
with:
p = \frac{L^2}{GM m^2}.
This is the equation of a conic section, parametrized by the eccentricity e.
To connect e with the constants of motion E and L, we use the result for bound ellipses:
E = -\frac{GMm}{2a}
where a is the semi-major axis.
From the conic geometry,
\begin{aligned} & a = \frac{p}{1-e^2} \\ & p = \frac{L^2}{GM m^2} \end{aligned}
Substituting:
E = -\frac{GMm}{2a} = -\frac{(GM)^2 m^3}{2 L^2}(1-e^2)
Solving for e^2:
e^2 = 1 + \frac{2 E L^2}{(GM)^2 m^3}.
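As a consistency check of this expression, the turning points of a bound orbit, r = p/(1 \pm e), must be exactly the radii where the effective potential equals the energy. The sketch below assumes sample values GM = 1, m = 1, E = -0.3 and L = 1; the function names are hypothetical.

```python
import math

GM = 1.0  # assumed value of G*M
m = 1.0   # assumed particle mass


def v_eff(r, L):
    """Newtonian effective potential."""
    return L**2 / (2 * m * r**2) - GM * m / r


def eccentricity(E, L):
    """e = sqrt(1 + 2 E L^2 / ((GM)^2 m^3))."""
    return math.sqrt(1 + 2 * E * L**2 / (GM**2 * m**3))


E, L = -0.3, 1.0            # a bound orbit (negative energy)
e = eccentricity(E, L)
p = L**2 / (GM * m**2)      # semi-latus rectum

r_peri = p / (1 + e)        # closest approach, cos(phi - phi_0) = +1
r_apo = p / (1 - e)         # farthest point,   cos(phi - phi_0) = -1
a = (r_peri + r_apo) / 2    # semi-major axis

print(e)
print(v_eff(r_peri, L), v_eff(r_apo, L), E)
print(a, -GM * m / (2 * E))
```

At both turning points the effective potential should reproduce E, and the semi-major axis obtained from the geometry should agree with -GMm/(2E).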
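The sign of the energy determines the orbit type: for E < 0 we have e < 1 and the orbit is an ellipse (a circle, e = 0, at the minimum energy allowed for the given L); for E = 0 we have e = 1 and the orbit is a parabola; for E > 0 we have e > 1 and the orbit is a hyperbola.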
There are therefore significant differences. In Newtonian gravity, circular orbits exist at any radius (depending on L), all of them are stable, and there is no concept of a photon orbit.
In general relativity, using the Schwarzschild metric, photons can follow a circular orbit only at r=3GM, and this orbit is unstable.
Explain why a light ray emitted from inside the photon sphere can escape, but a light ray cannot enter the photon sphere and come out again.
For a photon the energy is:
E = L \frac{\sqrt{1-2MG/r}}{r}
So the motion is governed by an effective potential barrier.
Read as a function of r at fixed L, this expression is an effective potential “hill” for photons: a photon with energy E can only reach radii where E is at least this value. The hill peaks at the photon sphere radius, r=3MG, and the height of the peak, L/(3\sqrt{3}MG), grows with the photon’s angular momentum L.
Photon coming from outside
A photon arriving from infinity has fixed, conserved energy and angular momentum. To reach the photon sphere at all, its energy must exceed the top of the potential hill; otherwise it is turned back outside r = 3MG. Once it crosses to the inside (r < 3MG), it is moving inward on the side of the hill that slopes down towards the black hole, and its energy is above the potential everywhere inside. There is therefore no turning point where its radial motion could reverse, so it cannot come back out and is captured.
Photon created inside
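A photon emitted inside the sphere (r < 3MG) in an outward direction starts partway up the inner side of the hill. If its energy is greater than the peak of the hill for its angular momentum, which is the case for rays emitted close enough to the radial direction, it travels outwards, clears the peak at 3MG, and escapes to infinity. Otherwise it reaches a turning point below the peak and falls back.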
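The argument can be condensed into a small decision rule. The sketch below is an illustration under stated assumptions: units with MG = 1, a hypothetical function `photon_fate`, and the marginal case where E equals the barrier height exactly (a photon asymptotically circling at r = 3MG) is not treated separately.

```python
import math

MG = 1.0  # assumed units: horizon at r = 2, photon sphere at r = 3


def photon_potential(r, L):
    """Effective potential for photons, L * sqrt(1 - 2MG/r) / r."""
    return L * math.sqrt(1.0 - 2.0 * MG / r) / r


def barrier_height(L):
    """Value of the potential at its peak, r = 3MG."""
    return L / (3.0 * math.sqrt(3.0) * MG)


def photon_fate(E, L, r_start, outward):
    """Return 'escapes' or 'captured' for a photon starting at r_start."""
    if r_start <= 2.0 * MG:
        raise ValueError("start radius must lie outside the horizon")
    if E < photon_potential(r_start, L):
        raise ValueError("photon cannot exist at r_start with this E and L")
    if E > barrier_height(L):
        # No turning point anywhere: the photon keeps its radial direction.
        return "escapes" if outward else "captured"
    # Below the peak the barrier reflects the photon, so it stays on its side.
    return "escapes" if r_start > 3.0 * MG else "captured"


L = 1.0
print(photon_fate(0.25, L, 2.5, outward=True))    # emitted inside, above the peak
print(photon_fate(0.25, L, 10.0, outward=False))  # comes in above the peak
print(photon_fate(0.15, L, 10.0, outward=False))  # comes in below the peak
print(photon_fate(0.15, L, 2.2, outward=True))    # emitted inside, below the peak
```

With L = 1 the barrier height is about 0.192, so the four calls return, in order, "escapes", "captured", "escapes" and "captured". No combination has a photon starting outside, entering the photon sphere and coming out again, which is the statement of the exercise.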