The Radiation Field Done Right

Last updated 2016-10-14 19:52:22 SGT

I'm quite annoyed with how most textbooks targeted at physicists treat the conservation properties of the radiation field as a collection of ad-hoc approximations, occasionally supplemented by appeals to elegance, real or imagined. Inevitably this leads to arguments centred around counting factors of $\cos \theta$, which (in my opinion) is pedagogically regressive: most of these arguments are constructed post hoc, invented to fit existing, known-correct expressions. In other words, these arguments do not permit the reader to comprehend the expressions, merely to accept them. This caused me no end of frustration, mostly because counting factors of $\cos \theta$ doesn't always work if you try to be more rigorous and avoid the use of differentials.[1] Here's how I understand them instead.

Preliminaries: symplectic manifolds

We take it to be a fundamental property of classical mechanics that phase space has the structure of a symplectic manifold.[2] We also take for granted that in the ray-optics limit photons obey classical mechanics (obviously this precludes the use of this formalism in studying diffraction phenomena, but that is also a fundamental limitation of the radiation-field formalism).

Definition (symplectic manifold): A $2n$-dimensional manifold $\mathcal{P}$ is symplectic iff it is equipped with a closed non-degenerate differential 2-form $\Omega$.

Definition (Hamiltonian vector field): The Hamiltonian vector field $X_H: \mathcal{P} \to T(\mathcal{P})$ of a function $H: \mathcal{P} \to \mathbb{R}$ satisfies $\mathrm{d} H = \iota_{X_H} \Omega$.

It is always possible to find local coordinates $\{q^i, p_i\}$ such that $X_H = \sum_i \left({\partial H \over \partial p_i} \partial_{q^i} - {\partial H \over \partial q^i} \partial_{p_i}\right)$, from which a coordinate expression for $\Omega$ can be inferred. We can write $\Omega = \sum_{i} \mathrm{d}q^i \wedge \mathrm{d}p_i = - \mathrm{d} \alpha$, where $\alpha = \sum_i p_i \mathrm{d}q^i$ is the tautological one-form.
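As a quick check that these coordinate expressions are mutually consistent (a worked step using only the definitions above, with the momentum index written downstairs as $p_i$), contracting $X_H$ into $\Omega$ gives back $\mathrm{d}H$:

$$
\iota_{X_H}\Omega = \sum_i \left[ \mathrm{d}q^i(X_H)\,\mathrm{d}p_i - \mathrm{d}p_i(X_H)\,\mathrm{d}q^i \right] = \sum_i \left[ \frac{\partial H}{\partial p_i}\,\mathrm{d}p_i + \frac{\partial H}{\partial q^i}\,\mathrm{d}q^i \right] = \mathrm{d}H .
$$

Likewise $\mathrm{d}\alpha = \sum_i \mathrm{d}p_i \wedge \mathrm{d}q^i = -\Omega$, which fixes the sign in the relation to the tautological one-form.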

Ancillary to these definitions are two other important properties of classical mechanics that we will rely upon:

1. Lemma (symplectomorphisms): The flow of a Hamiltonian vector field $X_H$ (i.e. time evolution under a Hamiltonian $H$) generates a one-parameter family of diffeomorphisms $\rho_t: \mathcal{P} \to \mathcal{P}$ that preserve the symplectic form: $\rho_t^* \Omega = \Omega$, or equivalently $\mathcal{L}_{X_H}\Omega = 0$.

This is what causes the Poisson-bracket structure of classical mechanics to be time-invariant if the Hamiltonian is also time-independent, for example.

2. Theorem (Liouville's Theorem): If $\mathcal{P}$ is equipped with the volume form $\epsilon \propto \Omega^{\wedge n}$ induced by the symplectic form, then in classical mechanics this volume form is invariant under symplectomorphisms: $\rho_t^* \epsilon = \epsilon \iff \mathcal{L}_{X_H} \epsilon = 0 \iff \mathrm{div}~X_H = 0$.
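Both statements follow in one line each from Cartan's formula $\mathcal{L}_X = \mathrm{d}\iota_X + \iota_X\mathrm{d}$. A sketch, assuming the volume form is the one built from the symplectic form, $\epsilon \propto \Omega^{\wedge n}$ (the normalisation is irrelevant here):

$$
\mathcal{L}_{X_H}\Omega = \mathrm{d}(\iota_{X_H}\Omega) + \iota_{X_H}\mathrm{d}\Omega = \mathrm{d}(\mathrm{d}H) + 0 = 0,
\qquad
\mathcal{L}_{X_H}\Omega^{\wedge n} = n\,(\mathcal{L}_{X_H}\Omega)\wedge\Omega^{\wedge (n-1)} = 0 .
$$

The first chain uses the definition of $X_H$ and the closedness of $\Omega$; the second uses the Leibniz rule for the Lie derivative.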

If $\mathcal{P}$ is equipped with metric structure, then we can also consider statements about Hodge duals. In particular, for Hamiltonian vector fields, 3

In addition to the Hodge dual, we observe that we can endow the cotangent bundle with a Sasaki-type metric; that is to say, the metric over the coordinate directions $\{q^i\}$ and $\{p_i\}$ is block-diagonalisable into two identical matrices. If we so wished, we could then take Hodge duals separately over the configuration and momentum subalgebras. Let us denote these by $\star_q$ and $\star_p$.

Physics

Let us construct a phase space energy density $I \epsilon$ (in units of $c = h = 1$). If we work in flat spacetime (especially Cartesian coordinates) such that $\mathcal{L}_{X_H} g = 0$, the above properties imply:

Flux and the Conservation of Étendue

Consider an ensemble of single particles, each with momentum of magnitude $p$, travelling in 3 dimensions. Liouville's Theorem implies $\rho_t^*I(x) = I(\rho_t^{-1}(x))$. We see that $I$, which is proportional to both the "surface brightness" and the phase-space energy density, is constant along flow lines. Since the magnitude of the momentum is fixed, the energy density is zero for all momenta other than on a sphere in momentum space (i.e. a Dirac delta spherical shell distribution), and integrating across momentum space to recover the energy density is equivalent to taking a solid angle integral. Thus we obtain that this function $I$ is related to the surface brightness $I_\nu$ in the usual physical units as $p^2 I = {I_\nu \over c}$. Compare this to the usual expression
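For reference, the expression usually quoted for the specific energy density is the following (standard textbook form; $\mathrm{d}\omega$ denotes the solid-angle element, a symbol introduced here to avoid a clash with the symplectic form $\Omega$):

$$
u_\nu = \frac{1}{c}\int I_\nu\,\mathrm{d}\omega .
$$

No factor of $\cos\theta$ appears, consistent with integrating the bare volume form over the momentum shell.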

We also consider in particular

Recall that $\Omega$ is a 2-form. In 3 spatial dimensions, its Hodge dual is a 4-form:
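A sketch of the coordinate expression, assuming Cartesian coordinates, the flat Euclidean metric on both blocks, and the orientation $\mathrm{d}q^1\wedge\mathrm{d}q^2\wedge\mathrm{d}q^3\wedge\mathrm{d}p_1\wedge\mathrm{d}p_2\wedge\mathrm{d}p_3$ (other conventions change overall signs only):

$$
\star\Omega = \sum_i \star_q\mathrm{d}q^i \wedge \star_p\mathrm{d}p_i
= \frac{1}{4}\sum_{i,j,k,l,m} \varepsilon_{ijk}\,\varepsilon_{ilm}\;\mathrm{d}q^j\wedge\mathrm{d}q^k\wedge\mathrm{d}p_l\wedge\mathrm{d}p_m .
$$

Each term pairs an oriented area element in configuration space with the matching area element in momentum space.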

These oriented surface elements are exactly those we are required to integrate over to express the conservation of étendue. That it is invariant under symplectomorphisms implies that its integral is a conserved quantity, if at different $t$ we integrate over the image of the support of $I$ (i.e. the support of $\rho_t^* I$). For example, consider the bisurface element in the $z$ direction:
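Written out under stated assumptions (Cartesian coordinates $(x, y, z)$, flat metric, momentum restricted to the shell $|\vec{p}| = p$, polar angle $\theta$ measured from the $z$ axis, and $\mathrm{d}\omega = \sin\theta\,\mathrm{d}\theta\wedge\mathrm{d}\phi$ as notation introduced here for the solid-angle element), this element reads:

$$
\mathrm{d}x\wedge\mathrm{d}y\wedge\mathrm{d}p_x\wedge\mathrm{d}p_y\,\Big|_{|\vec{p}| = p} = p^2\cos\theta\;\mathrm{d}x\wedge\mathrm{d}y\wedge\mathrm{d}\omega ,
$$

since $\mathrm{d}p_x\wedge\mathrm{d}p_y = p^2\sin\theta\cos\theta\,\mathrm{d}\theta\wedge\mathrm{d}\phi$ on the shell.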

This yields one factor of $\cos \theta$ if we instead integrate $c I \star \Omega$ to obtain the flux through a surface, whence arises the definition
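For reference, that definition in its usual textbook form (with $\mathrm{d}\omega$ the solid-angle element, notation introduced here, and $\theta$ measured from the surface normal) is:

$$
F_\nu = \int I_\nu\cos\theta\,\mathrm{d}\omega .
$$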

Note that the $\cos \theta$ only emerges as a consequence of our choice of coordinate system. The volume form and 4-form work properly in general (the usual integral over $\nu$ becomes an integral over $p$), in a fully coordinate-independent manner.
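The claim that the single $\cos\theta$ is nothing more than a Jacobian can be checked numerically. The following is a minimal sketch, assuming a momentum shell of radius `p` parametrised by the usual polar angles about the $z$ axis; the function names are hypothetical helpers, and the derivative is taken by central finite differences rather than symbolically.

```python
import math

def momentum_xy(theta, phi, p=1.0):
    """Cartesian (p_x, p_y) of a momentum of magnitude p at polar angles (theta, phi)."""
    return (p * math.sin(theta) * math.cos(phi),
            p * math.sin(theta) * math.sin(phi))

def shell_area_factor(theta, phi, p=1.0, h=1e-6):
    """Coefficient of dtheta ^ dphi in dp_x ^ dp_y on the shell (central differences)."""
    xt1, yt1 = momentum_xy(theta + h, phi, p)
    xt0, yt0 = momentum_xy(theta - h, phi, p)
    xp1, yp1 = momentum_xy(theta, phi + h, p)
    xp0, yp0 = momentum_xy(theta, phi - h, p)
    dx_dt, dy_dt = (xt1 - xt0) / (2 * h), (yt1 - yt0) / (2 * h)
    dx_dp, dy_dp = (xp1 - xp0) / (2 * h), (yp1 - yp0) / (2 * h)
    # Jacobian determinant d(p_x, p_y) / d(theta, phi)
    return dx_dt * dy_dp - dx_dp * dy_dt

def cos_factor(theta, phi, p=1.0):
    """Ratio of dp_x ^ dp_y to p^2 d(omega), with d(omega) = sin(theta) dtheta ^ dphi."""
    return shell_area_factor(theta, phi, p) / (p ** 2 * math.sin(theta))
```

Evaluating `cos_factor(theta, phi, p)` at any angle away from the poles should agree with `math.cos(theta)` to within the finite-difference error, independently of `phi` and `p`.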

Pressure and the dual tautological one-form

We want the pressure to be indicative of momentum transfer along a plane. Therefore, in coordinates, we must have that the pressure is proportional to the integral of

Resolving momentum space into spherical coordinates again yields both a solid angle integral with a factor of $\cos \theta$, as above, and an additional factor of $\cos \theta$ from the spatial part of the 2-form as $\varepsilon^{ijk} p_i \,\mathrm{d} x_j \wedge \mathrm{d} x_k~\hat{=}~\vec{p} \cdot \mathrm{d}\vec{S}$. Compare this with the usual definition
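For reference, the usual textbook form of the specific radiation pressure (with $\mathrm{d}\omega$ the solid-angle element, notation introduced here, and $\theta$ measured from the surface normal) is:

$$
P_\nu = \frac{1}{c}\int I_\nu\cos^2\theta\,\mathrm{d}\omega .
$$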

Observe that the above expression can be written as being proportional to $I \star \mathrm{d}\beta$ with $\beta = \sum_i {1 \over 2}(p_i)^2 \mathrm{d}q^i$ in coordinates.
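A short check of that claim, assuming Cartesian coordinates, the flat metric, the orientation $\mathrm{d}q^1\wedge\mathrm{d}q^2\wedge\mathrm{d}q^3\wedge\mathrm{d}p_1\wedge\mathrm{d}p_2\wedge\mathrm{d}p_3$, and $\mathrm{d}\omega$ (notation introduced here) for the solid-angle element:

$$
\mathrm{d}\beta = \sum_i p_i\,\mathrm{d}p_i\wedge\mathrm{d}q^i,
\qquad
\star\mathrm{d}\beta = -\sum_i p_i\;\star_q\mathrm{d}q^i\wedge\star_p\mathrm{d}p_i .
$$

Up to sign, the $z$ term is $p_z\,\mathrm{d}x\wedge\mathrm{d}y\wedge\mathrm{d}p_x\wedge\mathrm{d}p_y$, which on the shell $|\vec{p}| = p$ becomes $p^3\cos^2\theta\;\mathrm{d}x\wedge\mathrm{d}y\wedge\mathrm{d}\omega$: one $\cos\theta$ from $p_z = p\cos\theta$ and one from the momentum-space area element.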

The photography integral

Finally, we will examine a system without such a symplectic structure. Consider a camera with length $L$. Using the main axis of the camera as the axis of a spherical coordinate system centered on a point on the focusing screen, we can set up a coordinate system on the input (e.g. lens) plane of the camera relative to this point as

We compute the Jacobian determinant as

It follows that for a given point on the focal plane, the flux is found as
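A minimal sketch of the whole chain, assuming the lens plane is parametrised as below (one natural choice for the geometry described, not necessarily the original parametrisation), that $\mathrm{d}\omega = \sin\theta\,\mathrm{d}\theta\,\mathrm{d}\phi$ is the solid-angle element (notation introduced here), and that $I$ is the brightness arriving from direction $(\theta, \phi)$:

$$
x = L\tan\theta\cos\phi, \quad y = L\tan\theta\sin\phi,
\qquad
\left|\frac{\partial(x, y)}{\partial(\theta, \phi)}\right| = \frac{L^2\sin\theta}{\cos^3\theta},
\qquad
F = \int I\cos\theta\,\mathrm{d}\omega = \frac{1}{L^2}\int I\cos^4\theta\,\mathrm{d}x\,\mathrm{d}y ,
$$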

where $(\theta, \phi)$ are implicit functions of $(x,y)$. This factor of $\cos^4 \theta$ is usually explained away by a very long paragraph that requires way more reliance on geometric intuition than I find acceptable (I don't trust my intuition to count to higher than 2). If we were interested in finding the radiation pressure instead of the flux, we'd instead end up with a factor in the integral of $\cos^5 \theta$! I don't think I'd feel comfortable with any attempt at a geometrical explanation of that.
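Both powers of $\cos\theta$ can be confirmed without any geometric intuition by computing the Jacobian numerically. The sketch below assumes the lens-plane parametrisation $x = L\tan\theta\cos\phi$, $y = L\tan\theta\sin\phi$ (an assumption about the setup, as are all function names), and uses central finite differences in place of a symbolic derivative.

```python
import math

def lens_point(theta, phi, L=1.0):
    """Point (x, y) on the lens plane seen at angles (theta, phi) from the screen."""
    r = L * math.tan(theta)
    return r * math.cos(phi), r * math.sin(phi)

def solid_angle_per_area(theta, phi, L=1.0, h=1e-6):
    """d(omega) / (dx dy) = sin(theta) / |d(x, y) / d(theta, phi)|."""
    xt1, yt1 = lens_point(theta + h, phi, L)
    xt0, yt0 = lens_point(theta - h, phi, L)
    xp1, yp1 = lens_point(theta, phi + h, L)
    xp0, yp0 = lens_point(theta, phi - h, L)
    dx_dt, dy_dt = (xt1 - xt0) / (2 * h), (yt1 - yt0) / (2 * h)
    dx_dp, dy_dp = (xp1 - xp0) / (2 * h), (yp1 - yp0) / (2 * h)
    jac = dx_dt * dy_dp - dx_dp * dy_dt  # Jacobian determinant
    return math.sin(theta) / abs(jac)

def flux_weight(theta, phi, L=1.0):
    """Weight multiplying I in the flux integral over the lens plane."""
    return math.cos(theta) * solid_angle_per_area(theta, phi, L)

def pressure_weight(theta, phi, L=1.0):
    """Weight multiplying I in the pressure integral over the lens plane."""
    return math.cos(theta) ** 2 * solid_angle_per_area(theta, phi, L)
```

Under these assumptions `flux_weight` should match $\cos^4\theta / L^2$ and `pressure_weight` should match $\cos^5\theta / L^2$, up to finite-difference error.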

Concluding remarks

It's remarkable how much uglier this looks than what is presented in most textbooks (where these physical quantities are presented as “moments” with respect to $\cos \theta$). This is unfortunate, but one has to realise that such elegance is purely a coincidence. That such a formulation is specious becomes apparent when one notices that “moments” of order higher than 2 have no real physical meaning. These factors of $\cos \theta$ emerge purely from a choice of coordinate system (i.e. going from Cartesian to spherical coordinates, or as part of the transfer function between different image planes in an optical system), and should therefore not be taken as fundamental.

Notice that exactly no wizardry went into any of the above expressions. As I said in the introduction, when I initially learned these back in undergrad I spent a huge amount of time puzzling over myriad explanations involving counting (possibly fudged) factors of $\cos \theta$, which were designed less to illuminate where the expressions came from than to give excuses for why they might be correct. But finding Jacobians is not hard. Neither is computing pullbacks. Moreover, working in a manifestly coordinate-free manner without using differentials as a crutch is immediately generalisable to new and interesting situations, e.g. in curved space, where the traditional pedagogical approach fails. I understand and appreciate why this stuff isn't taught exactly this way, but wow it does feel like a Lie To Children — all the more irksome given that it's done even at such an advanced level.

1. In NUS I had a lecturer who began his first lecture for a methods class by declaiming that the aim of a course should be to teach generally applicable techniques, not just “tricks of trade” — a sentiment that continues to resonate strongly with me.

2. Good readings about this kind of thing are Ana Cannas da Silva's Lectures on Symplectic Geometry and Victor Guillemin and Shlomo Sternberg's Symplectic Techniques in Physics.