CM.1 Modern Classical Mechanics: Theory

CM.1.1 Functionals, Lagrangians, and Action

Newton's laws contain everything needed to do classical mechanics. In principle, we could start from them and do any classical physics problem. However, they have some difficulties. Changing reference frames (or coordinate systems) leads to complications. Accounting for all of the forces in a problem can be tedious. And Newton's laws do not easily lend themselves to the study of field theories like electromagnetism. In short, they are generally not the starting point for how physicists tackle problems. Instead, physicists use the action functional and the principle of least action. In this Chapter, we will develop the theory of classical mechanics starting from functionals, the action, and Lagrangians. We will derive the Euler-Lagrange equations of motion and show that they reproduce Newton's laws. Then, we will use symmetries and Noether's theorem to analyze the equations of motion using conserved quantities. This analysis will lead us to the Hamiltonian and to formulating Poisson Brackets.

Good references for the material in this Chapter are Goldstein [1], Thornton and Marion [2], and Landau and Lifshitz [3]. For material on functionals, Richard Feynman's PhD thesis is particularly well-presented [4]. For a similar derivation of Noether's theorem to ours, see Bailin and Love [5], although it will be necessary to translate their field theory derivation to classical mechanics (as we have done here).

CM.1.1.1 What is a functional?

To use actions and Lagrangians, we must first introduce the calculus of functionals. Functionals are objects that eat a function and spit out a number. There is an easy way to understand them, starting with regular multivariable functions. Let $f(x_1,x_2,\ldots,x_N)=f(\mathbf{x})$ be an ordinary function. What it does is take in a vector $\mathbf{x}$ and spit out a number $f(\mathbf{x})$. There is no reason why $N$, the dimension of the input vector, has to be finite. You could always let $N=\infty$, so that now the function $f$ associates to every sequence $(x_1,x_2,\ldots)$ a number $f(x_1,x_2,\ldots)$. We can rewrite the sequence elements in a provocative way as $x_i=x(i)$. Now, there is no reason why the argument of $x$ has to be a whole number. We can make it any real number we want. Then, $x$ is a whole function, $x(t)$. Now, we have a function $f$ that assigns to every function $x(t)$ a number $f(x)$. This function $f$ is called a functional, and it is usually denoted $f[x]$, with square brackets to remind you that $x$ is a whole function itself. The argument of $x(t)$ is suppressed because the value of $f[x]$ depends only on $x$ as a whole.

We now want to do calculus with functionals. We will again start with a regular multivariable function f(𝐱). The most basic calculus we can do is the linear-order Taylor expansion. Let's evaluate f at the point 𝐱+d𝐱:

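$$f(\mathbf{x}+d\mathbf{x})\approx f(\mathbf{x})+d\mathbf{x}\cdot\nabla f(\mathbf{x})=f(\mathbf{x})+\sum_{i=1}^{N}dx_i\frac{\partial f(\mathbf{x})}{\partial x_i}.\tag{CM.1.1}$$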

For later convenience, we will rewrite the infinitesimal as d𝐱=εΔ𝐱, where ε controls the magnitude of the term (presumed to be infinitesimally small) and Δ𝐱 can be anything (later we will see that it is usually a function). For now, just let it be a funny rewriting. In this form, the Taylor expansion becomes

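$$f(\mathbf{x}+\varepsilon\Delta\mathbf{x})\approx f(\mathbf{x})+\sum_{i=1}^{N}\varepsilon\Delta x_i\frac{\partial f(\mathbf{x})}{\partial x_i}.\tag{CM.1.2}$$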

To generalize from a multivariable function to a functional, we let $\mathbf{x}$ be a function and we use a continuous set to index $x$ instead of the finite set $1,2,\ldots,N$. That is, $x_i\to x(i)$. The finite sum becomes a sum over a continuously infinite set, otherwise known as an integral. Therefore, the analogous expression for the Taylor expansion of a functional is

$$f[x+\varepsilon\Delta x]\approx f[x]+\int dt\,\varepsilon\Delta x(t)\frac{\partial f[x]}{\partial x(t)}.\tag{CM.1.3}$$

We should clarify exactly what is going on in this equation. The first term is self-explanatory: it evaluates the functional $f$ at the function $x$. The second term has replaced all indices $i$ with functional evaluations at $t$ and summed over the $t$ space. The quantity $\Delta x(t)$ is itself a whole function of $t$. The derivative is odd-looking, because it has a functional evaluation in the denominator. However, it should be stressed that $x(t)$ is to be understood the same as $x_i$; we are taking the derivative with respect to the "coordinate" $x(t)$.

An example is in order. Consider the functional $f[x]=\int_0^1 dt\,(x(t))^2$, just the integral of the square of the input function $x$ over the interval $(0,1)$. To find the functional derivative $\partial f/\partial x(t)$ for any value $t$, we do a Taylor expansion:

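$$f[x+\varepsilon\Delta x]=\int_0^1 dt\,\big(x(t)+\varepsilon\Delta x(t)\big)^2=\int_0^1 dt\,\Big[(x(t))^2+2x(t)\varepsilon\Delta x(t)+\varepsilon^2(\Delta x(t))^2\Big]\approx\int_0^1 dt\,(x(t))^2+\int_0^1 dt\,2x(t)\varepsilon\Delta x(t).$$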

We have ignored terms of second order in ε. The first term is simply f[x]. The second term is thus the one with the functional derivative. Matching the terms gives

$$\frac{\partial f[x]}{\partial x(t)}=2x(t).$$

This result is not so unexpected (compare to the derivative of $f(\mathbf{x})=\sum_i x_i^2$), but it should be noted that the index $t$ appears on each side of the equation. Make sure you understand the process here of Taylor expanding to find the derivative, since we will be doing so often in the remainder of this Chapter.
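The example above can also be checked numerically. The following is a minimal sketch, assuming we discretize the interval $(0,1)$ into $N$ samples so that the functional becomes an ordinary multivariable function; the grid, the test function $x(t)=t^3$, and the helper names are illustrative choices, not part of the derivation.

```python
# Discretize t on [0, 1]: the functional f[x] = integral of x(t)^2 dt becomes
# an ordinary function of the N sample values x_i = x(t_i).
N = 1000
dt = 1.0 / N
ts = [(i + 0.5) * dt for i in range(N)]  # midpoint grid (an arbitrary choice)


def f(xs):
    """Riemann-sum approximation of f[x] = integral of x(t)^2 over (0, 1)."""
    return sum(x * x for x in xs) * dt


def functional_derivative(func, xs, i, h=1e-3):
    """Estimate delta f / delta x(t_i) as (1/dt) * (partial f / partial x_i).

    The factor 1/dt appears because the integral over t in Equation CM.1.3
    turns into a sum over i weighted by dt.
    """
    up = list(xs)
    up[i] += h
    down = list(xs)
    down[i] -= h
    return (func(up) - func(down)) / (2.0 * h * dt)


xs = [t ** 3 for t in ts]   # hypothetical test function x(t) = t^3
i = 700                     # arbitrary grid index
numeric = functional_derivative(f, xs, i)
exact = 2.0 * xs[i]         # the result derived above: 2 x(t)
print(numeric, exact)
```

The two printed numbers should agree closely, which illustrates that the functional derivative is the ordinary partial derivative with respect to the sample $x_i$, rescaled by the grid spacing.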

We should also note that standard terminology is to use $\delta$ instead of $\partial$ for the derivative of a functional. We will adopt this convention, so that the functional Taylor expansion of Equation CM.1.3 is

$$f[x+\varepsilon\Delta x]\approx f[x]+\int dt\,\varepsilon\Delta x(t)\frac{\delta f[x]}{\delta x(t)}.\tag{CM.1.4}$$

One final piece of terminology. The variation δf of the functional f is defined as the whole term that is first order in epsilon. Thus, the variation in f is

$$\delta f=f[x+\varepsilon\Delta x]-f[x]\approx\int dt\,\varepsilon\Delta x(t)\frac{\delta f[x]}{\delta x(t)}.\tag{CM.1.5}$$

CM.1.1.2 The Principle of Least Action and the Euler-Lagrange Equations

We now want to generalize Newton's equations of motion. We will do so using the principle of least action (although it turns out the action is not necessarily minimized, just extremized). We will consider generalized coordinate functions $q_i(t)$, with generalized velocities $\dot{q}_i(t)$. The $q_i$ could just be the normal Cartesian coordinates of some collection of particles, e.g. $q_{3(i-1)+j}=[\mathbf{r}_i]_j$: we just lay out all of the particles' coordinates and call them the various $q_i$, so that $(q_1,q_2,q_3,q_4,\ldots)=(x_1,y_1,z_1,x_2,\ldots)$. But they do not have to be the regular coordinates; the $q_i(t)$ are general. As we just demonstrated, the $q_i(t)$ do not all have to be from the same particle, but we will write them as a single vector of functions $\mathbf{q}(t)$. The action is a functional on these generalized coordinates,

$$S[\mathbf{q}]=\int_{t_1}^{t_2}dt\,L(\mathbf{q}(t),\dot{\mathbf{q}}(t),t),\tag{CM.1.6}$$

where the function $L(\mathbf{q}(t),\dot{\mathbf{q}}(t),t)$ is known as the Lagrangian. We will give a general form for the Lagrangian that reproduces Newton's laws shortly. In classical mechanics, our problem is to find the unknown solutions $\mathbf{q}(t)$ that extremize the action functional, subject to the boundary conditions that the endpoint values $\mathbf{q}(t_1)$ and $\mathbf{q}(t_2)$ are known.

Let's find the equations of motion for the generalized coordinates 𝐪(t) from the action. We want the action to be extremized, subject to the boundary conditions. That means that its first derivatives should vanish. Let's Taylor expand the action. First, evaluate at an arbitrary nearby set of coordinates 𝐪+εΔ𝐪:

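$$S[\mathbf{q}+\varepsilon\Delta\mathbf{q}]=\int_{t_1}^{t_2}dt\,L\big(\mathbf{q}(t)+\varepsilon\Delta\mathbf{q}(t),\dot{\mathbf{q}}(t)+\varepsilon\Delta\dot{\mathbf{q}}(t),t\big).$$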

We should note that $\Delta\dot{\mathbf{q}}(t)=d(\Delta\mathbf{q})/dt$ is the time derivative of the arbitrary function $\Delta\mathbf{q}(t)$. We now need to Taylor expand the Lagrangian:

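$$L\big(\mathbf{q}(t)+\varepsilon\Delta\mathbf{q}(t),\dot{\mathbf{q}}(t)+\varepsilon\Delta\dot{\mathbf{q}}(t),t\big)\approx L(\mathbf{q}(t),\dot{\mathbf{q}}(t),t)+\sum_{i=1}^{N}\varepsilon\Delta q_i(t)\frac{\partial L}{\partial q_i}+\sum_{i=1}^{N}\varepsilon\Delta\dot{q}_i(t)\frac{\partial L}{\partial\dot{q}_i}.$$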

Therefore, the variation of the action $\delta S$ resulting from changing the coordinate functions by $\varepsilon\Delta\mathbf{q}$ is

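$$\delta S=S[\mathbf{q}+\varepsilon\Delta\mathbf{q}]-S[\mathbf{q}]=\sum_{i=1}^{N}\varepsilon\int_{t_1}^{t_2}dt\,\Delta q_i(t)\frac{\partial L}{\partial q_i}+\sum_{i=1}^{N}\varepsilon\int_{t_1}^{t_2}dt\,\Delta\dot{q}_i(t)\frac{\partial L}{\partial\dot{q}_i}.\tag{CM.1.7}$$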

To extremize the action subject to our boundary conditions, we want its variation to be zero for any $\varepsilon$ and any test function $\Delta\mathbf{q}(t)$ that satisfies $\Delta\mathbf{q}(t_1)=\Delta\mathbf{q}(t_2)=0$ (so that the full function $\mathbf{q}(t)+\varepsilon\Delta\mathbf{q}(t)$ satisfies the boundary conditions when $\mathbf{q}(t)$ does). To be able to say something when we set the variation $\delta S=0$, we now need to integrate the second term of the right-hand side of Equation CM.1.7 by parts to move the $t$ derivative off of the arbitrary function $\Delta\mathbf{q}(t)$. Doing so gives a boundary term minus the integral with the $t$ derivative flipped to the Lagrangian factor:

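$$\int_{t_1}^{t_2}dt\,\Delta\dot{q}_i(t)\frac{\partial L}{\partial\dot{q}_i}=\int_{t_1}^{t_2}dt\,\frac{d}{dt}\left(\Delta q_i(t)\frac{\partial L}{\partial\dot{q}_i}\right)-\int_{t_1}^{t_2}dt\,\Delta q_i(t)\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i}.$$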

The first term of the previous line is the integral of a derivative of a function, so it simplifies to that function being evaluated at the bounds. However, we specified that $\Delta q_i(t_1)=\Delta q_i(t_2)=0$ for any $i$ in order to match the problem's boundary conditions, so the first term of our integration by parts is zero. The second term has now freed the arbitrary function $\Delta q_i$ from the derivative, so we can factor it out in the variation $\delta S$. We arrive at the final form of the variation of the action:

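$$\delta S=0=\sum_{i=1}^{N}\varepsilon\int_{t_1}^{t_2}dt\,\Delta q_i(t)\left(\frac{\partial L}{\partial q_i}-\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i}\right).\tag{CM.1.8}$$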

This variation must be zero for all ε and all Δ𝐪(t) that are zero on the boundary. The function that we choose to do the variation does not have to be continuous. Therefore, we can pick Δ𝐪(t) to be zero everywhere except for one particular component j at one particular time t0 (in math terms, we pick Δ𝐪 to be proportional to a Kronecker delta and a delta function). Then, the sum from i=1 to N collapses just to the term j, and the integral over the interval (t1,t2) collapses just to evaluation at t0. The ε drops out, so that we are left with only the term in parentheses from Equation CM.1.8, giving us the Euler-Lagrange equations:

$$\frac{\partial L}{\partial q_i}-\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i}=0.\tag{CM.1.9}$$

We get one Euler-Lagrange equation for each generalized coordinate $q_i$. To solve our problem, all we would need to do is solve these ordinary differential equations in $t$.
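To see the extremum condition at work, here is a minimal numerical sketch. It assumes a specific illustrative Lagrangian, $L=\dot{q}^2/2-q^2/2$ (unit mass and unit spring constant), whose Euler-Lagrange equation is $\ddot{q}=-q$; the test function $\Delta q(t)=\sin(\pi t)$, the interval $(0,1)$, and the function names are all hypothetical choices. On the classical path the change in the action should scale like $\varepsilon^2$ (no first-order term), while on a path that does not solve the equation of motion it should scale roughly like $\varepsilon$.

```python
import math


def action(q, qdot, t1=0.0, t2=1.0, n=2000):
    """Midpoint-rule approximation of S = integral of L dt,
    for the illustrative Lagrangian L = qdot^2/2 - q^2/2."""
    dt = (t2 - t1) / n
    total = 0.0
    for j in range(n):
        t = t1 + (j + 0.5) * dt
        total += (0.5 * qdot(t) ** 2 - 0.5 * q(t) ** 2) * dt
    return total


def delta_S(q, qdot, eps):
    """S[q + eps*Dq] - S[q] with Dq(t) = sin(pi t), which vanishes at t = 0, 1."""
    dq = lambda t: math.sin(math.pi * t)
    dqdot = lambda t: math.pi * math.cos(math.pi * t)
    shifted = action(lambda t: q(t) + eps * dq(t),
                     lambda t: qdot(t) + eps * dqdot(t))
    return shifted - action(q, qdot)


# Classical path: q(t) = sin(t) solves q'' = -q.
classical = (math.sin, math.cos)
# Straight line with the same endpoints; it does not solve q'' = -q.
a = math.sin(1.0)
straight = (lambda t: a * t, lambda t: a)

ratios = {}
for name, (q, qdot) in [("classical", classical), ("straight", straight)]:
    # Shrinking eps by 10 shrinks delta_S by ~100 if it is second order in eps,
    # and by ~10 if a first-order term is present.
    ratios[name] = delta_S(q, qdot, 1e-2) / delta_S(q, qdot, 1e-3)
    print(name, ratios[name])
```

The ratio near 100 for the classical path reflects a vanishing first-order variation, which is exactly the statement $\delta S=0$.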

Now, our claim is that the Euler-Lagrange equations are equivalent to Newton's laws. To show this, we must pick a Lagrangian $L$. Our choice will be $L=T-V$, the difference of the kinetic and potential energies. Assuming we have one particle with position $x(t)$, velocity $v(t)=\dot{x}(t)$, and conservative potential energy $V(x)$, we can now show that the Euler-Lagrange equations reproduce Newton's laws. The kinetic energy is $T=m\dot{x}^2/2$, so that the Lagrangian is

$$L(x,\dot{x},t)=\frac{m\dot{x}^2}{2}-V(x).$$

We get one Euler-Lagrange equation,

$$\frac{\partial L}{\partial x}-\frac{d}{dt}\frac{\partial L}{\partial\dot{x}}=0.$$

We can calculate the derivatives of the Lagrangian immediately. The $x$ derivative is $\partial L/\partial x=-V'(x)=F(x)$, where $F(x)$ is the force on our particle. The $\dot{x}$ derivative of the Lagrangian is $\partial L/\partial\dot{x}=m\dot{x}$. Putting these together, the Euler-Lagrange equation states

$$-V'(x)-\frac{d}{dt}(m\dot{x})=0\quad\Longrightarrow\quad F=ma,\tag{CM.1.10}$$

as claimed. We could now show that Newton's law holds for many particles using the kinetic energy of Equation CM.0.25 and the potential energy of Equation CM.0.33, but we will not do so here.
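As a concrete instance of the single-particle derivation, consider a mass on a spring. This is an illustrative worked example, assuming the potential $V(x)=kx^2/2$ with a hypothetical spring constant $k$:

```latex
L = \frac{m\dot{x}^2}{2} - \frac{k x^2}{2}, \qquad
\frac{\partial L}{\partial x} = -kx, \qquad
\frac{\partial L}{\partial \dot{x}} = m\dot{x}
\quad\Longrightarrow\quad
-kx - \frac{d}{dt}\left(m\dot{x}\right) = 0
\quad\Longleftrightarrow\quad
m\ddot{x} = -kx .
```

The right-hand side is Hooke's law, $F=-kx$, so the Euler-Lagrange equation is again $F=ma$.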

Optional: What is a theory of physics? There is much discussion about "theories of physics" in popular discourse, but what is a theory of physics? We can name some: Newtonian mechanics, special relativity, general relativity, electromagnetism, etc. But what makes them theories of physics? As we stated at the start of this Chapter, modern physics starts from a Lagrangian and then proceeds to find equations of motion and their solutions. A theory of physics can be most simply defined as a set of Lagrangians satisfying some property. For Newtonian mechanics, the Lagrangians are $L=T-V$. For electromagnetism and special relativity, the Lagrangians have a different form, which we will cover later. In this way, the entirety of a physics problem is now defined: pick a problem and a theory of physics (Lagrangian) to analyze it with, find the Euler-Lagrange equations of motion, and solve them.

CM.1.2 Symmetries and Conserved Quantities are Related by Noether's Theorem

Real physical systems can be arbitrarily complicated, with huge numbers of generalized coordinates (or degrees of freedom) to account for. Often, however, it turns out that many of these degrees of freedom are redundant because of symmetry. We will define what we mean by a symmetry shortly. For an intuitive picture, imagine a square. Rotate it by 90 degrees. The square has not changed. Therefore, the square has a symmetry: it is symmetric under 90 degree rotation. Likewise, spheres are symmetric under any rotation in 3D space.

CM.1.2.1 Stating and Proving Noether's Theorem

Before we begin, we need to make two points. First, we have written the action functional as only depending on the generalized coordinates $\mathbf{q}$. When $t_1$ and $t_2$ are fixed, $S$ does only depend on $\mathbf{q}$. More generally, the action is a function of the coordinates and the boundary times: $S[\mathbf{q},t_1,t_2]$. Second, our work in the previous section can be used to write down a formula for any general variation of the action, not just one that satisfies the boundary conditions or occurs around the solution to the Euler-Lagrange equations. From Equation CM.1.7, we again integrate the $\Delta\dot{q}$ term by parts, but now the boundary terms do not evaluate to zero. Therefore, an arbitrary variation of the action caused by a change in the generalized coordinate functions, $\delta S=S[\mathbf{q}+\varepsilon\Delta\mathbf{q},t_1,t_2]-S[\mathbf{q},t_1,t_2]$, is

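$$\delta S=\sum_{i=1}^{N}\varepsilon\left(\Delta q_i(t_2)\frac{\partial L(t_2)}{\partial\dot{q}_i}-\Delta q_i(t_1)\frac{\partial L(t_1)}{\partial\dot{q}_i}\right)+\sum_{i=1}^{N}\varepsilon\int_{t_1}^{t_2}dt\,\Delta q_i\left(\frac{\partial L}{\partial q_i}-\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i}\right).\tag{CM.1.11}$$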

If you have any questions about this result, now is a good time to go back and re-do the derivation of the previous Section.

We can now define a symmetry. A symmetry of the action is any change (mapping) of the time variable $t$ and/or the coordinates $\mathbf{q}$ that leaves the action unchanged. The symmetry mapping takes our time variable $t$ to some new one $t'$, and it also takes our coordinate functions $\mathbf{q}(t)$ to some new ones $\mathbf{q}'(t')$ (note the prime in two places). In particular, we are interested in continuous symmetries. These symmetries are described by (at least) one continuous parameter $\varepsilon$. For example, consider a translation $x\to x+\varepsilon$. The translation is a symmetry mapping that changes our coordinate $x$ to a shifted one.

In particular, we are interested in infinitesimal symmetries of the action. They will lead to conserved quantities. To get an infinitesimal symmetry, we just take the limit that the continuous parameter describing our symmetry ε is small. Then, we can work to first order in the changes that it causes, as we did when expanding our action functional in the last Section.

Let's now write down our symmetry mapping, examine its effect on the action, and see what comes out (it will be Noether's theorem). A general symmetry mapping can change the time coordinate, say it has the effect of

$$t\to t'=t+\varepsilon\Delta t(t),\tag{CM.1.12}$$

where ε is an infinitesimal constant and Δt(t) can in general be a function of t. The coordinate function 𝐪(t) can change in two ways, one from the 𝐪 part (they can become new functions) and one from the t part (the argument changes from the aforementioned t mapping):

$$\mathbf{q}(t)\to\mathbf{q}'(t')=\mathbf{q}(t)+\varepsilon\Delta_T\mathbf{q}(t),\tag{CM.1.13}$$

where again $\Delta_T\mathbf{q}(t)$ is a function. This $\Delta_T\mathbf{q}(t)$ is the total change in the generalized coordinates, the sum of the two changes that can happen to $\mathbf{q}$. For later, we will need to specify what those individual parts of the change in $\mathbf{q}$ are. We can find them by evaluating the new coordinate function $\mathbf{q}'(t')$:

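$$\mathbf{q}'(t')\approx\mathbf{q}'(t)+\varepsilon\Delta t(t)\dot{\mathbf{q}}'(t)\approx\mathbf{q}(t)+\varepsilon\Delta\mathbf{q}(t)+\varepsilon\Delta t(t)\dot{\mathbf{q}}(t).\tag{CM.1.14}$$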

We should explain the previous equation. First, we Taylor expanded $\mathbf{q}'(t')$ around the point $t$ to find the $t$ contribution to the total change in the coordinate. Then, we defined the coordinate-only change as $\mathbf{q}'(t)=\mathbf{q}(t)+\varepsilon\Delta\mathbf{q}(t)$. Since the $\dot{\mathbf{q}}'$ term, the contribution of the $t$ symmetry mapping, is already proportional to $\varepsilon$, the prime on the coordinate in that term can be dropped at first order. Therefore, we have found that the coordinate-only contribution to the total change in the generalized coordinates under the symmetry map is

$$\Delta\mathbf{q}(t)=\Delta_T\mathbf{q}(t)-\Delta t(t)\dot{\mathbf{q}}(t).\tag{CM.1.15}$$

Again, at this moment it is unclear why exactly we would need this expression for the coordinate-only change under the symmetry mapping, but it will become clear momentarily.

Now that we know how our time and generalized coordinates change under the symmetry mapping, let's see what happens to the action. Under the symmetry, only the arguments of the action change. That is, we now want to calculate $S[\mathbf{q}',t_1',t_2']$. The new action under the effect of the symmetry is

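$$S[\mathbf{q}',t_1',t_2']=\int_{t_1'}^{t_2'}dt'\,L\big(\mathbf{q}'(t'),\dot{\mathbf{q}}'(t'),t'\big)=\int_{t_1+\varepsilon\Delta t(t_1)}^{t_2+\varepsilon\Delta t(t_2)}dt'\,L\big(\mathbf{q}'(t'),\dot{\mathbf{q}}'(t'),t'\big).\tag{CM.1.16}$$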

We now want to Taylor expand the new action about the old pre-symmetry points. We will do so in two steps. First, we know that we can break the integral up into a sum of three integrals by manipulating the bounds accordingly:

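$$S[\mathbf{q}',t_1',t_2']=\int_{t_2}^{t_2+\varepsilon\Delta t(t_2)}dt'\,L\big(\mathbf{q}'(t'),\dot{\mathbf{q}}'(t'),t'\big)+\int_{t_1}^{t_2}dt'\,L\big(\mathbf{q}'(t'),\dot{\mathbf{q}}'(t'),t'\big)+\int_{t_1+\varepsilon\Delta t(t_1)}^{t_1}dt'\,L\big(\mathbf{q}'(t'),\dot{\mathbf{q}}'(t'),t'\big).$$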

Since $\varepsilon$ is small, we can approximate the first and last integrals by single-term Riemann sums. That is, $\int_{t}^{t+\varepsilon x}dt'\,f(t')\approx\varepsilon x f(t)$. If the $\varepsilon$ part is on the bottom bound, then we first switch the bounds, picking up a minus sign, then approximate. Since these terms will already be proportional to $\varepsilon$, we can drop all of the other primes in them. Therefore, we find

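$$S[\mathbf{q}',t_1',t_2']\approx\varepsilon\Delta t(t_2)L\big(\mathbf{q}(t_2),\dot{\mathbf{q}}(t_2),t_2\big)-\varepsilon\Delta t(t_1)L\big(\mathbf{q}(t_1),\dot{\mathbf{q}}(t_1),t_1\big)+\int_{t_1}^{t_2}dt'\,L\big(\mathbf{q}'(t'),\dot{\mathbf{q}}'(t'),t'\big).$$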

We have one integral remaining. Upon changing the dummy variable (it is integrated over, so we can just rename it) from $t'$ to $t$, we see that this final integral is equal to $S[\mathbf{q}',t_1,t_2]$, which is just the action evaluated using the coordinate-only change of the symmetry, $\mathbf{q}'(t)=\mathbf{q}(t)+\varepsilon\Delta\mathbf{q}(t)$. Now it should be clear why Equation CM.1.15 was necessary. To evaluate the final integral of our symmetry-changed action, we have to evaluate the variation of the action under the coordinate-only change that our symmetry causes. To do so, we must use Equation CM.1.11, where the $\Delta\mathbf{q}(t)$ there is equal to the coordinate-only change, which we have conveniently also called $\Delta\mathbf{q}(t)$. The full change in our action under a symmetry is then

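$$\begin{aligned}S[\mathbf{q}',t_1',t_2']-S[\mathbf{q},t_1,t_2]={}&\varepsilon\Delta t(t_2)L\big(\mathbf{q}(t_2),\dot{\mathbf{q}}(t_2),t_2\big)-\varepsilon\Delta t(t_1)L\big(\mathbf{q}(t_1),\dot{\mathbf{q}}(t_1),t_1\big)\\&+\sum_{i=1}^{N}\varepsilon\left(\Delta q_i(t_2)\frac{\partial L(t_2)}{\partial\dot{q}_i}-\Delta q_i(t_1)\frac{\partial L(t_1)}{\partial\dot{q}_i}\right)+\sum_{i=1}^{N}\varepsilon\int_{t_1}^{t_2}dt\,\Delta q_i\left(\frac{\partial L}{\partial q_i}-\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i}\right).\end{aligned}\tag{CM.1.17}$$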

Equation CM.1.17 holds for any generalized coordinate functions $\mathbf{q}$ used as our comparison point for evaluating the action's change under the symmetry mapping. Now, we will specialize to the case where $\mathbf{q}$ is the classical path, the solution to the Euler-Lagrange equations (in terminology that you may know, we are now specializing to the "on-shell" behavior, which just means the Euler-Lagrange equations hold). Then, the final term of Equation CM.1.17 is zero. We can also use Equation CM.1.15 to plug in for the $\Delta q_i$ in terms of the total change in the coordinate functions $\Delta_T q_i$ and the change in the time $\Delta t$. Our final answer is that the variation of the action $\delta S=S[\mathbf{q}',t_1',t_2']-S[\mathbf{q},t_1,t_2]$ under the symmetry, evaluated on the solution to the equations of motion, is

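$$\begin{aligned}\delta S={}&\varepsilon\Delta t(t_2)\left(L\big(\mathbf{q}(t_2),\dot{\mathbf{q}}(t_2),t_2\big)-\sum_{i=1}^{N}\dot{q}_i(t_2)\frac{\partial L(t_2)}{\partial\dot{q}_i}\right)-\varepsilon\Delta t(t_1)\left(L\big(\mathbf{q}(t_1),\dot{\mathbf{q}}(t_1),t_1\big)-\sum_{i=1}^{N}\dot{q}_i(t_1)\frac{\partial L(t_1)}{\partial\dot{q}_i}\right)\\&+\varepsilon\sum_{i=1}^{N}\Delta_T q_i(t_2)\frac{\partial L(t_2)}{\partial\dot{q}_i}-\varepsilon\sum_{i=1}^{N}\Delta_T q_i(t_1)\frac{\partial L(t_1)}{\partial\dot{q}_i}.\end{aligned}\tag{CM.1.18}$$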

This equation holds for any t1 and t2. When the symmetry leaves the action invariant, δS=0, so that the quantity

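$$Q(t)=\Delta t(t)\left(L\big(\mathbf{q}(t),\dot{\mathbf{q}}(t),t\big)-\sum_{i=1}^{N}\dot{q}_i(t)\frac{\partial L(t)}{\partial\dot{q}_i}\right)+\sum_{i=1}^{N}\Delta_T q_i(t)\frac{\partial L(t)}{\partial\dot{q}_i}\tag{CM.1.19}$$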

is conserved along the classical path (it is the same for any time, Q(t)=Q, or dQ/dt=0). We have just proved Noether's theorem, which states that a symmetry of the action has a corresponding conserved quantity along the path that is the solution to that action's Euler-Lagrange equations.

CM.1.2.2 Example Applications of Noether's Theorem

We can use Noether's theorem to derive the conservation laws of the previous Chapter. First, we need to write down the Lagrangian for a system of N particles. We simply use the kinetic energy, from Equation CM.0.25, and the potential energy, from Equation CM.0.33:

$$L=T-V=\sum_{i}\frac{m_i\dot{r}_i^2}{2}-\sum_{i}V_i(\mathbf{r}_i)-\frac{1}{2}\sum_{i,j}V_{ij}(r_{ij}),\tag{CM.1.20}$$

where our $3N$ generalized coordinates are the individual coordinates of each particle, $q_i(t)=r_n^{(k)}(t)$ for some $i$ (generalized coordinate index), $n$ (particle index), and $k$ ($x$, $y$, or $z$ Cartesian coordinate index), and $r_{ij}$ is the magnitude of the separation vector between particles $i$ and $j$.

Momentum Conservation. For momentum conservation, consider the symmetry of translation, $r_n^{(k)}\to r_n^{(k)}+\varepsilon$ for a constant $\varepsilon$, for all particles $n$, but for a specific coordinate $k$. For example, we could translate all coordinates in the $x$ direction. The time does not change in this symmetry, so the total change in the generalized coordinate is either $\varepsilon$ (if that generalized coordinate is along the translation direction) or zero (if it is perpendicular to the translation direction): $\varepsilon\Delta_T r_i^{(k)}=\varepsilon$. Suppose the external potentials $V_i(\mathbf{r}_i)$ all do not depend on the $k$ components of the particle positions, i.e. $\partial V_i/\partial r_i^{(k)}=0$. Since the velocity of a particle is unchanged by a translation, and since translations do not change the separation vectors between particles, the Lagrangian is then unchanged by this symmetry. Since the Lagrangian is unchanged and the time does not change, the action is unchanged under the symmetry of translation. Therefore, we have a conserved quantity. From Equation CM.1.19, we find that

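$$\sum_{i=1}^{3N}\Delta_T q_i\frac{\partial L}{\partial\dot{q}_i}=\sum_{i=1}^{N}\frac{\partial L}{\partial\dot{r}_i^{(k)}}=\sum_{i=1}^{N}m_i\dot{r}_i^{(k)}=P^{(k)}\tag{CM.1.21}$$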

is the conserved quantity, where we have evaluated $\partial L/\partial\dot{r}_i^{(k)}$ to find that it is just the $k$ component of the momentum of particle $i$. Therefore, the conserved quantity is the total momentum in the $k$ direction. So, Noether's theorem says that if the action is invariant under the symmetry of translation in the $k$ direction, the total momentum in the $k$ direction is a constant. The condition for having this symmetry was that $\partial V_i/\partial r_i^{(k)}=0$ for every particle. Because the external force is minus the gradient of the external potential, this condition means that every external force has zero $k$ component, so the total external force in the $k$ direction is zero, as we found in the previous Chapter.
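The momentum result can be illustrated with a short simulation. This is a minimal sketch under stated assumptions: two particles on a line, a hypothetical pairwise potential $k_{\text{pair}}(x_1-x_2)^2/2$ that depends only on the separation, an optional external potential $k_{\text{ext}}x_1^2/2$ that breaks translation symmetry, and a velocity-Verlet integrator (a choice made here for simplicity, not something the derivation requires).

```python
def simulate(k_pair, k_ext, steps=2000, dt=1e-3):
    """Integrate two particles on a line and return (p_start, p_end).

    Pairwise potential: k_pair * (x1 - x2)^2 / 2 (depends only on separation).
    External potential on particle 1: k_ext * x1^2 / 2 (breaks translation
    symmetry when k_ext is nonzero). All values are hypothetical.
    """
    m = [1.0, 2.0]
    x = [0.0, 1.5]
    v = [0.3, -0.1]

    def forces(pos):
        f_pair = -k_pair * (pos[0] - pos[1])   # force on particle 1 from particle 2
        return [f_pair - k_ext * pos[0], -f_pair]

    p_start = sum(mi * vi for mi, vi in zip(m, v))
    f = forces(x)
    for _ in range(steps):
        # Velocity Verlet: half kick, drift, recompute forces, half kick.
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, m)]
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        f = forces(x)
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, m)]
    p_end = sum(mi * vi for mi, vi in zip(m, v))
    return p_start, p_end


# Translation-symmetric case: only the pairwise potential is present.
p0, p1 = simulate(k_pair=4.0, k_ext=0.0)
# Symmetry broken by an external potential that depends on x1.
b0, b1 = simulate(k_pair=4.0, k_ext=1.0)
print("symmetric:", p0, p1)
print("broken:   ", b0, b1)
```

In the symmetric run the total momentum should stay fixed up to rounding, while in the broken run it should drift, matching the condition $\partial V_i/\partial r_i^{(k)}=0$ stated above.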

Angular Momentum Conservation. For angular momentum to be conserved, we must examine rotational symmetry. Without loss of generality, we can call the axis that we are rotating about the $z$ axis. The time does not change when we do a rotation. For an infinitesimal rotation of angle $\varepsilon$ about the $z$ axis, any vector $\mathbf{r}=(x,y,z)$ transforms as

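$$\mathbf{r}\to\mathbf{r}'=\begin{pmatrix}\cos\varepsilon&-\sin\varepsilon&0\\\sin\varepsilon&\cos\varepsilon&0\\0&0&1\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix}=\begin{pmatrix}x\cos\varepsilon-y\sin\varepsilon\\x\sin\varepsilon+y\cos\varepsilon\\z\end{pmatrix}\approx\begin{pmatrix}x-\varepsilon y\\y+\varepsilon x\\z\end{pmatrix}.\tag{CM.1.22}$$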

To find the change in the vector 𝐫, we subtract it from the rotated vector, giving

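$$\varepsilon\Delta_T\mathbf{r}=\mathbf{r}'-\mathbf{r}=\begin{pmatrix}-\varepsilon y\\\varepsilon x\\0\end{pmatrix}=\varepsilon\,\hat{\mathbf{z}}\times\mathbf{r},\tag{CM.1.23}$$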

where we have recognized the cross product between the $z$ unit vector and the vector $\mathbf{r}$. Now, we know that rotations do not change the lengths of vectors. To see this property mathematically, we can compute the length of $\mathbf{r}'$ directly, to first order in $\varepsilon$:

$$r'^2\approx r^2+2\varepsilon\,\mathbf{r}\cdot(\hat{\mathbf{z}}\times\mathbf{r})=r^2,\tag{CM.1.24}$$

where the ε term is zero because the cross product is perpendicular to both vectors in it (thus is perpendicular to 𝐫 and has zero dot product with it).

Since the lengths of vectors are invariant under rotations, the kinetic (which depends on the magnitude of the velocity) and pairwise potential (which depends on the magnitudes of the separations between particles) energies are necessarily invariant under a rotation. If we can find the conditions under which the external potential is invariant under rotations, then we will know when a Lagrangian is invariant under rotation, meaning the action has rotational symmetry. Consider a single potential, V(x,y,z). Under a rotation, it changes to

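$$V(x,y,z)\to V(x-\varepsilon y,y+\varepsilon x,z)\approx V(x,y,z)+\varepsilon\left(-y\frac{\partial V}{\partial x}+x\frac{\partial V}{\partial y}\right).\tag{CM.1.25}$$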

We can again notice the $z$ component of a cross product, so that the change in the potential is proportional to

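$$\left(-y\frac{\partial V}{\partial x}+x\frac{\partial V}{\partial y}\right)=\hat{\mathbf{z}}\cdot(\mathbf{r}\times\nabla V),\tag{CM.1.26}$$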

which is just the negative of the z component of an external torque. If we want the external potential energy to be invariant, then all z components of all external torques must be zero, or the total external torque in the z direction must be zero. Then, the action has rotational symmetry, leading to the conserved quantity

$$\sum_{i=1}^{3N}\Delta_T q_i\frac{\partial L}{\partial\dot{q}_i}=\sum_{i=1}^{N}\left(-y_i p_i^{(x)}+x_i p_i^{(y)}\right)=L^{(z)},\tag{CM.1.27}$$

the z component of the angular momentum.

As a final point, note that any potential of the form $V(x^2+y^2,z)$ is rotationally invariant, since $x^2+y^2$ is the squared length of a vector in the $xy$ plane and $z$ does not change for rotations about the $z$ axis. Therefore, we can easily spot rotationally symmetric actions by finding external potential energies of this form.
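A quick numerical comparison makes the role of the potential's form concrete. The sketch below assumes a single particle of unit mass in the $xy$ plane, with two hypothetical potentials: $V=(x^2+y^2)^2/4$, which has the rotationally invariant form above, and $V=x^2/2$, which does not. The integrator (velocity Verlet) and the initial conditions are illustrative choices.

```python
def lz_drift(force, steps=5000, dt=1e-3):
    """Integrate one particle of unit mass in the plane and return
    |L_z(end) - L_z(start)|, with L_z = x*p_y - y*p_x."""
    m = 1.0
    x, y = 1.0, 0.0
    vx, vy = 0.2, 0.7
    lz_start = m * (x * vy - y * vx)
    fx, fy = force(x, y)
    for _ in range(steps):
        # Velocity Verlet: half kick, drift, recompute force, half kick.
        vx += 0.5 * dt * fx / m
        vy += 0.5 * dt * fy / m
        x += dt * vx
        y += dt * vy
        fx, fy = force(x, y)
        vx += 0.5 * dt * fx / m
        vy += 0.5 * dt * fy / m
    return abs(m * (x * vy - y * vx) - lz_start)


def symmetric_force(x, y):
    """Force from V = (x^2 + y^2)^2 / 4, which depends only on x^2 + y^2."""
    r2 = x * x + y * y
    return (-r2 * x, -r2 * y)


def asymmetric_force(x, y):
    """Force from V = x^2 / 2, which is not invariant under rotations about z."""
    return (-x, 0.0)


drift_symmetric = lz_drift(symmetric_force)
drift_asymmetric = lz_drift(asymmetric_force)
print(drift_symmetric, drift_asymmetric)
```

The first drift should be at the level of rounding error and the second should be of order one, consistent with $L^{(z)}$ being conserved only when the external torque about $z$ vanishes.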

Energy Conservation. Consider now a translation in time. For this time translation, we will take $t\to t'=t+\varepsilon$, so that $\Delta t(t)=1$. We will also choose $\Delta_T\mathbf{q}(t)=0$. We are making a non-trivial choice here. The $t$ transformation contributes $\varepsilon\dot{\mathbf{q}}(t)$ to the total transformation of the coordinates. So, by Equation CM.1.15, we are choosing the coordinate-only change to be $\varepsilon\Delta\mathbf{q}(t)=-\varepsilon\dot{\mathbf{q}}(t)$, so that the total, combined symmetry transformation of the generalized coordinates is zero. How does the action change under this symmetry? We can plug in:

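$$S[\mathbf{q}',t_1',t_2']=\int_{t_1+\varepsilon}^{t_2+\varepsilon}dt'\,L\big(\mathbf{q}'(t'),\dot{\mathbf{q}}'(t'),t'\big)=\int_{t_1+\varepsilon}^{t_2+\varepsilon}dt'\,L\big(\mathbf{q}(t'-\varepsilon),\dot{\mathbf{q}}(t'-\varepsilon),t'\big).\tag{CM.1.28}$$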

The last equality follows from $\mathbf{q}'(t')=\mathbf{q}(t)=\mathbf{q}(t'-\varepsilon)$. If we now specialize to the case where the Lagrangian is not an explicit function of time, $\partial L/\partial t=0$, and we change coordinates in the integral from $t'$ to $t=t'-\varepsilon$, then we find

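$$S[\mathbf{q}',t_1',t_2']=\int_{t_1+\varepsilon}^{t_2+\varepsilon}dt'\,L\big(\mathbf{q}(t'-\varepsilon),\dot{\mathbf{q}}(t'-\varepsilon)\big)=\int_{t_1}^{t_2}dt\,L\big(\mathbf{q}(t),\dot{\mathbf{q}}(t)\big)=S[\mathbf{q},t_1,t_2].\tag{CM.1.29}$$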

Therefore, our action is invariant under time translations, so that time translation is a symmetry of the action. The corresponding conserved quantity is

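$$\Delta t(t)\left(L\big(\mathbf{q}(t),\dot{\mathbf{q}}(t)\big)-\sum_{i=1}^{N}\dot{q}_i(t)\frac{\partial L(t)}{\partial\dot{q}_i}\right)=L\big(\mathbf{q}(t),\dot{\mathbf{q}}(t)\big)-\sum_{i=1}^{N}\dot{q}_i(t)\frac{\partial L(t)}{\partial\dot{q}_i}.\tag{CM.1.30}$$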

We are free to change the sign on the conserved quantity. Let's define the Hamiltonian H, a conserved quantity, as

$$H=\sum_{i=1}^{N}\dot{q}_i(t)\left(\frac{\partial L(t)}{\partial\dot{q}_i}\right)-L\big(\mathbf{q}(t),\dot{\mathbf{q}}(t)\big).\tag{CM.1.31}$$

What is $H$? We can use the fact that $\partial L(t)/\partial\dot{r}_i^{(k)}=m_i\dot{r}_i^{(k)}$ to recognize that the first term is just $2T$. Therefore, the Hamiltonian is

$$H=2T-T+V=T+V=E,\tag{CM.1.32}$$

just the total energy as we defined it in the previous Chapter. The total energy is conserved when the Lagrangian is not an explicit function of time. This result matches what we found in the previous Chapter, since here we presupposed that the forces were conservative (giving potential energy V) and we did not deal with any time dependence in the previous Chapter.
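The identity $H=T+V$ can be checked directly from the definition in Equation CM.1.31. The sketch below assumes a hypothetical example that is not worked in the text: a simple pendulum with generalized coordinate $\theta$, $T=m\ell^2\dot{\theta}^2/2$, and $V=-mg\ell\cos\theta$. The parameter values are arbitrary, and the derivative $\partial L/\partial\dot{\theta}$ is taken by a central finite difference.

```python
import math

# Hypothetical pendulum parameters (mass, gravitational acceleration, length).
m, g, ell = 1.3, 9.81, 0.8


def lagrangian(theta, theta_dot):
    """L = T - V for a simple pendulum with angle theta."""
    T = 0.5 * m * ell ** 2 * theta_dot ** 2
    V = -m * g * ell * math.cos(theta)
    return T - V


def hamiltonian(theta, theta_dot, h=1e-4):
    """H = theta_dot * dL/d(theta_dot) - L, following Equation CM.1.31.

    The derivative is a central difference, which is exact up to rounding
    here because L is quadratic in theta_dot.
    """
    dL = (lagrangian(theta, theta_dot + h) - lagrangian(theta, theta_dot - h)) / (2.0 * h)
    return theta_dot * dL - lagrangian(theta, theta_dot)


theta, theta_dot = 0.4, -1.1   # arbitrary state
H = hamiltonian(theta, theta_dot)
E = 0.5 * m * ell ** 2 * theta_dot ** 2 - m * g * ell * math.cos(theta)  # T + V
print(H, E)
```

The agreement relies on $T$ being quadratic in the generalized velocity, which is what makes the first term of $H$ equal to $2T$.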

Optional: Symmetries More Generally. Above, we defined a symmetry as being a mapping that keeps the action the same. The action is not the only quantity that can be left unchanged under a symmetry transformation. The Lagrangian could be left invariant, or even the equations of motion themselves could be unchanged under the symmetry transformation. If we do not want to be as strict as requiring no change, we could allow something like the Lagrangian changing by the total time derivative of a function, $L\to L+dA/dt$. When the Lagrangian changes in this way under a mapping of the coordinates and time, we say that the mapping is a quasi-symmetry of the system. It turns out that quasi-symmetries can also lead to conserved quantities when the equations of motion are also satisfied.

We can take one example to illustrate this point. Consider a Lagrangian $L(\mathbf{q}(t),\dot{\mathbf{q}}(t))$ that does not depend explicitly on time. Then, the symmetry transformation $t\to t+\varepsilon$, $\mathbf{q}(t)\to\mathbf{q}(t+\varepsilon)\approx\mathbf{q}(t)+\varepsilon\dot{\mathbf{q}}(t)$ will be a quasi-symmetry. Note that this symmetry transformation is different from the time translation that we encountered above, because the generalized coordinates change (i.e. we did not choose to cancel the $t$ part of their mapping using a mapping of the coordinates). Treated as a function of time (switching coordinates from $(\mathbf{q},\dot{\mathbf{q}},t)$ to just $t$), the Lagrangian changes as $L(t)\to L(t+\varepsilon)\approx L(t)+\varepsilon\,dL/dt$, so that the difference is a total time derivative. We know what the time derivative of $L$ is using the chain rule (i.e. switching back to the coordinates $(\mathbf{q},\dot{\mathbf{q}},t)$ from just $t$):

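$$\frac{dL}{dt}=\sum_{i=1}^{N}\dot{q}_i(t)\frac{\partial L}{\partial q_i}+\sum_{i=1}^{N}\frac{\partial L}{\partial\dot{q}_i}\frac{d}{dt}\dot{q}_i(t).$$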

Using the Euler-Lagrange equations, Equation CM.1.9, we can rewrite the first term, so that we can simplify using the product rule:

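$$\frac{dL}{dt}=\sum_{i=1}^{N}\dot{q}_i(t)\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i}+\sum_{i=1}^{N}\frac{\partial L}{\partial\dot{q}_i}\frac{d}{dt}\dot{q}_i(t)=\frac{d}{dt}\left(\sum_{i=1}^{N}\dot{q}_i(t)\frac{\partial L}{\partial\dot{q}_i}\right).$$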

Altogether, we can move one of the terms over to the other side to find that the time derivative of some quantity is zero. That quantity is the Hamiltonian H that we have already encountered, so that this quasi-symmetry of the Lagrangian leads to conservation of the Hamiltonian:

$$0=\frac{dH}{dt}=\frac{d}{dt}\left(\sum_{i=1}^{N}\dot{q}_i(t)\frac{\partial L}{\partial\dot{q}_i}-L\right).\tag{CM.1.33}$$

CM.1.3 The Hamiltonian Formulation, Hamilton's Equations, and Poisson Brackets

CM.1.3.1 The Hamiltonian and Hamilton's Equations

The Hamiltonian is an important quantity, and it can be used to formulate classical mechanics like the Lagrangian can. We can find equations of motion that are equivalent to the Euler-Lagrange equations. First, let's write out the definition of the Hamiltonian for a general Lagrangian, not necessarily just one that does not depend explicitly on time:

$$H=\sum_{i=1}^{N}\dot{q}_i(t)\left(\frac{\partial L(t)}{\partial\dot{q}_i}\right)-L\big(\mathbf{q}(t),\dot{\mathbf{q}}(t),t\big).$$

The quantity $\partial L/\partial\dot{q}_i$ has appeared a handful of times now. In Equations CM.1.21 and CM.1.27, we saw that for Cartesian coordinates it was equal to the usual linear momentum associated with coordinate $q_i$. Because of this association, we will now define the generalized momentum $p_i$ for the generalized coordinate $q_i$ as

$$p_i\big(\mathbf{q}(t),\dot{\mathbf{q}}(t),t\big)=\frac{\partial L}{\partial\dot{q}_i}.\tag{CM.1.34}$$

As a derivative of the Lagrangian, it can depend fully on each generalized coordinate, each generalized velocity, and time. Using this definition, we can rewrite the Hamiltonian as

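$$H\big(\mathbf{q}(t),\dot{\mathbf{q}}(t),t\big)=\sum_{i=1}^{N}\dot{q}_i(t)\,p_i\big(\mathbf{q}(t),\dot{\mathbf{q}}(t),t\big)-L\big(\mathbf{q}(t),\dot{\mathbf{q}}(t),t\big).\tag{CM.1.35}$$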

Now we will do a trick that we will make use of again in the future when studying thermodynamics: We will take the differential of $H$ and see a surprising result. Our coordinates for the differential are $\mathbf{q}$, $\dot{\mathbf{q}}$, and $t$, although for now we will leave $dp_i$ unexpanded. Using the product rule, we see that

$$dH=\sum_{i=1}^{N}\left(p_i\,d\dot{q}_i+\dot{q}_i\,dp_i\right)-dL.$$

Using the chain rule on $dL$ expands it in terms of the differentials $dq_i$, $d\dot{q}_i$, and $dt$:

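$$dH=\sum_{i=1}^{N}\left(p_i\,d\dot{q}_i+\dot{q}_i\,dp_i-\frac{\partial L}{\partial q_i}dq_i-\frac{\partial L}{\partial\dot{q}_i}d\dot{q}_i\right)-\frac{\partial L}{\partial t}dt=\sum_{i=1}^{N}\left[\dot{q}_i\,dp_i-\frac{\partial L}{\partial q_i}dq_i+\left(p_i-\frac{\partial L}{\partial\dot{q}_i}\right)d\dot{q}_i\right]-\frac{\partial L}{\partial t}dt.$$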

We can notice that the coefficient of $d\dot{q}_i$ is zero by the definition of the generalized momentum, Equation CM.1.34. Therefore, we find that $dH$ does not depend on $d\dot{q}_i$ at all. Instead, it depends on $dp_i$:

$$dH=\sum_{i=1}^{N}\left(-\frac{\partial L}{\partial q_i}dq_i+\dot{q}_i\,dp_i\right)-\frac{\partial L}{\partial t}dt.\tag{CM.1.36}$$

The differential of a function tells you what variables a function is really a function of. For example, if $f(\mathbf{x})$ is some generic multivariable function, then $df=\sum_i(\partial f/\partial x_i)\,dx_i$. We have therefore found that the Hamiltonian can be thought of as a function of $\mathbf{p}$, replacing $\dot{\mathbf{q}}$ as an independent variable, and that its derivatives can be read off from

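$$dH=\sum_{i=1}^{N}\left(\frac{\partial H}{\partial q_i}dq_i+\frac{\partial H}{\partial p_i}dp_i\right)+\frac{\partial H}{\partial t}dt=\sum_{i=1}^{N}\left(-\frac{\partial L}{\partial q_i}dq_i+\dot{q}_i\,dp_i\right)-\frac{\partial L}{\partial t}dt.$$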

There are some subtleties about when the coordinate change $(\mathbf{q},\dot{\mathbf{q}},t)\to(\mathbf{q},\mathbf{p},t)$ is well-defined and invertible, but we will not cover them here. The coordinate space $(\mathbf{q},\mathbf{p})$ is called the phase space of the system; as we will see, it finds many applications as the natural setting for quantum and statistical mechanics.

When the Euler-Lagrange equations, Equation CM.1.9, hold, we can substitute $\partial L/\partial q_i=dp_i/dt$ to find Hamilton's equations:

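$$\frac{\partial H}{\partial q_i}=-\frac{dp_i}{dt},\tag{CM.1.37}$$

$$\frac{\partial H}{\partial p_i}=\frac{dq_i}{dt},\tag{CM.1.38}$$

$$\frac{\partial H}{\partial t}=-\frac{\partial L}{\partial t}.\tag{CM.1.39}$$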

Hamilton's equations are the equations of motion for classical mechanics in the Hamiltonian formulation. The first two are equivalent to the Euler-Lagrange equations. The Euler-Lagrange equations are second order differential equations, and if there are $3N$ generalized coordinates then there are $3N$ Euler-Lagrange equations. Hamilton's equations are first order differential equations, but there are correspondingly $6N$ of them, where the new equations come from the definition of the $p_i$ (the $p_i$ only appear with first derivatives in Hamilton's equations, but substituting their definition in terms of the $\dot{q}_i$ restores the second order time derivatives).
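Because Hamilton's equations are first order, they can be handed directly to a standard ODE stepper. The following is a minimal sketch, assuming a one-dimensional harmonic oscillator $H=p^2/2m+kq^2/2$ with hypothetical values of $m$ and $k$; the partial derivatives of $H$ are taken by central differences and the time stepping uses classical fourth-order Runge-Kutta, both choices made here only for illustration.

```python
import math

m, k = 2.0, 8.0   # hypothetical mass and spring constant


def H(q, p):
    """Hamiltonian of a one-dimensional harmonic oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def rhs(q, p, h=1e-3):
    """Hamilton's equations: dq/dt = dH/dp, dp/dt = -dH/dq (Eqs. CM.1.37-38),
    with the partial derivatives estimated by central differences."""
    dH_dq = (H(q + h, p) - H(q - h, p)) / (2.0 * h)
    dH_dp = (H(q, p + h) - H(q, p - h)) / (2.0 * h)
    return dH_dp, -dH_dq


def rk4_step(q, p, dt):
    """One classical fourth-order Runge-Kutta step for (q, p)."""
    k1q, k1p = rhs(q, p)
    k2q, k2p = rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = rhs(q + dt * k3q, p + dt * k3p)
    q_new = q + dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
    p_new = p + dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q_new, p_new


q0, p0 = 1.0, 0.5
steps, dt = 5000, 1e-3
q, p = q0, p0
for _ in range(steps):
    q, p = rk4_step(q, p, dt)

# Exact solution of m q'' = -k q for comparison.
omega = math.sqrt(k / m)
t_end = steps * dt
q_exact = q0 * math.cos(omega * t_end) + p0 / (m * omega) * math.sin(omega * t_end)
print(q, q_exact, H(q, p) - H(q0, p0))
```

The numerical trajectory should track the exact solution, and the value of $H$ should stay essentially constant, as expected for a Hamiltonian with no explicit time dependence.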

CM.1.3.2 Poisson Brackets

Consider an arbitrary function A(𝐪(t),𝐩(t),t) on phase space. Let's calculate its total time derivative using the chain rule:

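$$\frac{dA}{dt}=\frac{\partial A}{\partial t}+\sum_{i=1}^{N}\left(\frac{\partial A}{\partial q_i}\dot{q}_i+\frac{\partial A}{\partial p_i}\dot{p}_i\right)=\frac{\partial A}{\partial t}+\sum_{i=1}^{N}\left(\frac{\partial A}{\partial q_i}\frac{\partial H}{\partial p_i}-\frac{\partial A}{\partial p_i}\frac{\partial H}{\partial q_i}\right),\tag{CM.1.40}$$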

where we have used Hamilton's equations to eliminate the time derivatives of the generalized coordinates and the generalized momenta. We have found that the time derivative of A depends on how A explicitly depends on time and some combination of derivatives of A with respect to the phase space coordinates multiplied by derivatives of the Hamiltonian. The appearance of the Hamiltonian, along with the fact that H is conserved when the action is time translation symmetric, tells us that the Hamiltonian is deeply involved in the time evolution of our system. We now define the Poisson Bracket {A,B} of two phase space functions A and B as

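$$\{A,B\}=\sum_{i=1}^{N}\left(\frac{\partial A}{\partial q_i}\frac{\partial B}{\partial p_i}-\frac{\partial B}{\partial q_i}\frac{\partial A}{\partial p_i}\right).\tag{CM.1.41}$$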

Using the Poisson Bracket, we can rewrite Equation CM.1.40 for the total time derivative of a phase space function as

$$\frac{dA}{dt}=\frac{\partial A}{\partial t}+\{A,H\}.\tag{CM.1.42}$$

We can also rewrite Hamilton's equations with the Poisson Bracket. If we take $A(\mathbf{q},\mathbf{p},t)=q_i$ or $p_i$, then it is clear that there is no explicit time dependence. Therefore, $dq_i/dt=\{q_i,H\}$ and $dp_i/dt=\{p_i,H\}$, so that

Hqi={pi,H},(CM.1.43)

Hpi={qi,H}.(CM.1.44)
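
As an illustrative check (the harmonic oscillator Hamiltonian here is an assumed example, not one taken from the text above), consider one degree of freedom with $H = p^2/2m + kq^2/2$. Evaluating the brackets directly from the definition in Equation CM.1.41 gives

$$\{q,H\} = \frac{\partial q}{\partial q}\frac{\partial H}{\partial p} - \frac{\partial H}{\partial q}\frac{\partial q}{\partial p} = \frac{p}{m}, \qquad \{p,H\} = \frac{\partial p}{\partial q}\frac{\partial H}{\partial p} - \frac{\partial H}{\partial q}\frac{\partial p}{\partial p} = -kq,$$

so $\dot{q} = p/m$ and $\dot{p} = -kq$, which combine into the familiar $m\ddot{q} = -kq$.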

If a quantity $A(\mathbf{q},\mathbf{p})$ does not explicitly depend on time, then its time derivative is $dA/dt = \{A,H\}$. Therefore, $A$ is a constant of the motion, a conserved quantity, if it does not explicitly depend on time and if

$$\{A,H\} = 0. \tag{CM.1.45}$$

We now have a simple test for a conserved quantity: check that its Poisson Bracket with the Hamiltonian vanishes and that the quantity does not explicitly depend on time.
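
This test is easy to try numerically. The sketch below estimates the Poisson Bracket by central finite differences and applies it to an assumed example, an isotropic two-dimensional oscillator with $m = k = 1$. The helper `poisson_bracket` and the sample phase-space point are hypothetical choices made for illustration only.

```python
def poisson_bracket(A, B, q, p, h=1e-5):
    """Estimate {A, B} at the phase-space point (q, p) by central differences."""
    def partial(F, use_q, i):
        # Shift only the i-th coordinate (use_q=True) or the i-th momentum.
        q_hi, q_lo, p_hi, p_lo = list(q), list(q), list(p), list(p)
        if use_q:
            q_hi[i] += h
            q_lo[i] -= h
        else:
            p_hi[i] += h
            p_lo[i] -= h
        return (F(q_hi, p_hi) - F(q_lo, p_lo)) / (2.0 * h)

    return sum(
        partial(A, True, i) * partial(B, False, i)
        - partial(B, True, i) * partial(A, False, i)
        for i in range(len(q))
    )

# Assumed example: isotropic 2D oscillator with m = k = 1.
def H(q, p):
    return 0.5 * (p[0] ** 2 + p[1] ** 2) + 0.5 * (q[0] ** 2 + q[1] ** 2)

def L_z(q, p):
    return q[0] * p[1] - q[1] * p[0]

def p_x(q, p):
    return p[0]

q0, p0 = [0.3, -1.2], [0.7, 0.4]
print(poisson_bracket(L_z, H, q0, p0))  # close to 0: L_z is conserved
print(poisson_bracket(p_x, H, q0, p0))  # close to -q0[0]: p_x is not conserved
```

The second bracket reproduces $\dot{p}_x = -\partial H/\partial x$, so the same routine doubles as a check of Hamilton's equations.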

We will close out this section by calculating some Poisson Brackets. First, we will do the all-important $\{q_k,p_j\}$:

$$\{q_k,p_j\} = \sum_{i=1}^{N}\left(\frac{\partial q_k}{\partial q_i}\frac{\partial p_j}{\partial p_i} - \frac{\partial p_j}{\partial q_i}\frac{\partial q_k}{\partial p_i}\right) = \sum_{i=1}^{N}\delta_{ki}\delta_{ji} = \delta_{kj}, \tag{CM.1.46}$$

where the partial derivatives are zero unless the numerator and denominator are the same variable, in which case the derivative is one (hence the use of the Kronecker delta). In addition, the Poisson Bracket of a generalized coordinate with another generalized coordinate, or of a generalized momentum with another generalized momentum, is zero, because every term in the bracket then contains a derivative of a coordinate with respect to a momentum, or of a momentum with respect to a coordinate.

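Next, we will specialize to one particle, so that $q_i = r_i$. The angular momentum of the particle is $\mathbf{L} = \mathbf{r}\times\mathbf{p}$, so that the $x$ component is $L_x = yp_z - zp_y$, the $y$ component is $L_y = zp_x - xp_z$, and the $z$ component is $L_z = xp_y - yp_x$. The Poisson Bracket of $L_x$ with $L_y$ is

$$\{L_x,L_y\} = \sum_{i=x,y,z}\left(\frac{\partial L_x}{\partial r_i}\frac{\partial L_y}{\partial p_i} - \frac{\partial L_y}{\partial r_i}\frac{\partial L_x}{\partial p_i}\right). \tag{CM.1.47}$$

The only terms that contribute are ones from the $i=z$ term of the sum, since $L_x$ does not depend on $x$ or $p_x$ and $L_y$ does not depend on $y$ or $p_y$. We can calculate $\partial L_x/\partial z = -p_y$, $\partial L_y/\partial p_z = -x$, $\partial L_x/\partial p_z = y$, and $\partial L_y/\partial z = p_x$, so that the Poisson Bracket is

$$\{L_x,L_y\} = (-p_y)(-x) - (p_x)(y) = xp_y - yp_x = L_z, \tag{CM.1.48}$$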

just $L_z$. By renaming coordinates (which changes nothing about the system, as long as we keep the coordinate system right-handed; that is, we cyclically permute the labels $x \to y \to z \to x$), we can find the other Poisson Brackets of the angular momenta with each other:

$$\{L_y,L_z\} = L_x, \tag{CM.1.49}$$

$$\{L_z,L_x\} = L_y. \tag{CM.1.50}$$
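
The renaming argument can be cross-checked by evaluating one of these brackets directly, in the same way as Equation CM.1.48. As a short sketch for $\{L_y,L_z\}$: since $L_y$ does not depend on $y$ or $p_y$ and $L_z$ does not depend on $z$ or $p_z$, only the $i=x$ term of the sum survives, and

$$\{L_y,L_z\} = \frac{\partial L_y}{\partial x}\frac{\partial L_z}{\partial p_x} - \frac{\partial L_z}{\partial x}\frac{\partial L_y}{\partial p_x} = (-p_z)(-y) - (p_y)(z) = yp_z - zp_y = L_x,$$

in agreement with Equation CM.1.49.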

Optional: Canonical Transformations and Other Properties of Poisson Brackets.

We picked the generalized coordinates $\mathbf{q}$ and momenta $\mathbf{p}$. How do we know that our choice does not affect the equations of motion, either the Euler-Lagrange or Hamilton's equations? We can show that the Poisson Bracket does not change if we use different coordinates $\mathbf{Q}(\mathbf{q},\mathbf{p})$ and their momenta $\mathbf{P}(\mathbf{q},\mathbf{p})$ as long as these coordinates satisfy the Poisson Brackets $\{Q_i,P_k\} = \delta_{ik}$ and $\{Q_i,Q_k\} = \{P_i,P_k\} = 0$ (all Poisson Brackets here are calculated using the old coordinates $\mathbf{q}$ and $\mathbf{p}$). Such a transformation of coordinates that preserves the Poisson Brackets between the generalized coordinates and generalized momenta is known as a canonical transformation.

We can think of the old coordinates as multivariable functions of the new ones: 𝐪(𝐐,𝐏) and 𝐩(𝐐,𝐏). First, the Poisson Bracket of two quantities A and B is

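$$\{A,B\} = \sum_{i=1}^{N}\left(\frac{\partial A}{\partial q_i}\frac{\partial B}{\partial p_i} - \frac{\partial B}{\partial q_i}\frac{\partial A}{\partial p_i}\right).$$

Using the chain rule on the $p_i$ derivative of $B$ in the first term and the $q_i$ derivative of $B$ in the second term gives

$$\{A,B\} = \sum_{i,j}\left[\frac{\partial A}{\partial q_i}\left(\frac{\partial B}{\partial Q_j}\frac{\partial Q_j}{\partial p_i} + \frac{\partial B}{\partial P_j}\frac{\partial P_j}{\partial p_i}\right) - \left(\frac{\partial B}{\partial Q_j}\frac{\partial Q_j}{\partial q_i} + \frac{\partial B}{\partial P_j}\frac{\partial P_j}{\partial q_i}\right)\frac{\partial A}{\partial p_i}\right].$$

We can group together the terms with the derivatives of $B$ with respect to the same new coordinates. For example, $\partial B/\partial Q_j$ multiplies derivatives of $A$ and $Q_j$ with respect to $q_i$ and $p_i$. The combination is exactly the Poisson Bracket of $A$ with $Q_j$, when we incorporate the sum over $i$:

$$\{A,B\} = \sum_j\left(\frac{\partial B}{\partial Q_j}\{A,Q_j\} + \frac{\partial B}{\partial P_j}\{A,P_j\}\right). \tag{CM.1.51}$$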

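We now need to evaluate $\{A,Q_j\}$ and $\{A,P_j\}$. To do so, we can actually use Equation CM.1.51 that we just derived. Using that formula, we can calculate $\{Q_j,A\}$:

$$\{Q_j,A\} = \sum_i\left(\frac{\partial A}{\partial Q_i}\{Q_j,Q_i\} + \frac{\partial A}{\partial P_i}\{Q_j,P_i\}\right) = \frac{\partial A}{\partial P_j}, \tag{CM.1.52}$$

where we used the fact that the definition of a canonical transformation guarantees $\{Q_j,Q_i\} = 0$ and $\{Q_j,P_i\} = \delta_{ji}$. Since exchanging the two arguments of a Poisson Bracket flips its sign, we now know that $\{A,Q_j\} = -\{Q_j,A\} = -\partial A/\partial P_j$. Similarly, evaluating $\{P_j,A\}$ gives

$$\{A,P_j\} = -\{P_j,A\} = -\sum_i\left(\frac{\partial A}{\partial Q_i}\{P_j,Q_i\} + \frac{\partial A}{\partial P_i}\{P_j,P_i\}\right) = \frac{\partial A}{\partial Q_j}.$$

Finally, we can put these results into Equation CM.1.51 to see that

$$\{A,B\} = \sum_j\left(-\frac{\partial A}{\partial P_j}\frac{\partial B}{\partial Q_j} + \frac{\partial B}{\partial P_j}\frac{\partial A}{\partial Q_j}\right) = \{A,B\}_{Q,P}, \tag{CM.1.53}$$

that is, the Poisson Bracket evaluated using either set of coordinates and momenta gives the same answer.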
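
The invariance can be spot-checked numerically. The sketch below assumes one degree of freedom and the transformation $Q = \arctan(q/p)$, $P = (q^2 + p^2)/2$, a standard example chosen here for illustration. It first confirms $\{Q,P\} = 1$ in the old variables, then compares $\{A,B\}$ computed in both sets of variables for two arbitrary test functions. All names and the sample point are hypothetical.

```python
import math

def bracket(A, B, x, y, h=1e-5):
    """Central-difference {A, B} for one degree of freedom.

    x plays the role of the coordinate and y of its momentum.
    """
    dA_dx = (A(x + h, y) - A(x - h, y)) / (2.0 * h)
    dA_dy = (A(x, y + h) - A(x, y - h)) / (2.0 * h)
    dB_dx = (B(x + h, y) - B(x - h, y)) / (2.0 * h)
    dB_dy = (B(x, y + h) - B(x, y - h)) / (2.0 * h)
    return dA_dx * dB_dy - dB_dx * dA_dy

def to_new(q, p):
    """Assumed transformation: Q = atan2(q, p), P = (q^2 + p^2) / 2."""
    return math.atan2(q, p), 0.5 * (q * q + p * p)

def to_old(Q, P):
    """Inverse map: q = sqrt(2P) sin Q, p = sqrt(2P) cos Q."""
    r = math.sqrt(2.0 * P)
    return r * math.sin(Q), r * math.cos(Q)

# Arbitrary test functions, written in the old variables.
A = lambda q, p: q * q * p
B = lambda q, p: q + p ** 3

# The same functions expressed through the new variables.
A_new = lambda Q, P: A(*to_old(Q, P))
B_new = lambda Q, P: B(*to_old(Q, P))

q0, p0 = 0.8, 0.5
Q0, P0 = to_new(q0, p0)

canon = bracket(lambda q, p: to_new(q, p)[0],
                lambda q, p: to_new(q, p)[1], q0, p0)
old = bracket(A, B, q0, p0)          # {A, B} using (q, p)
new = bracket(A_new, B_new, Q0, P0)  # {A, B} using (Q, P)
print(canon, old, new)
```

For these test functions the bracket in the old variables is $6qp^3 - q^2$, so both `old` and `new` should agree with that value at the sample point up to finite-difference error.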

Preservation under canonical transformation is only one property of Poisson Brackets. We could also prove a few more which are mathematically interesting and can lead to further theoretical insights; since it just takes algebra and calculus to do so, we will not do it here. We will just state the results. Poisson Brackets anti-commute, that is, they are anti-symmetric in the inputs:

$$\{A,B\} = -\{B,A\}. \tag{CM.1.54}$$

They are bilinear, which means that they pass through constants and addition:

$$\{aA+bB,C\} = a\{A,C\} + b\{B,C\}. \tag{CM.1.55}$$

They satisfy the product rule, or Leibniz's rule:

$$\{AB,C\} = \{A,C\}B + A\{B,C\}. \tag{CM.1.56}$$

And finally, they satisfy the Jacobi identity:

$$\{A,\{B,C\}\} + \{B,\{C,A\}\} + \{C,\{A,B\}\} = 0. \tag{CM.1.57}$$
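
As a small illustration of how these rules are used together (a sketch, not a proof), the product rule and the fundamental bracket $\{q,p\} = 1$ from Equation CM.1.46 give, for one degree of freedom,

$$\{q^2,p\} = \{q,p\}\,q + q\,\{q,p\} = 2q,$$

which agrees with the direct evaluation $\frac{\partial (q^2)}{\partial q}\frac{\partial p}{\partial p} - \frac{\partial p}{\partial q}\frac{\partial (q^2)}{\partial p} = 2q$. Similarly, applying the Jacobi identity to the Hamiltonian and two quantities $A$ and $B$ that satisfy $\{A,H\} = \{B,H\} = 0$ gives

$$\{H,\{A,B\}\} = -\{A,\{B,H\}\} - \{B,\{H,A\}\} = 0,$$

so the bracket of two conserved quantities with no explicit time dependence is itself conserved. For example, if $L_x$ and $L_y$ are conserved, then so is $L_z = \{L_x,L_y\}$.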

CM.1.4 References

  1. H. Goldstein. Classical Mechanics. Addison-Wesley series in advanced physics. Addison-Wesley Press, 1950.
  2. S.T. Thornton and J.B. Marion. Classical Dynamics of Particles and Systems. Brooks/Cole, 2004.
  3. L.D. Landau and E.M. Lifshitz. Mechanics. Elsevier Science, 1982.
  4. Richard Phillips Feynman and Laurie M Brown. Feynman's thesis: a new approach to quantum theory. World Scientific, 2005.
  5. D. Bailin and A. Love. Introduction to Gauge Field Theory, Revised Edition. Taylor and Francis, 1993.
