# Introduction

# Abstract and Keywords

The chapter explains how the Principle of Least Action yields a unique answer to a physical problem irrespective of the frame of reference. The motivation arises from d’Alembert’s rousing words: “To someone who could grasp the Universe from a unified standpoint the entire creation would appear as a unique truth and necessity.” The requirement is that one “algorithm” can cope with all the specificity, variety, and complexity across the whole of physics. That one algorithm could ever be up to the task is made plausible by use of an allegory involving a King, the princess, and some suitors. Finally, the link with Least Action is made. Note that the terms extremal, objectivity, observer, and viewpoint are explained.

*Keywords:*
extremal, algorithm, objectivity, reference frame, observer, d’Alembert, necessity, unique truth, viewpoint, standpoint

It would be wonderful if there was one principle, simple to state, that could account for every process in the physical universe. But there *is* such a principle, a surprisingly well-kept secret, that accounts for *almost* every physical process. It is a principle that is more powerful than Newton’s ‘$\mathbf{\text{F}}=m\mathbf{\text{a}}$’, and a principle that doesn’t have energy conservation as a requirement in every scenario. We know that Newtonian Mechanics must be replaced when speeds are very high, or the masses are tiny, or huge - but this ‘new’ principle still applies in these extreme regimes. How can one principle explain so much? The clue comes from the deep wisdom of the eighteenth-century French *philosophe*, Jean le Rond d’Alembert:^{1}

- “L’univers, pour qui saurait l’embrasser d’un seul point de vue,
- ne serait, s’il est permis de le dire,
- qu’un fait unique et une grande vérité.”

- (“If one could grasp the whole Universe from one viewpoint,
- it would appear, if it is permitted to say this,
- as a unique fact and a great truth.”)

No one knows whether d’Alembert’s beautiful claim is correct, but one thing is certain: if we cannot find one universal viewpoint then we will not arrive at one universal truth. For our viewpoint to be universal it is not a question of us all looking at the view from the same hilltop; rather, it is a requirement that all viewpoints are equivalent, and that there is just *one* universal rule or law or algorithm that solves the problem. Our new principle achieves this - but it seems incredible that one simple ‘algorithm’ could cope with all the specificity, variety, and complexity across the whole of physics. To make this plausible, we consider the following fable.

There once was a King and he set a fiendish puzzle for prospective suitors who wanted to marry his daughter, the beautiful princess. The King had constructed a maze and the successful suitor had to provide the princess with instructions for collecting treasure from a casket. The young suitors had a few hours to look at a map of the maze and prepare their instructions. The King then looked at their answers and quickly whittled away the number of competitors to just two. These two suitors, Alfredo and Bruno, had very different approaches to the puzzle. Alfredo provided detailed instructions for every route, advising the princess to sight the tall column, visible from a distance over the hedges, to let her nose guide her to the fragrance of the frangipani tree, and also to listen out for the sound of bells chiming in the bell-tower, and water splashing at the fountain. Bruno came forward with just a tiny scrap of paper on which it said, “Wherever you may find yourself, turn left at the next intersection. Eventually you will reach the casket.” (The King had assured the contestants that there were no disconnected ‘islands’ within the maze.)

The King, a veritable sage, awarded the hand of his daughter in marriage to this second suitor.

The amazing thing about this suitor’s instructions is that they are very simple to state and they are *universal* - they apply to any maze (although the maze must satisfy certain geometrical restrictions, for example, it cannot contain disconnected islands). There is also another
curious attribute of Bruno’s algorithm - it is *local*. The princess need only ever look as far ahead as the very next step (while always keeping an eye open for the casket). Although Alfredo’s instructions may be broken down into steps the method is not local in the true sense (there are references to distant features - the sound of the fountain, the smell of the frangipani, the sight and sound of the bells in the bell-tower).
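To make the locality of Bruno’s rule concrete, here is a minimal sketch in Python. It rests on assumptions that are not in the fable: the maze is encoded as a small grid of characters, and “turn left at the next intersection” is read as the familiar left-hand rule (try left, then straight ahead, then right, then back). The grid, the function name `follow_left_wall`, and the step limit are all hypothetical choices made for illustration.

```python
# Clockwise compass headings as (row step, column step): north, east, south, west.
DIRS = [(-1, 0), (0, 1), (1, 0), (0, -1)]

# Hypothetical maze: '#' is hedge, ' ' is path, 'S' is the start, 'C' is the casket.
# The outer border is solid hedge and there are no disconnected islands.
MAZE = [
    "#########",
    "#S  #  C#",
    "# # # # #",
    "# #   # #",
    "#########",
]

def follow_left_wall(maze, max_steps=10_000):
    """Walk the maze using only the four neighbouring cells at each step."""
    grid = [list(row) for row in maze]
    r, c = next((i, j) for i, row in enumerate(grid)
                for j, ch in enumerate(row) if ch == "S")
    heading = 0  # start facing north (an arbitrary choice)
    path = [(r, c)]
    for _ in range(max_steps):
        if grid[r][c] == "C":
            return path  # casket reached
        # Preference order relative to the current heading:
        # left, straight on, right, and finally turn back.
        for turn in (-1, 0, 1, 2):
            d = (heading + turn) % 4
            nr, nc = r + DIRS[d][0], c + DIRS[d][1]
            if grid[nr][nc] != "#":
                r, c, heading = nr, nc, d
                path.append((r, c))
                break
        else:
            return None  # walled in on all four sides
    return None  # step limit reached without finding the casket

path = follow_left_wall(MAZE)
print(path)
```

The walker never consults anything beyond its immediate neighbours, which is the sense in which the rule is local. As in the fable, the guarantee depends on the maze having no disconnected islands; with islands, this sketch can circle indefinitely until the step limit stops it.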

This is only a story but it demonstrates some essential points. In our new principle the method is truly local, and this is never the case for Newtonian Mechanics, even when a path may be broken down into lots of tiny incremental steps. But most astounding of all we shall find that the ensuing equations are *invariant*, taking exactly the same form, no matter what the scenario or what coordinates are used. It is not even necessary that the setting is time-independent, or that the components are passive (so the maze could writhe and undulate with time, and the princess could affect the maze, say, by trimming the hedges as she passed by). The reason for this invariance is that we have at last found something absolute: it is not a universal timepiece, yardstick, or reference frame, for there are none, it is a *principle*, and one that applies across almost every area of physics. We introduce it by way of a brief historical aside.

One of the most awe-inspiring developments in physics has been the shift from Newton’s to Einstein’s view of gravity. In Newton’s Theory of Universal Gravitation, gravity is a force acting between bodies, however near or far. In Einstein’s Theory of Gravitation (the Theory of General Relativity), the force is completely dispensed with. It is replaced by a patchwork of reference frames, sufficiently small, but seamlessly joined together, and, instead of responding to a force, the orbiting body now responds to *geometry* - the ‘curvature’ of ‘space’. This is often represented heuristically by the image of, say, the Earth resting on a large two-dimensional surface, such as a trampoline, distorting this surface, and thereby affecting the trajectory of nearby small bodies, such as the Moon. These two theories - Newton’s and Einstein’s - are utterly different, and yet, amazingly, the experimental differences, for example, the predictions of the Moon’s orbit, are practically nil. It turns out that Einstein’s approach involves much more complicated calculations and, as we have just stated, barely any practical advantage - so why use it? The answer is that it has deeper explanatory power, it is applicable over a much greater range of problems, and it is philosophically more sound.

We have been talking just of gravitation, but the principle that we introduce explains not only gravitation but all kinds of problems in the physical sciences (statics and dynamics, optics, electricity and magnetism, quantum mechanics, physical chemistry, statistical mechanics, astronomy, materials science, hydrodynamics, quantum electrodynamics (QED), and so on); it also has deeper explanatory power, is applicable over a much greater range of problems, and is philosophically more sound. This principle is the Principle of Stationary Action (the PSA). It can be stated as:

The Principle of Stationary Action

The physical system seeks out the ‘flattest’ region of ‘space’.

This is equivalent to choosing the ‘straightest’ possible path, which (usually) translates as the path of *least* ‘distance’. One more thing, whether considering the ‘flatness of space’ or the ‘straightness of paths’, or the ‘least distance’, only the ‘space’ *nearby* - that is to say, locally, - needs to be inspected.

The subtitle of this book refers to the Principle of *Least* Action (PLA), yet the principle just given is the Principle of *Stationary* Action (PSA). Why the difference? ‘Stationary’ is a mathematical term meaning ‘at a flat point of ‘space’’, but whether that flat point implies a least path requires further investigation; the PSA is therefore the more general principle, and incorporates the PLA. However, we shall find that the more stringent condition, the PLA, *is* the one we need, and later on we’ll switch to calling our principle the Principle of Least Action. We’ll explain this in a later chapter (Section 6.6, Chapter 6).

Einstein’s Theory of Special Relativity, that preceded his theory of General Relativity, starts with two postulates: 1) the laws of physics take the same form in every reference frame,^{2} 2) the speed of light in a vacuum is a constant. These two postulates are strikingly different: the first postulate (the Principle of Relativity) is philosophical in character - Einstein coached us into realizing that physics just couldn’t be practised unless postulate 1) applied. On the other hand, postulate 2) appears to be empirical - the speed of light is constant, yes, but perhaps, in another Universe, it might have been variable? Similarly, in the case
of the Principle of Stationary Action, the Principle has two postulates of very different natures. The first postulate, 1), we have already described - that the system ‘space’ is as ‘flat’ as possible, locally. This appears reasonable but rather abstract and philosophical; it sounds more like geometry than physics, and we need to know - ‘flat’ with respect to what? This is where the second postulate comes in, the one that contains the physical input. Postulate 2) states that what is actually being flattened is a certain specific physical quantity - ‘*action*’. This quantity has dimensions of *energy* × *time*, or *linear momentum* × *distance* or *angular momentum* × *angle*, and so on. In one of its first incarnations, ‘action’ was given as ‘$m\times v\times ds$’, where *m* is the mass of a ‘free’ particle, *v* is its speed, and *ds* is a small distance along the particle’s path. As we have to do with a postulate, we cannot justify the choice by deduction from even more elemental principles. Nevertheless, ‘action’ does seem like a worthy candidate for a telling physical quantity - it is a scalar (a pure magnitude, having no direction - therefore more likely to be an invariant), it ‘spans the physical space’ (nothing crucial is missed out), and it does so in the simplest way possible (*mvds* is postulated rather than, say, ${m}^{2}{v}^{3}{d}^{4}s/d{t}^{4}$).
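As a compact restatement of the quantities just listed, the early form of action and its dimensions can be sketched as follows. The symbol $A$, the integral sign (summing the small contributions $m\,v\,ds$ along the path), and the use of SI units are notational assumptions, not taken from the text above.

$$
A = \int m\,v\,ds, \qquad [A] = \mathrm{kg}\cdot\mathrm{m\,s^{-1}}\cdot\mathrm{m} = \mathrm{kg\,m^{2}\,s^{-1}}
$$

$$
[\text{energy}\times\text{time}] = \mathrm{kg\,m^{2}\,s^{-2}}\cdot\mathrm{s} = \mathrm{kg\,m^{2}\,s^{-1}}, \qquad [\text{angular momentum}\times\text{angle}] = \mathrm{kg\,m^{2}\,s^{-1}}\cdot 1 = \mathrm{kg\,m^{2}\,s^{-1}}
$$

All three products therefore share the same dimensions, as the text states; the angle contributes no dimension of its own.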

D’Alembert’s “one viewpoint” implies objectivity, and this is difficult to arrive at in everyday life where prejudice abounds. For example: we barely notice the reaction-force against the soles of our feet, or on our bottoms, that is present almost every minute of our lives (Einstein teaches us that we are thereby not in ‘free-fall’, and so do not serve as a ‘natural’ frame of reference); on the other hand, in the rapidly rotating ‘gravitron’ at the funfair, we feel pinned as if by a great weight but have no sensation of our spinning motion (we merely notice that we can barely nod or move our arms). When revisiting a park that we knew as a child, we find that it resembles a pocket handkerchief rather than a vast estate - is the slight increase in the height of our eyes the source of this change? No, it arises because we have undergone an enormous (non-local) translation in time, during which our brain has totally altered. We watch the water sloshing about in a neighbour’s swimming pool - perhaps they have installed a wave-generating machine? Upon closer inspection we find no machine, but realize that ‘a giant hand’ - an Earth tremor - is gently rocking the pool: therefore our initial assumption of an isolated system, defined by the edges of the pool, is wrong.

What helps us to achieve objectivity in physics (as opposed to everyday life) is the fact that we are bound by the strictures of mathematical tests. The PSA is centred on a mathematical test - a ‘test of the flatness of ‘space’’ - which is remarkable for its ability to winnow away the distracting observer-dependent features and so arrive at the true invariant laws. It succeeds in this because it involves an ‘extremal’ feature of the mathematical landscape (something like ‘shortest route between Peshawar and Kabul’), and these features are unique in that they don’t depend upon the type of map or even the units used (the route is shortest whether we use a Mercator or a Peters projection, and whether we measure in feet or in metres). Before the test can be applied, we have to define what we mean by ‘space’. We follow the historical development, and return to our discussion of Newton.
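The claim about units can be checked with a small Python sketch. The route names and lengths below are invented purely for illustration, and the sketch covers only the change of units, not the change of map projection.

```python
# Hypothetical candidate routes with made-up lengths in metres.
routes_m = {
    "northern pass": 310_000.0,
    "river valley": 285_000.0,
    "southern loop": 402_000.0,
}

METRES_TO_FEET = 3.28084  # approximate conversion factor

def shortest(lengths):
    """Return the name of the route with the smallest length."""
    return min(lengths, key=lengths.get)

def rescale(lengths, factor):
    """Express every length in a new unit by multiplying by a positive factor."""
    return {name: value * factor for name, value in lengths.items()}

best_in_metres = shortest(routes_m)
best_in_feet = shortest(rescale(routes_m, METRES_TO_FEET))
print(best_in_metres, best_in_feet)
```

Multiplying every length by the same positive factor changes all the numbers but not their ordering, so the same route is picked out in either unit. This is the sense in which an extremal feature is independent of the description used.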

You most likely know Newton’s Laws of Motion^{3} but we want now to give a different perspective of them, emphasizing the philosophical assumptions. An implicit premise of Newton’s Mechanics was the outstanding advance: ‘space’ and ‘physics’ are totally separate from each other. ‘Physics’ means forces, masses, and masses in motion. ‘Space’, following Descartes’ invention of the coordinate system, means the three everyday space dimensions (commonly designated *x*, *y*, and *z*), and the time, *t*. All of *x*, *y*, *z*, and *t* are assumed independent of each other, and each goes on to infinity. Gone are the sixteenth century’s tendencies, empathies, abhorrences, and vortices; and Newton’s ‘space’ is empty, not full like Descartes’. (It is a void, which Newton does not abhor.)

Next after ‘space’ come particles - bodies with no internal structure but having an intrinsic property, mass. By Law (Newton’s First), each ‘free’ particle is either at rest, or moves at constant speed and in a constant direction.

Finally, forces, **F**, are introduced, such as an attractive force between one particle and another. A force has one effect and one effect only - it causes a particle to accelerate. This is where mass plays its role, it determines how big the acceleration shall be, for a given force. (Apart from this, mass is inert - it doesn’t depend on when or where the particle is, or on its state of motion.) All this is asserted in the Second Law, $\mathbf{\text{F}}=m\mathbf{\text{a}}$.^{4} Another outstanding hypothesis was that for composite bodies (bodies made from many particles), or indeed for any complicated arrangement of particles, the net outcome could be obtained by ‘summing’ over the
influence of each particle considered on its own.^{5} Thus was born the idea of the ‘rigid body’, an extended body made up from separate particles, but which could itself be treated as if it were one single particle, with all its mass concentrated at one point.

325 years on from Newton’s “*Principia*”^{6} it is hard to remain sufficiently impressed. As the philosopher, Schopenhauer, said, a theory passes from being rejected as ridiculous to being accepted but taken to be obvious.^{7} Consider Newton’s use of ‘acceleration’. It was already known, from Galileo’s Principle of Relativity, that uniform motion is relative to the observer. Newton turned this around: all non-uniform, that is to say, *accelerated*, motion is *not* relative to the observer, it *can* be known absolutely. Acceleration with respect to what? Answer, with respect to ‘space’. But as the accelerations are absolute, then Newton’s ‘space’ is absolute. So we have arrived back at Newton’s wonderful abstraction, an infinite ‘space’, an inert and absolute background to physical happenings.

With the PSA, every tenet of Newton’s Mechanics is challenged: the absolutes of Newton are avoided as every measure - whether it be a position, a speed, a direction, a time, and so on - is always defined with reference to something *within* the given system; action-at-a-distance does not occur - a global picture is built up, piece by piece, but by antennae which are sensitive only to conditions locally; the axes of ‘space’ no longer extend to infinity, are not necessarily independent of each other, and not necessarily independent of the masses within; and Newton’s modular approach, building up complexity from more and more complicated arrangements of particles, each taken singly, is replaced by a holistic ‘systems’ approach.

Let’s explain this ‘systems’ approach by analogy. Bertrand Russell (philosopher and mathematician) quipped that the activities of mankind amounted to the redistribution of matter within $\pm 0.2\mathrm{\%}$ of the Earth, at its surface (this was before the era of space travel). To check Russell’s claim, we could exhaustively track the motion of every single person, throughout recorded history, and note what masses they were carrying and where they deposited them; or, we could estimate the matter-content of cities built, crops grown, monuments constructed, bodies buried, and so on. In the second, ‘systems’, approach, we have lost the simplicity of basic elements (a person, their movements, what they are carrying) and instead have more abstract concepts relating to the whole system (cities, roads, pyramids, etc.). Some totally new possibilities arise (‘the deceased’) that were not catered for (!) in the first approach. We end up with a static tally of the mass distribution.

Another, more dynamic, example is given by the description of a football match. In the modular description we have only ‘players’ and ‘a football’; in the other approach, the ‘*systems*-view’, we have ‘defence-position’, ‘attack’, ‘tackling’, ‘dribbling the ball’, ‘goal-kick’, and so on. One counter-intuitive aspect of this systems-view is that we appear to have lost that quintessential feature of motion - its directionality. However, we soon realize that it is not lost but embedded in the whole-system structures (for example, ‘goal-kick’ has no absolute direction (say, 30° West) yet it conveys all the directional information required, and makes reference only to features within the system - the goal posts).

Returning now to the PSA, the method is as follows: (i) Instead of particles we have *individual components* of the system. These are chosen in a system-specific way. (They can be billiard balls, atoms, planets, lever arm, pendulum, capacitor, and so on, as the given problem demands); (ii) Instead of forces there are ‘scalar structure functions’, what in a previous life we have called the *energy functions* (the kinetic energy, and the potential energy); (iii) we identify all the independent ‘motions’ that the system can undergo. These ‘motions’ are the physical changes that happen naturally and that characterize the given system - what in a later life we shall call the ‘degrees of freedom’. (For example, the planet orbits, the lever arm rotates, swings swing, and roundabouts turn.) (iv) We come finally to the application of a principle, a principle that requires an exploration of the ‘space’ in which the physical problem occurs. Knowing all the ‘motions’, we then choose an *alternative* set of ‘motions’ that *could* occur. These motions are hypothetical - we have hypothesized them - however we are not free to hypothesize anything we like: the ‘motions’ must all be in the same given ‘space’ (each system has its own ‘system-space’), occur in the given time-window, and they must be ‘nearby’. Now these ‘motions’ imply certain amounts of kinetic energy and potential energy consumed or generated in the given time. From these energies we compute a certain quantity - the total *action* - used up
in the given time. In short, we determine the total hypothetical action for this choice of hypothetical motions. We then continue the exploration and consider another choice of hypothetical motions and again determine the consequential hypothetical action. And so on. The principle then asserts that of all choices for hypothetical motions, *the actual motions are those which make the change-in-action-between-choices come out to zero*. More evocatively, the system finely adjusts itself, via the actual motions (acting in concert but instant by instant taking their marching orders from the scalar structure functions) in just such a way that the action used is least. That these subtly different versions - the italicized one and the evocative one - are the same will emerge during the course of this book.
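The procedure just described can be tried out numerically. The Python sketch below is illustrative only and rests on several assumptions not made in the text above: the system is a ball thrown straight up in uniform gravity and caught one second later at the same height; the action is taken to be the sum over small time steps of (kinetic energy minus potential energy) × time step; and the hypothetical ‘nearby’ motions are the actual motion plus a small sine-shaped bump of size `eps` that vanishes at both ends of the time-window. All function names are hypothetical.

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretised action: sum of (kinetic - potential) * dt over each time step."""
    total = 0.0
    for x0, x1 in zip(path, path[1:]):
        kinetic = 0.5 * m * ((x1 - x0) / dt) ** 2
        potential = m * g * 0.5 * (x0 + x1)  # evaluated at the step's mean height
        total += (kinetic - potential) * dt
    return total

def trial_path(eps, n=200, T=1.0, g=9.81):
    """Actual motion plus a hypothetical variation of size eps (zero at both ends)."""
    dt = T / n
    path = []
    for i in range(n + 1):
        t = i * dt
        actual = 0.5 * g * t * (T - t)              # thrown up at t=0, back at t=T
        bump = eps * math.sin(math.pi * t / T)      # the hypothetical 'nearby' motion
        path.append(actual + bump)
    return path, dt

def action_for(eps):
    """Total hypothetical action for one choice of hypothetical motion."""
    path, dt = trial_path(eps)
    return action(path, dt)

for eps in (-0.1, -0.01, 0.0, 0.01, 0.1):
    print(f"eps = {eps:+.2f}  action = {action_for(eps):.6f}")

# First-order change in action between neighbouring choices, centred on eps = 0.
h = 1e-3
print("change per unit eps near the actual motion:",
      (action_for(h) - action_for(-h)) / (2 * h))
```

In this example the change in action between neighbouring choices vanishes (to rounding error) at `eps = 0`, and the action there is also the smallest of those sampled, so both versions of the principle agree. That agreement holds for this simple system and this family of variations; the sketch does not show that it holds in general.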

Be reassured: these ideas are new and many and abstract; there is no way they can be understood in one go. It is useful to collect them together in one place, but not possible to convey all the nuances in a single paragraph. For example, we shall later on discover that sometimes the ‘space’ exploration is made explicitly by us (in the method known as the Principle of Virtual Work) and sometimes the mathematics takes care of it (in the Variational or Lagrangian Mechanics). Also, there is sometimes an elision made between the ‘motions’ as ‘degrees of freedom’ and the ‘motions’ as hypothetical ‘variations’.

Did we write ‘hypothetical’? Yes, this is the *pièce de résistance*: the ‘system-space’ is a *virtual* abstract space, and this is what finally enables us to achieve the required objectivity (the actual physical space could have this, that, or the other observer-bias, whereas the virtual abstract space is neutral).

Here is a summary of the main virtues of the PSA.

(i) It does the job.

(ii) As forces play no part in the method then ‘forces-of-constraint’ also play no part. Incredible but true.

(iii) Better understanding of physics. We can now have ‘cat’ and ‘mouse’ instead of only ‘particles’ and ‘particle-particle interactions’. But a ‘cat’ is more than the sum of its ‘particles’.

(iv) No hard and fast distinction between ‘active’ and ‘passive’ components. (Just as a river carves out the river-bed, and the river-bed determines the path of the river, so the ‘curvature of space’ affects the paths of bodies, and moving bodies affect the ‘curvature of space’.)

(v) Philosophically superior: local (and this is all we ever detect experimentally); no pre-existing empty ‘space’ (just take the world as it is and then examine it^{8}); the system is what’s important.

(vi) Global View. Even when ‘space’ is flat locally, ‘curvature’ can still arise globally - by patching together smaller regions with the requirement that there is no ‘puckering at the seams’.

(vii) The PSA gives prominence to energy and to the whole system. Kinetic energy is shown to be a more fundamental and primitive concept than force. Also the dichotomy between kinetic energy and potential energy is explained.

(viii) The PSA reveals a deep connection between symmetry and conserved properties. (Newton’s Mechanics does not lead naturally to any conservation laws except for one, the conservation of linear momentum.)

(ix) In Newtonian Mechanics no attempt is made to show how robust the solution is (how it changes following small changes in the starting conditions) or to give ball-park estimates. The PSA, via Hamilton’s Mechanics, does address these questions.

(x) Amazing unity of approach across almost the whole of physics. (Almost? The exceptions will be discussed in due course.)

Apart from the fact that the PSA is the keystone of physics, and therefore an indispensable tool for the professional engineer or physical scientist, there are two other reasons why we would like to awaken an appreciation of it, even a non-mathematical appreciation. The first is a pragmatic reason. We all know, roughly speaking, what space, time, mechanics, quantum mechanics, matter, and energy are about. These ideas have passed into the public domain. It would be inefficient to start from scratch in our science classes, and not even incorporate advances made during the seventeenth and eighteenth centuries. Somehow the PSA has got missed out - it is time to correct this. The second reason is aesthetic. The beauty of physics does not reside only in the beauty of the night sky, a rainbow, or a sunset. It resides even more in the interior logic, the principles which reach across vast areas of the physical world, unifying them into a self-consistent whole, and with the most economical set of starting premises. (One could call it the ‘Aha’ feeling.)

Alice (from Lewis Carroll’s *Alice’s Adventures in Wonderland*) tried so hard to get through the door in order to see the exquisitely beautiful garden beyond. Consider this garden as a metaphor for physics: it’s true that without mathematics we shall never be able to wander freely around this garden, but it would be wonderful indeed if we could just be lifted up to peer at it through the keyhole.

## Notes:

(^{1})
D’Alembert, J le R, *Discours préliminaire de l’encyclopédie*, 1751.

(^{2})
When we say ‘reference frame’ we shall automatically mean a ‘*valid* reference frame’.

(^{4})
(But it could all have been so much more complicated; the force could have left the motion unchanged but caused the mass to swell, or it could have caused an acceleration not in line with **F**, or caused a third-order change in the position, and so on.)

(^{5})
(Again, it could have been more complicated, the force might have been cast anew for, say, each trio of particles.)

(^{6})
Newton, Isaac, *Philosophiae naturalis principia mathematica* (*The Mathematical Principles of Natural Philosophy*), 1687.

(^{7})
(There is then the third phase, when the theory is again rejected.)

(^{8})
This is more correct, especially when we remember that all our observations really have been carried out in the presence of large gravitating masses. Even where experiments are carried out in remote regions, the results still need to be brought back to Earth - in our present state of evolution.