The Trenches of Discovery: how science is done

[The following is a guest post from Bjoern Malte Schaefer (see his last guest post here). Bjoern is still one of the curators of the Cosmology Question of the Week blog, which is also still worth checking out. Enjoy!]

Introduction

The aim of theoretical physics is a mathematical description of the processes taking place in Nature. Science is empirical, meaning that its predictions need to comply with experimental results, but other criteria are very important, even if not decisive: theoreticians look for elegance, consistency and simplicity in their descriptions; they aim for abstraction and unification; and they look for a reduction of the laws of Nature to a few fundamental principles and, at the same time, for analogies in the description of different phenomena. The aim of this article is to show how these aspects are realized in classical physics, although many of the arguments apply to relativistic physics as well - or only show their true meaning in that context. I do apologize for some of the mathematics, and I promise to keep it as compact as possible.

Formulation of physical laws with differential equations

Physical laws are formulated with differential equations, which relate the rate of change of a quantity to other quantities, for instance the rate of change of position with time to the velocity. This rate of change is called a derivative. Solving such an equation usually starts from an initial value of the quantity under consideration and yields its value at each later instant. The fact that the laws of physics are formulated with differential equations is very advantageous, because they separate the problem of the evolution of a physical system from the choice of initial conditions for that evolution. Using differential equations for deriving, e.g., the motion of planets shifts the question to what forces the planets are subjected to and how they move under these forces. The orbits of the planets then follow naturally as predictions, without fixing the orbits themselves a priori, as for example Johannes Kepler might have thought.

Classical mechanics

Let’s discuss a straightforward example: the motion of a body under the action of a force in Newtonian dynamics. Newton formulated an equation of motion for this problem, which stipulates that the acceleration of a body is equal to the force acting on it, divided by the mass of the body. If, in addition, the acceleration is defined as the rate of change of velocity with time and the velocity as the rate of change of position with time, we get the usual form of Newton’s equation of motion: the second derivative of position with respect to time is equal to the force divided by the mass, d²x/dt² = F/m. This is the prototype of a differential equation. It does not fix the trajectory of the body (the position of the object as a function of time) but leaves that open as a solution to the differential equation under specified initial conditions (the position and the velocity of the object at the starting time).
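As a rough illustration of how the equation of motion and the initial conditions play separate roles, here is a minimal Python sketch. The function name `trajectory`, the constant force and all numerical values are assumptions chosen for this example and are not part of the argument above.

```python
def trajectory(force, mass, x0, v0, dt, n_steps):
    """Integrate d2x/dt2 = force / mass for a constant force.

    Returns the list of positions at times 0, dt, 2*dt, ...
    The update used here is exact for a constant acceleration.
    """
    a = force / mass
    x, v = x0, v0
    positions = [x]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt
        v = v + a * dt
        positions.append(x)
    return positions


# Same law (a 1 kg body under a constant force of -9.81 N),
# two different choices of initial conditions.
dropped = trajectory(force=-9.81, mass=1.0, x0=10.0, v0=0.0, dt=0.01, n_steps=100)
thrown = trajectory(force=-9.81, mass=1.0, x0=10.0, v0=5.0, dt=0.01, n_steps=100)

print(dropped[-1])  # position after 1 s when released at rest
print(thrown[-1])   # position after 1 s when thrown upwards
```

The differential equation is the same in both calls; only the starting position and velocity differ, and with them the resulting trajectory.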

Already in Newton’s equation of motion there are two very interesting details. Firstly, the solution to the equation without any force is found to be one with a constant velocity, i.e. with a linearly increasing coordinate, which is known as inertial motion. And secondly, the equation of motion is a second-order differential equation, because of the double time derivative. This has the important consequence that the equation keeps its form if time runs backwards instead of forwards: the time-reversed motion is again a valid solution.
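Both details can be checked numerically. The sketch below is only an illustration: it uses the velocity-Verlet scheme (a choice made here because the scheme is itself time-reversible), and the function names, the spring-like force and the numbers are hypothetical.

```python
def verlet_step(x, v, force, mass, dt):
    """One velocity-Verlet step for d2x/dt2 = force(x) / mass."""
    a_old = force(x) / mass
    x_new = x + v * dt + 0.5 * a_old * dt * dt
    a_new = force(x_new) / mass
    v_new = v + 0.5 * (a_old + a_new) * dt
    return x_new, v_new


def evolve(x, v, force, mass, dt, n_steps):
    """Apply n_steps integration steps and return the final (x, v)."""
    for _ in range(n_steps):
        x, v = verlet_step(x, v, force, mass, dt)
    return x, v


def spring(x):
    return -4.0 * x  # hypothetical restoring force, spring constant 4


def no_force(x):
    return 0.0


# Inertial motion: without a force the velocity stays constant.
x_free, v_free = evolve(1.0, 2.0, no_force, 1.0, 0.01, 500)

# Time reversal: run forwards, flip the velocity, run the same number of steps.
x_fwd, v_fwd = evolve(1.0, 0.5, spring, 1.0, 0.01, 500)
x_back, v_back = evolve(x_fwd, -v_fwd, spring, 1.0, 0.01, 500)

print(x_free, v_free)  # position has grown linearly, velocity is unchanged
print(x_back, v_back)  # close to the starting point, with reversed velocity
```

The retracing works here because the force depends only on position; a velocity-dependent force such as friction would single out a direction of time.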

Classical gravity

A generalisation of this idea is the classical description of gravity. In a very similar way, the gravitational potential is linked through a second-order differential equation to the source of that field, i.e. a central mass. How would this work by analogy? In the mechanics example above, the source of motion was given by the force and both were linked by the second derivatives. Here, the second derivatives of the potential are linked to the sourcing mass, again by a second-order differential equation, which in this context is called the Poisson equation, named after the mathematician Siméon Denis Poisson.

Would this idea work in any number of dimensions? It turns out that one needs at least three dimensions to have a field linked to the source by a second-order differential equation, if the field is required to vanish at large distances from the source and if the field is symmetric around its source, which are all very sensible requirements. Surely the gravitational field generated by a point mass would be the same in every direction and the attracting effect of the gravitational field should decrease with increasing distance.
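The dimensional argument can be sketched in a few lines. Assuming a radially symmetric potential $\Phi(r)$ outside a point source in $n$ spatial dimensions (the symbols $\Phi$, $r$, $n$ and the integration constants $C$ and $\Phi_0$ are notation introduced here for illustration), the field equation in empty space reduces to:

```latex
\frac{1}{r^{n-1}}\frac{\mathrm{d}}{\mathrm{d}r}
\left(r^{n-1}\frac{\mathrm{d}\Phi}{\mathrm{d}r}\right) = 0
\quad\Rightarrow\quad
\frac{\mathrm{d}\Phi}{\mathrm{d}r} = \frac{C}{r^{n-1}}
\quad\Rightarrow\quad
\Phi(r) =
\begin{cases}
\Phi_0 + C\,r & n = 1,\\[0.5ex]
\Phi_0 + C\,\ln r & n = 2,\\[0.5ex]
\Phi_0 - \dfrac{C}{(n-2)\,r^{n-2}} & n \geq 3.
\end{cases}
```

In one and two dimensions the potential grows without bound at large $r$, so only for $n \geq 3$ can one set $\Phi_0 = 0$ and have the potential vanish far away from the source. For $n = 3$ this gives the familiar $1/r$ form.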

Is there an analogy to the forward-backward symmetry of Newton’s equation of motion? The field equation is invariant if one replaces the coordinates by their mirror images. Nature therefore does not distinguish between left and right in fields, nor between forwards and backwards in motion. These properties are called invariances, in particular the invariance of the laws under time reversal and parity inversion. And finally, there’s an analogy to inertial motion, because no gravitational field is sourced in the absence of a massive object. The mass is the origin of the field in the same way as force is the reason for motion.

Variational principles

Joseph Louis Lagrange discovered a new way of formulating physical laws, which is very attractive from a physical point of view and which is easily generalizable to all fields of physics. How it works can be seen in a very nice analogy, which is Fermat’s principle for the propagation of light in optics. Clearly, light rays follow paths that are determined by the laws of refraction, and computing a light path using Snell’s law is very similar to using Newton’s equation of motion: At each instant one computes the rate of change of direction, which is dictated by the refractive index of the medium in the same way as the rate of change of velocity is given by the force (divided by mass). But Fermat formulated this very differently: Among all possible paths leading from the initial to the final point light chooses the fastest path. This formulation sounds weird and immediately poses a number of questions: How would the light know? Does it try out these paths? How would the light ray compare different paths? It is apparent that Fermat’s formulation is conceptually not easy to understand but one can show that it leads to exactly the right equation of motion for the light ray.
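That the "fastest path" formulation reproduces the law of refraction can be checked with a small numerical experiment. The following is a minimal sketch under stated assumptions: a flat interface at y = 0 between two media, hypothetical refractive indices `n1` and `n2`, units in which the speed of light in vacuum is 1, and a simple ternary search standing in for "trying out" all paths.

```python
import math


def travel_time(x, a, b, d, n1, n2):
    """Travel time (in units where c = 1) from (0, a) to (d, -b),
    crossing the interface y = 0 at the point (x, 0)."""
    return n1 * math.hypot(x, a) + n2 * math.hypot(d - x, b)


def fastest_crossing(a, b, d, n1, n2, iterations=100):
    """Ternary search for the crossing point with the shortest travel time.

    This works because the travel time is a convex function of x.
    """
    lo, hi = 0.0, d
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if travel_time(m1, a, b, d, n1, n2) < travel_time(m2, a, b, d, n1, n2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


a, b, d = 1.0, 1.0, 1.0  # geometry in arbitrary units
n1, n2 = 1.0, 1.5        # roughly air above glass

x = fastest_crossing(a, b, d, n1, n2)
sin1 = x / math.hypot(x, a)            # sine of the angle of incidence
sin2 = (d - x) / math.hypot(d - x, b)  # sine of the angle of refraction

print(n1 * sin1, n2 * sin2)  # the two sides of Snell's law
```

The search only ever compares travel times, yet the two printed numbers agree to within the numerical accuracy of the search, which is the statement of Snell's law.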

Lagrange’s idea was to construct an abstract function in analogy to the travel time of the light ray, and to measure a quantity called action. Starting from his action he could find a physically correct equation of motion by constructing a path that minimizes the action, in complete analogy to Fermat’s principle. Lagrange found out that if one builds his abstract function from squares of first derivatives of the dynamical quantities, they automatically lead to second-order equations of motion, so the basic parity and time-reversal symmetries are fulfilled. In addition he discovered that if he based his abstract function on quantities that are identical for all observers, he could incorporate a relativity principle and make a true statement about a physical system, independent of the choice of an observer.
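For a single coordinate this can be written out explicitly. The following is the standard textbook form, given as a sketch; the symbols $q$ (coordinate), $L$ (Lagrange function), $S$ (action), $V$ (potential energy) and $m$ (mass) are notation introduced here.

```latex
\begin{aligned}
S[q] &= \int_{t_1}^{t_2} L(q,\dot q)\,\mathrm{d}t,
\qquad
L(q,\dot q) = \frac{m}{2}\,\dot q^{2} - V(q),\\[1ex]
\delta S = 0
\quad&\Rightarrow\quad
\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot q}
- \frac{\partial L}{\partial q} = 0
\quad\Rightarrow\quad
m\,\ddot q = -\frac{\mathrm{d}V}{\mathrm{d}q}.
\end{aligned}
```

The square of the first derivative $\dot q$ in $L$ turns into the second derivative $\ddot q$ in the equation of motion, and because $\dot q^{2}$ does not change under $t \to -t$, the resulting equation is time-reversal invariant. With the force identified as $F = -\mathrm{d}V/\mathrm{d}q$, this is Newton's equation of motion.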

Universality

The formulation of the laws of physics with differential equations is very attractive because it allows one to describe the different solutions that might exist for a physical problem. For the motion of the planets around the Sun there is a universal mathematical description, and the planetary orbits themselves only differ by the choice of initial conditions for the differential equation. There is, however, yet another feature present in the equation of motion or the field equation, which is related to Lagrange’s abstract description.

Clearly, any description of a process must remain valid if the length-, time- and mass-scales involved are changed. This feature is referred to as universality or mechanical similarity, because it allows one to map solutions to the equation of motion onto others. For instance, the orbit of Mercury would be a scaled version of the orbit of Neptune: the orbits can be mapped onto each other by a redefinition of the length- and time-scales involved. This was considered to be an essential property of the laws of physics, because it implies that problems fall into certain universality classes and that there is no limit of validity of the solution. Coming back to the problem of the motion of objects in gravitational fields, one finds Kepler’s third law, which states that whatever the orbit of a planet, the ratio of the third power of the orbital radius to the square of the orbital time is always the same constant. It is completely sufficient to solve the problem of one orbiting planet in principle: the orbits of other planets do not even require solving the differential equation again (with different initial conditions), since all possible solutions follow from a simple scaling operation. A more comical example is that of astronauts walking on the surface of the Moon, where gravity is much weaker: their movements appear to be in slow motion, but speeding up the playback would show them moving perfectly normally.
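The scaling can be tried out on actual numbers. The sketch below is an illustration only: the function names are hypothetical, and the orbital sizes and periods are rounded, approximate values in astronomical units and years, so agreement is expected only at the percent level.

```python
def kepler_ratio(a_au, period_yr):
    """Cube of the orbital size divided by the square of the orbital period."""
    return a_au ** 3 / period_yr ** 2


def scaled_period(period_ref, a_ref, a_new):
    """Map a known orbit onto another one by rescaling lengths and times.

    For a 1/r potential, stretching lengths by a factor s stretches
    times by a factor s ** 1.5.
    """
    s = a_new / a_ref
    return period_ref * s ** 1.5


# Approximate semi-major axes (AU) and orbital periods (years), rounded.
mercury = (0.3871, 0.2408)
neptune = (30.07, 164.8)

print(kepler_ratio(*mercury))  # close to 1 in these units
print(kepler_ratio(*neptune))  # close to 1 in these units
print(scaled_period(mercury[1], mercury[0], neptune[0]))  # close to Neptune's period
```

No differential equation is solved in the last line: Neptune's period follows from Mercury's by rescaling alone.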

Relativity

The last question is of course what the true meaning of Lagrange’s abstract function should be. It is very successful in deriving physically viable equations of motion and field equations, but before the advent of relativity it was unclear how it should be interpreted. It turns out that the Lagrange function of moving objects is the proper time and that the Lagrange function of the gravitational field is the spacetime curvature. Objects move along trajectories that extremize the proper time elapsing on a clock moving with that object, and the gravitational field is determined as the minimal curvature compatible with a source of the field. These interpretations require that spacetime has at least four dimensions (instead of three), and they lead to viable second-order differential equations respecting time-reversal and parity invariance. Both quantities, proper time and curvature, are invariant under changes of the reference frame, so relativity is respected, and they are invariant under the choice of new coordinates - this is in fact the expression of universality. And one has learned one additional thing, which must appear beautiful to everybody: the laws of Nature are geometric. They describe a very complicated, position-dependent geometry whose properties are defined through differential equations. The lines of extremal proper time are straight in spacetime in the absence of a force and, considering gravitational fields in cosmology, the expansion of space proceeds at a constant rate in an empty universe; both are reflections of inertial motion. But there is one new phenomenon: gravitational fields do not vanish at large distances as Newton thought. Rather, they start increasing at distances above 10^25 meters, where gravity becomes repulsive under the action of the cosmological constant, and this feature breaks scale invariance.
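In formulas, the two statements about proper time and curvature are usually written as the following actions. This is a sketch in units where the speed of light is 1; sign and normalization conventions differ between textbooks, and the symbols ($\tau$ for proper time, $R$ for the curvature scalar, $g$ for the determinant of the metric, $\Lambda$ for the cosmological constant, $G$ for Newton's constant) are notation introduced here.

```latex
S_\mathrm{particle} = -m \int \mathrm{d}\tau,
\qquad
S_\mathrm{gravity} = \frac{1}{16\pi G} \int \left(R - 2\Lambda\right)\sqrt{-g}\;\mathrm{d}^{4}x
```

Varying the first action with respect to the trajectory gives the equation of motion of a freely falling object, and varying the second with respect to the metric gives the gravitational field equations, with $\Lambda$ responsible for the large-distance behaviour.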

Summary

The formulation of the laws of Nature led physicists to a geometric description of physical processes in the form of differential equations, and variational principles are a very elegant way of formulating the origin of equations of motion and field equations. The true meaning of the variational principles only became apparent with the advent of relativity. It is even the case that other forces, like electromagnetism and the strong and the weak nuclear forces, have an analogous description, involving an abstract geometry of their own. Finally, it was realised by Richard Feynman that the way in which Nature realizes variational principles is through the wave-particle duality of quantum mechanics - but that is really the topic of another article.

If I could recreate the way research results are quality checked and revealed to the world, I would probably change almost all of what is currently done. I think the isolated scientific paper is a product of the 20th century, being imposed on the 21st purely because of inertia. A better solution would be to give a "living paper" to each general research project an individual researcher has. This living paper can then be updated as results change/improve. In such a system I would probably have ~5 living papers so far in my career, instead of ~20 old-style papers. Or, even better, there could be a large wiki edited, annotated, moderated and discussed by the science community as knowledge is gained.

Even if you wish to keep "the paper" as how science is presented, I think that the journal system, while invaluable in the 20th century, also exists in the 21st century only due to inertia. Pre-print servers like the arXiv are already taking care of the distribution of the papers, and the peer review, which is responsible for the quality check side of things, can (and might?) be organised collectively by the community on top of that. But why should we stick with peer review anyway? Could there be a better way?

Firstly, let me stress, peer review is definitely an incredibly effective way to progress knowledge accurately and rapidly. The best ideas are the ones that withstand scrutiny. The better an idea is, the more scrutiny it can withstand. Therefore, holding every idea up to as much scrutiny as possible is the best way to proceed. However, by peer review I simply mean criticism and discussion by the rest of the scientific community. I think the way peer review is currently done, at least what people normally mean by "peer review", is very nearly worthless (and when you factor in the time required to review and respond to review, as well as the money spent facilitating it, I'd be tempted to claim that it has a negative impact on research overall). The real peer review is what happens in informal discussions: via emails, at conferences, over coffee, in the corridor, on Facebook, in other papers, etc. The main benefit the current method of peer review has is simply that the threat of peer review forces people to work harder to write good papers. If you removed that threat, without replacing it with something else, then over time people would get lazy and paper quality would degrade, probably quite a lot.

But that would only happen if the 20th century form of peer review was removed without replacing it with something from the 21st century. I wrote above that the real form of peer review happens through conversations at conferences, in emails, etc. The rapid access to papers that we get now makes this possible. In the early-mid 20th century, because the (expensive) telephone was the only way to rapidly communicate with anyone outside your own institute, word of mouth would spread slowly. Therefore some a priori tick that confirmed the quality of a paper was needed before it was distributed; hence peer review. But now communication can and does happen much more rapidly. Today, if a paper in your field is good, people talk about it. It gets discussed in emails amongst collaborators, which then disperses into departmental journal clubs, and the information about the quality of the paper is disseminated like that. It's worth emphasising that, at least in high energy physics and cosmology, this often happens long before the paper is technically "published" via the slow, conventional peer review.

However, this information probably still doesn't disseminate as widely or as quickly as might be ideal, given the tools of the web today. What would be ideal is to find a way for the discussions that do happen to be immediately visible somewhere. For example, what if, instead of having an anonymous reviewer write a review that only the paper's authors and journal editor ever see, there was instead a facility for public review (either anonymous or not), visible at the same site where the paper exists, where the authors' replies are also visible, and where other interested people can add their views? The threat of peer review would still be there. If a paper was not written with care, people could add this in a review. This review would remain unless or until the paper was revised. Moreover, negative reviews that would hold up a paper could also be publicly seen. Then, if a reviewer makes unfair criticisms, or misunderstands a paper, the authors could make this clear and the readers could judge who is correct. Or, even better, the readers could add to the discussion and perhaps enlighten both the authors and the reviewer (with words that all other readers can see)!

Thursday, April 7, 2016

The shape of physical laws

Wednesday, August 26, 2015

Hypothesis: The future of peer review?