Kepler's Planetary Laws
This account of Kepler's mathematical astronomy may well challenge some cherished and long-held beliefs, since most of what has been written about Kepler has either been based on secondary or tertiary sources, or has concentrated on his astronomical background and techniques. But Kepler was a highly talented geometer, and until now there has been no investigation of his work (derived from the original Latin) which has highlighted the mathematical aspect of his brilliance. While analysis of his success has led to some unexpected conclusions, the present overview has been endorsed in detail by articles published in learned historical journals. These are listed on the web site mentioned at the end; they also contain references to traditional accounts which are becoming superseded.
The greatest achievement of Kepler (1571-1630) was his discovery of the laws of planetary motion. There were three such laws, but here we shall deal only with the first two - those that govern the motion of an individual planet. These are found in Astronomia Nova (New Astronomy, 1609), underpinned by important work in Epitome (of Copernican Astronomy) Book V (1621). The laws are:
- Law I (the Ellipse Law) - the curve or path of a planet is an ellipse whose radius vector is measured from the Sun which is fixed at one focus.
- Law II (the Area Law) - the time taken by a planet to reach a particular position is represented by the area swept out by the radius vector drawn from the fixed Sun.
From these we can find either what the position of the planet is at a given time, or, the time when the planet is in a given position. (The time is measured by the fraction of the total time taken for the planet to complete one whole circuit, that being called its period T. Kepler followed the ancients in always starting to measure at the point furthest from the Sun.) Almost certainly Kepler was responsible for introducing the term 'orbit', in Astronomia Nova Ch.1, and on his behalf we shall precisely define an orbit as possessing a pair of independent constituents: the path or curve, together with a (geometrical) way of representing time. (The term 'orb', used previously, had an entirely different connotation.)
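The two calculations mentioned above (time from position, and position from time) can be sketched numerically. What follows is a minimal illustration in modern terms, not Kepler's own procedure. It assumes an ellipse with unit major semiaxis and the Sun at one focus, and it uses the standard modern result that the area swept out from the point furthest from the Sun is proportional to β + e sin β, where β is the angle at the centre of the circumscribed circle and e is the eccentricity. The function names, the coordinate choice and the sample eccentricity 0.0934 (an approximate modern value for Mars) are assumptions made for this example.

```python
import math

def time_fraction(beta, e):
    """Fraction of the period T elapsed since the point furthest from the Sun.

    Assumes the modern closed form (beta + e*sin(beta)) / (2*pi).
    """
    return (beta + e * math.sin(beta)) / (2.0 * math.pi)

def swept_area_fraction(beta, e, steps=2000):
    """Independent check: add up thin triangles with apex at the Sun.

    Coordinates (an assumption of this sketch): centre of the ellipse at the
    origin, Sun at (-e, 0), starting point at (1, 0), major semiaxis 1.
    """
    b = math.sqrt(1.0 - e * e)            # minor semiaxis
    sun_x = -e
    area = 0.0
    x0, y0 = 1.0 - sun_x, 0.0             # starting point relative to the Sun
    for i in range(1, steps + 1):
        t = beta * i / steps
        x1, y1 = math.cos(t) - sun_x, b * math.sin(t)
        area += 0.5 * (x0 * y1 - x1 * y0)  # signed area of one thin triangle
        x0, y0 = x1, y1
    return area / (math.pi * b)           # whole ellipse has area pi*a*b, a = 1

def beta_at_time(fraction, e):
    """Invert time_fraction by bisection (it increases steadily for e < 1)."""
    lo, hi = 0.0, 2.0 * math.pi
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if time_fraction(mid, e) < fraction:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    e_mars = 0.0934                        # assumed sample value
    quarter = beta_at_time(0.25, e_mars)
    print("beta after a quarter of the period:", math.degrees(quarter))
```

The bisection is used only because it is short and cannot fail for e < 1; any root-finder would serve.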
Kepler's principal aim was to find a solution that would satisfy observations - and in that respect he possessed the outlook of a modern scientist. Moreover, Renaissance scientists considered it their duty to reveal the way in which Nature worked by explaining the observed outcome by the simplest means available. Strongly influenced both by Plato and by his underlying belief in God, Kepler believed more intensely than his contemporaries in the power of mathematics to expose the order in the universe that lay behind apparent complication, and he applied this criterion of simplicity with great effect in his astronomy.
There is much additional information, both on the circumstances of Kepler's life, and the context in which he worked, in the MacTutor biography. This will provide useful background to the present detailed account of his astronomical work.
Kepler originally investigated the orbit of Mars because that was the task allocated to him by Tycho Brahe (1546-1601), when Kepler joined him in Prague around 1600. In their day - and indeed until comparatively recently - the aim of astronomers was to achieve accurate observations of angles, simply because no other feature could be measured directly. Tycho had amassed a vast store of observations extending over 30 years; these are probably the most accurate that would ever be made with the naked eye, since Galileo (1564-1642) introduced the telescope into astronomy soon afterwards (in 1610). Tycho developed, refined and cross-checked his instruments and sometimes attained an accuracy of 2' (which is approximately the breadth of a hair held at arm's length).
It so happens that Mars is the only planet whose noncircularity can be detected without a telescope, and its observability was favoured by four factors:
- it is an outer planet (and therefore it is seldom viewed close to the Sun);
- the noncircularity of its path is the greatest of the outer planets;
- it is the nearest to the Earth of the outer planets (so changes in position appear larger);
- it is the nearest to the Sun of the outer planets (and therefore it makes more frequent circuits, producing more observations).
Since Greek times, the accepted description of the planetary system had been a geometrical one, known as the Ptolemaic theory (a geocentric configuration), which supposed that the Earth was fixed at the centre of the universe, with the Moon, the Sun, and the five known (naked-eye) planets revolving round it. Geocentricity was obviously in accordance with the evidence of the senses - as well as being the only arrangement acceptable to the Church - by contrast with the heliocentric configuration, which was proposed by Copernicus (1473-1543). Kepler was introduced to Copernicanism as a student at the University of Tübingen by his teacher, Michael Maestlin (1550-1631). Though his contemporaries were in general slow to recognize any advantages in this new idea, Kepler adopted the Copernican theory enthusiastically, because of its greater simplicity - which allowed him to abandon the set of (five) large and cumbersome epicycles that occurred in the Ptolemaic theory (they accounted for what we now recognize as the actual motion of the Earth). In fact, Kepler gave Copernican theory a new, mathematical precision by specifying two fundamental tenets that were consistent with his conviction that the Sun was metaphorically the place of God:
- The Sun is the fixed hub of the universe. This was bounded by the fixed stars and consisted of the six known (primary) planets, now including the Earth, with the Moon downgraded as its satellite (a term coined by Kepler himself);
- The Sun is responsible for all celestial motion.
Thus Kepler's interpretation of heliocentricity provided him with an origin from which to determine the Sun-planet distances and so discover the actual path of the planet. This was an immensely significant breakthrough in itself, and additionally it meant that he was freed from the restrictions of the precept (traditionally attributed to Plato), which had dominated the approach of previous astronomers, including Copernicus; this precept required that everything that happened in the heavens should be accounted for by uniform motions in perfect circles. Kepler's new astronomy was, indeed, founded on circles, but there was a different reason for this, as we shall explain in Section 5.
Reduction of observations leading to idealization
In the earlier chapters of Astronomia Nova Kepler embarked on a programme of 'reducing the observations' (this term means removing, as far as possible, all effects due to the observer's position in time and space). He found the heavy calculations, and the necessary checking, 'mechanical and tedious', as he remarked later: he did not have the benefit of logarithms, which were not invented until 1614, by Napier (1550-1617). Kepler carried out the reduction to heliocentricity, and further simplifying procedures, in a series of steps:
- In order to transpose the observations from a geocentric to a heliocentric basis, he applied triangulation to ensure that each Mars-distance was measured as if from the fixed Sun. Bearing in mind that the observations contained no distance-measurements (as explained in Section 2), this involved expressing all the Mars-Sun distances in terms of the Earth-Sun distance, regarded as a standard unit or 'baseline' (since the path of the Earth is very nearly circular, this approximation happened to be accurate enough for Kepler's purpose);
- He verified that the path of each planet lay in a plane that passed through the Sun;
- He checked that the observations were compatible with the fact that the curve described by the planet possessed a single axis of symmetry - identified as the line of apsides;
- Finally, he tabulated the average values of the secular changes, from knowledge accumulated over many centuries, which accounted for many of the small remaining irregularities. Thus Kepler knew the angular amount he should allow as compensation for them.
To provide the foundation for his new approach to astronomy, Kepler adopted the simplest geometrical structure consistent with observations. Accordingly, when he had carried out the procedures listed above, he felt justified in assuming that a single, closed (repeating) planar curve would represent the path of the planet; this assumption implied a process these days called idealization (neglect of the presence of other bodies in the universe, as we now know). Such a structural simplification allowed Kepler to examine the orbit of each individual planet in isolation, because all mutual interactions between planets had been eliminated. And because it was expressed geometrically, the solution would potentially be exact - the closed orbit of a single planet in a plane round the fixed Sun.
Essential orthogonality of Euclid's geometry
In Kepler's day modern algebraic notation and techniques were just being developed, but for his approach to astronomy Kepler depended exclusively on the traditional geometry of Euclid in which he had been trained at the University of Tübingen, as part of the standard preparation for the ministry. (This was originally Kepler's intended profession, and all his life he remained a devout, though somewhat unorthodox, Lutheran). Euclid's Elements rigorously laid down (in the first three Postulates) that the only means of construction permitted were the straightedge (an unmarked ruler) and compasses. Thus, the distinguishing feature of the geometry of Elements was that it relied on straight lines and circles alone. These were then combined in one of Euclid's earliest propositions concerning circles, which stated in effect that where the diameter of a circle meets its circumference a right angle will occur. It is well-known that a pair of (mathematically-defined) directed quantities are mutually independent if and only if they are at right angles: using Euclid's term 'orthogonal' for mutually perpendicular (it was defined in Elements Book I), this will be named the Principle of Orthogonal Independence; it will, with hindsight, justify our separate treatment of the path and the time-measure. (Moreover, the same principle is invoked in relation to planetary motion when Kepler based his investigation on what Aristotle had specified as the only two simple motions, circular and rectilinear, discussed in Section 9.) This principle has far-reaching ramifications, as we will demonstrate in connection with the complementary pairings that recur in Kepler's mature work in Epitome Book V (1621) - where the term 'complementary' is used in the everyday sense that the pair complete one another, and also with the mathematical connotation of being at right angles. 
Application of the principle gives rise to an enormous leap in simplicity, since it is intuitively obvious that it will be easier to treat each one of a pair of components separately than to work with the resultant produced by taking them in combination. For Kepler, simplicity was the hallmark of his treatment, and contributed overwhelmingly to his success.
Kepler always showed the greatest respect for his Greek predecessors, and read their works thoroughly, selecting material that he could incorporate into his new astronomical synthesis. Apart from frequent application of the trigonometrical propositions of Ptolemy (fl. 129-141 AD), Kepler made use of precisely three propositions from the work of Archimedes; one of these was vital in supplying the geometrical backing for Section 6 (the other two - one cited in Section 7, one in Section 11 - were concerned with an innovative approach to 'infinitesimal' considerations which went well beyond traditional geometry). However, it will come as a surprise to some readers to find that Kepler did not rely on Apollonius anywhere in his astronomical work. Sometime in the years 1594-1604, Kepler studied the Conics of Apollonius, and expressed great admiration for it, citing it throughout his optical and stereometrical work - yet he never referred to any of its propositions in connection with his astronomy. This is because Conics is expressed in terms of an oblique (non-orthogonal) frame of reference (coordinate-system), which Kepler implicitly rejected as inappropriate for the study of astronomy (nor did he need any of its propositions, as we confirm in Section 6). Meanwhile we reiterate Kepler's belief that Euclid's Elements encapsulated the only geometry that could properly be applied to the heavens, which after all was the realm of God. He labelled any other treatment 'ageometrical' - which, in his mind, was akin to heretical.
Constructing the path
In spite of his splendid inheritance from Tycho, Kepler knew that no amount of empirical observations, however numerous, could give him the theoretical structure he required. Therefore, when he had compensated for the observational uncertainties as far as possible, Kepler switched to a geometrical investigation - see Figure (1). He began by assuming a fixed/known line of apsides CD, on which lies the fixed point A (the position of the Sun) at a known 'eccentric' distance AB (all previously determined from Tycho's observations), where B is the midpoint of CD. (From now on, as a convenience, we shall use the algebraic notation BC = BD = a, AB = ae, even though it is an anachronism.)
Kepler started from the initial framework illustrated in Figure (1), which could be described as standard Ptolemaic, except that Kepler automatically transposed it from geocentric to heliocentric mode. He adopted the traditional mechanism of deferent, epicycle, and eccentric, being aware, as the Ancients had been, that motion in the circle of radius a centred on A, when combined with motion in the epicyclet of radius ZQ = AB = ae (whose centre Z lies on the deferent), together produce a motion of Q equivalent to a simple motion of Q round the eccentric circle centre B radius a. (Mathematicians may like to regard ABQZ as a 'parallelogram of circular motions'.) We shall specify the typical point of any of the three successive orbits proposed by Kepler just as he did - determined by the angle at the centre of the eccentric circle, which we shall denote by β for distinctiveness. (This usage was authenticated by tradition, since in ancient astronomy motions consisted of combinations of rotations which were measured by the angles at the centres of their respective circles.) Then we have corresponding angles from the parallels AZ and BQ, so that:
∠QBC = ∠ZAB = β.
This angle β is called 'the eccentric anomaly' in both ancient and modern astronomy (though we shall avoid that name here). The ordinate QH is also extremely significant in Kepler's reasoning, as we shall demonstrate, concentrating on the three main stages of Kepler's progress once he had adopted the approach which would provide a rational route to his goal.
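The equivalence of the deferent-plus-epicyclet mechanism with the single eccentric circle (the 'parallelogram of circular motions' ABQZ) can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the coordinate frame (B at the origin, C on the positive x-axis, the Sun A at (-ae, 0) on the far side of B from C) and the function names are choices made for this example, and the sample values of a and e are arbitrary.

```python
import math

def eccentric_point(a, beta):
    """Q on the eccentric circle: centre B at the origin, radius a, angle QBC = beta."""
    return (a * math.cos(beta), a * math.sin(beta))

def deferent_epicycle_point(a, e, beta):
    """Q reached via the deferent centred on A plus the epicyclet of radius ae.

    Z lies on the circle of radius a centred on A, with angle ZAB = beta;
    the arm ZQ is kept equal and parallel to AB (the parallelogram ABQZ).
    """
    ax, ay = -a * e, 0.0                       # Sun A (assumed position)
    zx = ax + a * math.cos(beta)               # Z on the deferent
    zy = ay + a * math.sin(beta)
    return (zx + a * e, zy)                    # add the vector ZQ = AB

if __name__ == "__main__":
    a, e = 1.0, 0.1                            # arbitrary sample values
    for deg in (0, 45, 90, 135, 180):
        beta = math.radians(deg)
        print(deg, eccentric_point(a, beta), deferent_epicycle_point(a, e, beta))
```

Both functions return the same point for every β, which is all the equivalence asserts.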
At the first stage Kepler took Q on the eccentric circle as the typical point and so he tested the eccentric itself as a proposed path. However, he found that this placed the planet too far from the Sun in almost all positions. So he finally rejected the idea that each planet moved in a single circle, and set out to find the actual curve that was the planet's path - naturally, this had to be constructed from a combination of (arcs of) circles by the geometry of Euclid, since Kepler recognized nothing else as appropriate for the heavens.
The typical points of the three successive proposed paths are all named, and the separate constructions shown, in Figure (2). It is evident that the structure needed to produce these points already, or potentially, exists in Figure (1) in terms of a chosen value ∠QBC = β. This is significant because - as we will see in Section 7 - the time is also expressed in terms of β, so a common value of β ensures simultaneity. The three-stage procedure that Kepler adopted was to take geometrically-defined points (K', K'', K) along AZ, one at each stage in turn, then with centre A to draw the corresponding circular arc (radius AK', AK'', AK), so that each arc would end at a geometrically-defined point (Q, V, P) respectively. The stages are classified according to the chapters of Astronomia Nova in which each construction appears:
- First stage (Chs.39-44): large-grade curve - outer bound. In this special case, AQ = AK' is constructed, but the typical point is already known to be Q, on the eccentric.
- Second stage (Chs.45-50): small-grade curve - inner bound. K'' lies on the eccentric, and AV = AK'' is constructed, to specify typical point V lying on the epicycle. (Kepler tried many variations at this stage, but this is the only ovoid to be properly defined.)
- Third stage: medial-grade curve (Chs.51-60) - observationally satisfactory. K lies where the perpendicular from Q meets AZ, and AP = AK is constructed, to specify typical point P lying on the ordinate QH.
By careful comparison with Tycho's observations, as always, Kepler found that the first outcome (Q) was an overshoot, and the second (V) an undershoot. Finally, the point P more-or-less in the middle turned out to hit the target (to agree with observations). The martial analogy - defeat of Mars, the god of war - was Kepler's own, as was the description of the proposed non-circular curves he found, and named 'ovoid', or egg-shaped (always symmetrical about the line of apsides, but never with any assumption of a second axis of symmetry). Even when nothing more is known about these curves one can imagine applying a mathematical 'grading-machine' to sort them by width, to produce BF', BF'', BF, respectively, as shown in Figure (2) (constructed as special cases by the same procedure, taking β = 90°).
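The three radius vectors can be compared numerically for a common value of β. The sketch below is a modern reading of the constructions, not Kepler's own computation: it assumes B at the origin, the Sun A at (-ae, 0), and AZ drawn from A at angle β to the line of apsides, so that K'' is where AZ meets the eccentric circle and K is the foot of the perpendicular from Q onto AZ. The function name and the sample values are hypothetical.

```python
import math

def stage_radii(a, e, beta):
    """Return (AQ, AK'', AK): the radius vectors of the three stages.

    AQ   - first stage: distance from the Sun A to Q on the eccentric circle.
    AK'' - second stage: distance from A along AZ to the eccentric circle.
    AK   - third stage: projection of AQ onto AZ.
    """
    c, s = math.cos(beta), math.sin(beta)
    aq = a * math.sqrt(1.0 + e * e + 2.0 * e * c)
    ak2 = a * (e * c + math.sqrt(1.0 - (e * s) ** 2))
    ak = a * (1.0 + e * c)
    return aq, ak2, ak

if __name__ == "__main__":
    a, e = 1.0, 0.0934                     # assumed sample values
    for deg in (0, 45, 90, 135, 180):
        aq, ak2, ak = stage_radii(a, e, math.radians(deg))
        # Away from the line of apsides the first radius is the largest,
        # the second the smallest, and the third lies between them.
        print(deg, round(aq, 6), round(ak2, 6), round(ak, 6))
```

On the line of apsides (β = 0° or 180°) the three radii coincide; elsewhere the ordering matches the overshoot and undershoot described in the text.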
So the resulting radius vector AP that finally satisfied Kepler (in Ch.58) was quantified geometrically from the constructed rectangle AKQR, by applying nothing more than a Euclidean - straightedge-and-compasses - construction, as shown in Figure (3):
AP = AK = QR = BQ + BR.
This relationship is nowadays known as Law I.
[It is expressed in modern terms for reference in the Summary Table at the end: also see the article Planetary motion tackled kinematically.]
Such a construction had never been devised before, and Kepler did not have the slightest idea what curve the above relationship represented. However, he did know that if the curve were an ellipse, the typical point P would satisfy a vital condition that he had come across in the work of Archimedes: On Conoids and Spheroids Prop.4 (where it was stated as if well known even then):
PH/QH = BF/BC
There is good reason to believe that this was the earliest plane definition of an ellipse (because it can be derived directly from a section of a cone in three easy steps [1]), as well as the definition most commonly, if not exclusively, used by Kepler's contemporaries: it is just the ratio-property of the ordinates. Some people nowadays will recognize it as the 'compressed circle' property.
The identification of the curve as an ellipse also depends on a relationship that Kepler established in Ch. 59, Prop. VII, which we express here in modern terms (writing BF = b to denote the minor semiaxis), recognizing that e will now represent the (focal) eccentricity:
a²e² = a² - b². (1)
This may appropriately be called the 'focus-fixing property' of an ellipse, since it determines the position of the focus when the major and minor semiaxes are known. Kepler had already invented the term 'focus' in Astronomiae Pars Optica (1604) in connection with his work on vision, though he did not realize its connection with his astronomy at that juncture - in Astronomia Nova he simply referred to the point A as punctum eccentricum, or eccentric point. Nevertheless, the position of the focus of this particular ellipse is crucial. Had its focus not coincided with the fixed Sun (the origin), the investigation would have been too complicated to manage by geometry. Actually, Kepler's approach was successful just because the ellipse is simpler than any circle in this situation - an unlikely assertion which is proved in Planetary motion tackled kinematically.
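The role of relationship (1) can be illustrated numerically. The sketch below assumes the modern 'compressed circle' parametrisation of the ellipse, with typical point (a cos β, b sin β) and centre B at the origin. It computes the focal distance ae from a and b, then compares the distance from a trial Sun position to that point with the value a + ae cos β of the radius vector. The function names and the sample semiaxes are hypothetical.

```python
import math

def focal_distance(a, b):
    """Distance AB = ae from the centre to the focus, from a^2 e^2 = a^2 - b^2."""
    return math.sqrt(a * a - b * b)

def sun_distance(a, b, beta, sun_offset):
    """Distance from a trial Sun position (-sun_offset, 0) to (a cos beta, b sin beta)."""
    x, y = a * math.cos(beta), b * math.sin(beta)
    return math.hypot(x + sun_offset, y)

if __name__ == "__main__":
    a, b = 1.0, 0.8                          # arbitrary sample semiaxes
    ae = focal_distance(a, b)
    for deg in (0, 60, 120, 180):
        beta = math.radians(deg)
        at_focus = sun_distance(a, b, beta, ae)
        off_focus = sun_distance(a, b, beta, 0.5)   # deliberately misplaced Sun
        # Only the focal position reproduces a + ae*cos(beta) for every beta.
        print(deg, at_focus, a + ae * math.cos(beta), off_focus)
```

With the Sun placed anywhere other than the focus, the simple expression for the radius vector no longer holds, which is one way to see why the position of the focus matters.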
In Astronomia Nova Ch. 59, Prop. XI, Kepler set out a rigorous geometrical proof that the typical point he had constructed satisfied the ratio-property which defines an ellipse. Hence, there can be no suggestion that Kepler merely selected an ellipse and checked it against observations (as many readers may have been told). Because of its importance the proof has been reproduced more than once [2]; though modernized in style, and reordered - to increase its impact - clearly it has not been altered in substance, since it relies on nothing more 'advanced' than Pythagoras' Theorem, and properties of similar triangles. Thus (apart from the use of relationship (1) above) the proof is entirely based on cited or implied propositions from Euclid's Elements. Incidentally, this provides additional confirmation of my contention (in Section 5) that Kepler did not rely on the Conics of Apollonius for his discovery of the ellipse - in his astronomy he simply did not need anything so sophisticated.
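In modern algebraic terms the step from the constructed radius vector to the ratio-property can be checked with Pythagoras' Theorem in the triangle AHP. This is an anachronistic paraphrase offered for orientation, not a reproduction of Kepler's Prop. XI. It assumes that H is the foot of the ordinate QH on the line of apsides, so that BH = a cos β, AH = ae + a cos β and QH = a sin β, and it writes b for BF as in relationship (1).

```latex
\begin{align*}
PH^2 &= AP^2 - AH^2 \\
     &= (a + ae\cos\beta)^2 - (ae + a\cos\beta)^2 \\
     &= a^2(1 - e^2)(1 - \cos^2\beta) \\
     &= b^2\sin^2\beta
        \qquad\text{using } a^2e^2 = a^2 - b^2, \\[4pt]
\frac{PH}{QH} &= \frac{b\sin\beta}{a\sin\beta} = \frac{b}{a} = \frac{BF}{BC}.
\end{align*}
```

The ratio is independent of β, which is the ratio-property of the ordinates.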
JOC/EFR October 2006