De Moivre's Contributions to Probability and Life Contingencies (1718-1724)
Abraham de Moivre is another mathematician of the highest calibre who applied himself to probability whilst dedicating most of his energy to the more conventional mathematical fields of his time. In de Moivre’s case, he achieved particular renown in astronomy and complex analysis. He is perhaps best known for his eponymous formula on complex numbers and trigonometric functions, which is a precursor to the famed Euler formula (which frequently tops surveys of the most beautiful formulae in mathematics). But his contribution to probability is also highly regarded. Isaac Todhunter, that most authoritative of nineteenth-century mathematical historians, writes of de Moivre, ‘it will not be doubted that the Theory of Probability owes more to him than to any other mathematician, with the sole exception of Laplace’.30
Born in France in 1667 to a Huguenot (Protestant) family, de Moivre moved to London in 1687 to escape religious persecution. He lived in London for the remainder of his life, and died in 1754. Despite being a recognised leading mathematician and Fellow of the Royal Society, he was never able to obtain a major position at one of the universities. He spent his entire life in relative poverty and supported himself by working as a mathematical tutor.
De Moivre is notable in the history of actuarial thought for the publication of two books that span the related fields of probability and life contingencies: Doctrine of Chances,3 which was first published in 1718; and Annuities on Lives,32 first published in 1724. The latter is the main focus of our interest, but Doctrine of Chances is remarkable for including the first published derivation of the normal probability distribution. De Moivre considered Jacob Bernoulli’s binomial probability distribution and, like Bernoulli, he considered the properties of samples from the distribution as the sample size increased to very large numbers. The factorial calculations entailed by the binomial distribution for large n were extremely unwieldy given the computational limitations of the time. De Moivre was able to use factorial algebra to derive the normal distribution as a good approximation to the binomial distribution for large n. (The factorial approximation underlying this calculation was further refined by James Stirling, and now bears Stirling’s name.)
At the time of publication of Doctrine of Chances, the wider implications of the normal distribution were not yet understood. De Moivre’s work can now be seen as a special case of the Central Limit Theorem that was to emerge at the start of the nineteenth century. But it was still a very useful result at the time of publication: it substantially reduced the computational burden of calculating the probabilities associated with large sample sizes. It also provided new conceptual insights. In particular, de Moivre’s work highlighted that the variability of a sample probability calculated from a known population will decrease in inverse proportion to the square root of the sample size. The primitive statistical practices of the time, such as they existed in areas like the treatment of discordant astronomical observations, had assumed that the sample variability decreased in inverse proportion to the sample size itself, rather than its square root.
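The two points above can be illustrated numerically. The following is a minimal sketch, not de Moivre’s own procedure: it compares an exact binomial probability with the normal density at the same point, and shows how the standard deviation of a sample proportion shrinks with the square root of the sample size. The function names and the choice of p = 0.5 are assumptions made purely for illustration.

```python
import math

def binomial_pmf(n, k, p):
    """Exact binomial probability of k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_density(x, mean, sd):
    """Normal density, used here as the large-n approximation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def compare_at_centre(n, p=0.5):
    """Return (exact, approximate) probability of the central outcome."""
    k = round(n * p)
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return binomial_pmf(n, k, p), normal_density(k, mean, sd)

def sd_of_sample_proportion(n, p=0.5):
    """Standard deviation of the sample proportion for a known p."""
    return math.sqrt(p * (1 - p) / n)

if __name__ == "__main__":
    for n in (100, 400):
        exact, approx = compare_at_centre(n)
        print(n, exact, approx, sd_of_sample_proportion(n))
```

Quadrupling the sample size from 100 to 400 halves the standard deviation of the sample proportion, rather than quartering it.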
The section of Doctrine of Chances containing the derivation of the normal distribution is entitled ‘A Method of approximating the Sum of Terms of the Binomial (a + b)^n expanded into a Series, from whence are deduced some practical rules to estimate the degree of assent which is to be given to experiments’. De Moivre believed that his approximation result supported the inversion of Bernoulli’s Theorem for use in statistical inference or induction: that is, to make statements about the likely behaviour of a population based on the behaviour of a sample, as opposed to making statements about the behaviour of a sample based on knowledge of a population’s characteristics. De Moivre argued from Bernoulli’s Theorem that if a large sample could be observed to ‘converge’ on a given probability, then that must provide the unknown population probability. This idea of convergence recognised that the sample ratio is an unbiased estimator of the population probability, but de Moivre did not suggest any means of determining how large the sample must be to obtain convergence, or how much uncertainty is in a given sample when the population probability is not known. De Moivre’s uncovering of the normal distribution is arguably a profound moment in probability. The mathematics of statistical inference, however, was not directly furthered by it.
De Moivre’s Annuities on Lives, published in 1724, was the most significant work on life contingencies since Halley’s seminal Breslau paper. De Moivre did not seek to improve on Halley’s mortality table. Rather, his objective was to show how Halley’s work could be more readily applied to the valuation of annuities by reducing the computational burden associated with calculating annuity prices from age-specific mortality rates. Halley himself had pointed out in his Breslau paper that the calculation of annuity prices from the mortality rates of his table ‘will without doubt appear to be the most laborious calculation’ and noted that the production of the annuity table in his paper was ‘the short result of a not ordinary number of arithmetical operations’. De Moivre specified a simple mathematical form for the behaviour of mortality as a function of age that permitted a straightforward annuity pricing formula to be obtained which required a much smaller set of arithmetic operations:
Consulting Dr Halley’s Table of Observations, I found that the Decrements of Life, for considerable intervals of time, were in arithmetic progression; for instance, out of 646 persons of twelve years of age, there remain 640 after one year; 634 after two years; 628, 622, 616, 610, 604, 598, 592, 586, after 3, 4, 5, 6, 7, 8, 9, 10 years respectively, the common difference of those whole numbers being 6.
De Moivre derived a relatively simple pricing formula for a single life annuity when the decrements of life followed this arithmetic progression:

$$a_x = \frac{1 - \frac{1+r}{n}\,P}{r}$$
where r is the annually compounded rate of interest, n is the number of years until the number of lives is reduced to zero by the arithmetic progression, and P is the price of an annuity certain payable for n years. To set the n parameter, de Moivre assumed that the population would be reduced to zero by age 86, and so n = 86 - x, where x is the current age of the annuitant.
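As a hedged illustration, the calculation can be sketched in a few lines. Under the arithmetic-progression assumption, the probability of surviving t years is 1 - t/n, and summing the discounted payments gives the closed form (1 - (1 + r)P/n)/r. That closed form is derived here from the stated assumptions rather than quoted from de Moivre, and the function names are hypothetical.

```python
def annuity_certain(n, r):
    """Value P of an annuity certain of 1 per year, paid in arrears for n years."""
    v = 1 / (1 + r)
    return (1 - v**n) / r

def de_moivre_single_life(age, r, limiting_age=86):
    """Closed-form single life annuity price under linear survival.

    Assumes survivors fall to zero at limiting_age (86 in the text),
    so that n = limiting_age - age.
    """
    n = limiting_age - age
    P = annuity_certain(n, r)
    return (1 - (1 + r) * P / n) / r

def direct_sum(age, r, limiting_age=86):
    """Same price computed payment by payment, as a cross-check."""
    n = limiting_age - age
    return sum((1 - t / n) * (1 + r) ** -t for t in range(1, n + 1))

if __name__ == "__main__":
    print(round(de_moivre_single_life(30, 0.06), 1))
    print(round(direct_sum(30, 0.06), 1))
```

For a 30-year-old at 6 % interest, both routes agree with each other. The closed form needs only one annuity-certain value, whereas the direct sum needs one term for every future year.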
Of course, the whole of Halley’s mortality table did not conform to de Moivre’s assumption of a constant arithmetic progression of the decrements of life. Figure 1.3 shows the actual decrements produced by Halley’s table.
This highlights that de Moivre was being somewhat selective in his quoting of the stability of the decrement of life in Halley’s table. It is indeed constant for the ten years from age twelve, as de Moivre points out (and also for fifteen years from age 54), but there inevitably must be significant variation: first upwards as mortality rates rise with older age, and then downwards as the population reduces to very small numbers.
How do the limitations of de Moivre’s approximation of Halley’s mortality assumptions impact on annuity pricing? Consider the price of an annuity on a 30-year-old life. De Moivre’s approximation assumes that 9.5 lives of a starting number of 530 die every year for the next 56 years. Figure 1.3 shows that the actual decrement in Halley’s table was eight per year at age 30, rising to ten or eleven over the next few decades, before falling sharply from age 75 onwards. So de Moivre’s assumption is sometimes higher and sometimes lower than the ‘true’ values of Halley’s table. The resultant approximate annuity price is surprisingly accurate—de Moivre’s price is 11.6, and the exact price from Halley’s table is 11.7 (at 6 % interest). Figure 1.4 shows how de Moivre’s approximation compares with the exact prices from Halley’s table for various ages of annuitants. It is less accurate for young ages, but impressively accurate for annuitants of age 30 or older.
Fig. 1.3 Decrements of life from ages 10 to 82 (from 1,000 births); Halley's Breslau table
Fig. 1.4 A comparison of de Moivre's annuity price approximation with exact values calculated from Halley's Breslau table (at 6 % interest)
The remainder of Annuities on Lives is dedicated to the pricing of annuities on multiple lives. De Moivre first focused on finding pricing formulae for joint life annuities on two and three lives (where the annuity payments cease on the first death). Here too he sought computationally tractable formulae by specifying an assumed mathematical form for the behaviour of mortality.
By doing so, he was able to find joint life annuity prices that were simple functions of single life annuities, such as:

$$\frac{(1+r)\,MP}{(M+1)(P+1) - (1+r)\,MP}$$

where M is the value of the single life annuity on the first life; P is the value of the single life annuity on the second life; and r is the rate of interest.
The derivation of the above joint life annuity price relies on a different assumption for the mathematical model of mortality than the one he used in developing the single life annuity pricing approximation. Instead of assuming that the decrements of life follow an arithmetic progression, he here assumes that they follow a geometric progression. Expressed in modern actuarial terminology, he assumes that the annual mortality rate q is constant across ages (though it does not need to be the same q for each of the lives).
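A short sketch of how the geometric assumption leads to a joint life price in terms of M and P may help. The symbols p_1 and p_2 (the constant annual survival probabilities of the two lives, each equal to one minus that life’s q) and v = 1/(1 + r) are introduced here for illustration and are not de Moivre’s notation; the two lives are assumed to be independent.

$$
\begin{aligned}
M &= \sum_{t=1}^{\infty} (p_1 v)^t = \frac{p_1 v}{1 - p_1 v}
  \quad\Longrightarrow\quad p_1 v = \frac{M}{M+1}, \\
P &= \sum_{t=1}^{\infty} (p_2 v)^t = \frac{p_2 v}{1 - p_2 v}
  \quad\Longrightarrow\quad p_2 v = \frac{P}{P+1}, \\
p_1 p_2 v &= (1+r)\,(p_1 v)(p_2 v) = \frac{(1+r)\,MP}{(M+1)(P+1)}, \\
\sum_{t=1}^{\infty} (p_1 p_2 v)^t &= \frac{p_1 p_2 v}{1 - p_1 p_2 v}
  = \frac{(1+r)\,MP}{(M+1)(P+1) - (1+r)\,MP}.
\end{aligned}
$$

The joint life price therefore depends on mortality only through the two single life prices, which is what makes the formula a computational shortcut.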
The assumptions used to derive M and P above, and the assumptions used to derive the above joint life formula, are not only different, they are inconsistent and mutually incompatible. But this does not necessarily matter: de Moivre was not deducing a theorem, he was attempting to find a computational shortcut that produced reasonable estimates. So how effective was this formula against this criterion? Suppose we obtain the single life annuity prices M and P above using de Moivre’s single life formula, and then plug these prices into his joint life formula. How does such a joint life annuity price compare with the exact value implied by Halley’s table? Inevitably, the additional layer of approximation amplifies the errors in the valuation. For example, consider the joint life annuity price on two 30-year-olds. The exact price according to Halley’s table (and at 6 % interest) is 9.5. Whilst de Moivre’s single life annuity price provides a very close approximation to the exact single life price (11.6 versus 11.7), his joint life annuity price is only 8.9. Nonetheless, in the great scheme of approximations involved in obtaining an early eighteenth-century annuity price, an error of perhaps 5 % or so is not too bad.
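The two-step calculation described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not de Moivre’s own working: the single life price uses the closed form implied by linear survival to age 86, the joint life price uses the form implied by constant annual mortality, and the function names are hypothetical. The value 9.5 is the exact price quoted in the text, not something recomputed from Halley’s table here.

```python
def single_life_price(age, r, limiting_age=86):
    """Single life annuity under linear survival to limiting_age."""
    n = limiting_age - age
    annuity_certain = (1 - (1 + r) ** -n) / r
    return (1 - (1 + r) * annuity_certain / n) / r

def joint_life_price(M, P, r):
    """Joint life annuity from single life prices M and P.

    Assumes constant annual mortality for each life and
    independence between the two lives.
    """
    numerator = (1 + r) * M * P
    return numerator / ((M + 1) * (P + 1) - numerator)

if __name__ == "__main__":
    r = 0.06
    M = single_life_price(30, r)   # first 30-year-old
    P = single_life_price(30, r)   # second 30-year-old
    joint = joint_life_price(M, P, r)
    quoted_exact = 9.5             # figure quoted in the text
    print(round(joint, 1), round(1 - joint / quoted_exact, 3))
```

On these rounded figures the approximate price falls short of the quoted exact value by a few per cent, the same order as the gap discussed above.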