CHAPTER VI

MORTALITY TABLES FROM CENSUSES AND DEATH REGISTERS

In the preceding chapters we have dealt with the treatment of data from which we can obtain the history of each individual case, and the approximations resorted to for the sake of convenience may be relied upon to give very accurate results. We may now consider briefly the problem of obtaining the rate of mortality from census records and mortality returns where it is impracticable to observe individual cases. We must still keep clearly before our minds that our final object is to obtain the rates of mortality at each age, and as we cannot obtain exactly the elements in our formula we must endeavour to find a suitable basis for approximation.

For this purpose we may go back to our definition in Chapter I and consider the numerical example in which we imagined 10,000 persons aged 30, each of whom was kept in sight until he attained age 31 or until death, if it occurred before 31. We concluded that if there were 104 deaths the rate of mortality at age 30 was .0104. If these deaths occurred regularly during the year we could dispense with the enumeration of the people at the outset, as by counting those surviving to age 31 or to any age between 30 and 31, and adding the deaths after age 30 up to that time we should obtain the number commencing at age 30.

Let us assume that the figures related to the calendar year 1912, and see what would have been the position of the community of 10,000 persons on 1st July when 6 months of the year under consideration had elapsed. Fifty-two persons would have died, and there would remain a population of 9948 persons aged 30½ years. A census taken on that date would give this figure, and we could deduce the exposed to risk by adding 6 months' deaths, i.e. 9948 + 52 = 10,000, and the rate of mortality would be 104/10,000 = .0104 as before.
This shows us that if we are given the number of persons aged 30½ alive on 1st July in any year, and the number of deaths during the same year at age 30 last birthday, we can find the rate of mortality. In practice, of course, it is quite impossible to trace a number of people, all born on 1st January in the same year, but if we take the population aged 30 last birthday on 1st July and the deaths aged 30 last birthday during the year, we can say that on the average the birthday will be 1st January both for population and deaths, and our results will be very close to the truth. Similarly, we cannot find a stationary population, but we may assume that the population in the middle of the year will be the average population living during the year; in other words, we shall arrive at approximately the same figure if we take the result of a census on 1st July as if we traced the members of the population individually and made allowance for the time they were members of the community the mortality of which we are investigating.

By taking the population on 1st July instead of 1st January we should thus in ordinary circumstances overcome the difficulty of finding a number of persons born on the same day, and we should also make allowance for variations in the population during the year due to migration and increase or decrease consequent on a changing birth-rate in former years. Put generally, then, the rate of mortality at any age is equal to the number of deaths in one year divided by the population at the middle of the year increased by half these deaths, the people included in each count being those who attained the age in question on their last birthday. The method we have outlined, although making some allowance for variations, assumes that there has been no violent fluctuation in the number of people in the community, or that variations before and after the middle of the year when the census is taken counterbalance.
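As a minimal sketch of the rule just stated, the following Python function divides the year's deaths by the mid-year population increased by half those deaths. The function name `rate_of_mortality` and its parameter names are illustrative assumptions, not taken from the text; the figures are those of the age-30 example.

```python
def rate_of_mortality(mid_year_population, deaths_in_year):
    """Deaths in the year divided by (mid-year population + half the deaths)."""
    exposed_to_risk = mid_year_population + deaths_in_year / 2
    return deaths_in_year / exposed_to_risk


# Figures from the worked example: 9948 alive on 1st July, 104 deaths in the year.
q30 = rate_of_mortality(9948, 104)
print(round(q30, 4))  # 0.0104
```

The exposed to risk recovered here is 9948 + 52 = 10,000, the number assumed to have commenced the year at age 30.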
In such circumstances the method will give good results, but it is awkward to apply in practice, because censuses are not made yearly on 1st July, and it is preferable to average the deaths over a number of years to avoid fluctuations due to epidemics, etc. These difficulties can, however, be overcome to a very great extent if the population is not subject to erratic variations by using one census and the average number of deaths for three years. This simple method often gives satisfactory results, but an alternative method, which has been more commonly used of recent years, is to take the mean population based on two censuses for the population in the formula for the rate of mortality, and the deaths used are those occurring during the period between the two censuses.

The reader will see at once that by using more than one census it is easier to decide whether the population is changing, and to what extent. An extreme case will show the danger of using only one census. Consider, for instance, a newly populated district in some out-of-the-way part of the world which was unpopulated up to July in a certain year, when immigration started, and 100 people settled in the district, and before the end of the year two of them died. If a single census had been taken at the middle of the year, it would appear that there was no population, and yet two deaths would have been recorded! This is, of course, a ridiculously extreme case, but the use of two censuses and the deaths for the intervening period would save one from a bad error even here.

Our next problem, then, is to find the mean population during the period between the dates of the two censuses, and we must first see exactly what we understand by this term. It is simply the average of all the populations there have been during the period.
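To make the idea of an average of all the populations concrete, here is a small Python sketch. The function name and the representation of a population history as (duration, population) segments are assumptions made purely for illustration. The example follows the newly settled district described above, ignoring the two deaths for simplicity.

```python
def mean_population(segments):
    """Time-weighted average of a piecewise-constant population history.

    `segments` is a list of (duration, population) pairs; this representation
    is an illustrative assumption, not something the text prescribes.
    """
    total_time = sum(duration for duration, _ in segments)
    person_time = sum(duration * population for duration, population in segments)
    return person_time / total_time


# Newly settled district: empty for half the year, then 100 settlers
# (the two deaths are ignored here to keep the sketch simple).
history = [(0.5, 0), (0.5, 100)]
print(mean_population(history))  # 50.0
```

A census taken just before the settlers arrived would have recorded no population at all, whereas the average over the year is about 50.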
If we want to find the batting average of a cricketer we add together all the runs he has made and divide the result by the number of completed innings. In the same way we should get our mean population by adding together all the populations and dividing by the number of enumerations. Unfortunately, however, while a cricket innings is complete in itself, and is divided by well-marked intervals from other innings, a population is in a state of constant change and varies from day to day on account of deaths and removals. In order to obtain an exact result, therefore, we should have to take a census every moment, whereas in actual practice the census, at least in this country, is taken only once in ten years!

We are thus faced with the difficulty of deciding from the results of two censuses, separated by an interval of ten years, what has been the population at different times within the ten years, or, which is practically the same thing, in what manner the population has varied. It will be seen at once how awkward this is, because it would be very difficult to guess a mean population correctly from these two censuses alone. Some idea might perhaps be obtained from the deaths, because the deaths depend on the population; but in practice the helpfulness of the deaths is discounted to a large extent by their fluctuations, which disguise variations due merely to the changes in the population. The following diagram may help to make the position clearer:
AD and BC represent the population of a community in 1901 and 1911 respectively, and we want to find the average population during the period. Another way of setting the problem is to say that we know AD and BC and we want to know the area of the whole block ABCD, but do not know how the line AB ought to be drawn. We could draw it in so many ways. We could make it a straight line; or a curve, running either above or below the straight line; or we could make it twist about in all sorts of ways. A few of the many possible solutions are shown in the diagram by dotted lines. Think how different the area is in almost every case, and how unlikely we should be to guess the right distribution of population which is shown by the complete line. The problem would be very much easier if we were given EF, the population in 1906, but unfortunately this is not available.

The result of this difficulty has been that people have "theorised," and the theory was set out that the population increased in geometrical progression. This theory seems to rest on the idea that population begets population, and that therefore if the population is 100 this year and double this number next year, it will again double itself the next year, and so on. The reader must remember that this is simply assumption, but it is convenient in so far as it enables us to find the population at any moment, and so obtain the sum of all the populations without much trouble. In some cases where it can be tested it seems not unreasonable. It can be very far out in other cases, and a decrease in marriage-rate, postponement of age of marriage, a changing death-rate, or many other causes, could upset the geometrical progression. This can be shown by taking a somewhat extreme case. There are certain populations which are now decreasing, although at one time they were increasing.
At some time between, say, 1903 and 1913 the population changed from an increasing one to a decreasing one. The population was in fact like the top line of our figure, and the geometrical progression which is below the straight line gives a worse result than the straight line which represents an arithmetical progression.

A geometrical rate of increase is, however, frequently assumed, and the consequent mean population is

$$P \times \frac{r-1}{\log r} \times (.434\ldots),$$

where P is the population at the first census, rP the population at the second census, and the logarithm is the common logarithm to base 10.¹ If, however, instead of assuming a geometrical progression it is assumed that the population was increasing in arithmetical progression, the mean is one-half the sum of the populations at the two censuses, and in cases in which the assumption is correct this is the same as the population at the middle of the term.

In order to test the effect of these different assumptions let us consider the case of a population of 10,000 on 1st January 1901 which had increased to 20,000 by 1st January 1911. The mean population would be 14,428 on the assumption of a geometrical progression and 15,000 on the assumption of an arithmetical progression, and if we further assume that 2250 deaths occurred during the ten years 1901-1910 the corresponding rates of mortality would be .0155 and .0149 respectively, showing a difference of 4 per cent. In calculating these rates we must take one-tenth of the 2250 deaths, because these deaths relate to ten years. This is an extreme example, of course, but it shows that the assumptions made may have considerable influence on our results.

¹ The formula is obtained by the application of the integral calculus. On the given assumptions the population at any time $t$, measured as a fraction of the ten years, is $Pr^{t}$, and the mean population during the ten years is therefore $\int_0^1 Pr^{t}\,dt = P\,\frac{r-1}{\log_e r}$, where $1/\log_e r = (.434\ldots)/\log_{10} r$.
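The comparison of the two assumptions can be checked with a short Python sketch. The function names are illustrative assumptions. The geometrical mean is computed with natural logarithms, which is equivalent to the common-logarithm form with the factor .434..., and the rates use the earlier rule of deaths divided by population plus half the deaths.

```python
import math


def geometric_mean_population(p_first, p_second):
    """Mean of P * r**t for t from 0 to 1, i.e. P * (r - 1) / ln(r)."""
    r = p_second / p_first
    if r == 1:
        return float(p_first)  # no change between the censuses
    return p_first * (r - 1) / math.log(r)


def arithmetic_mean_population(p_first, p_second):
    """Half the sum of the two census populations."""
    return (p_first + p_second) / 2


def rate(mean_population, annual_deaths):
    """Annual deaths divided by (mean population + half the annual deaths)."""
    return annual_deaths / (mean_population + annual_deaths / 2)


annual_deaths = 2250 / 10  # one-tenth of the deaths of the ten years
geo = geometric_mean_population(10_000, 20_000)
ari = arithmetic_mean_population(10_000, 20_000)
print(round(geo), round(ari))
print(round(rate(geo, annual_deaths), 4), round(rate(ari, annual_deaths), 4))
```

With full-precision logarithms the geometrical mean comes out at about 14,427; the small difference from the 14,428 quoted in the text presumably reflects rounding in the logarithms used there. The two rates round to .0155 and .0149, as stated.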
So far we have assumed that we are dealing only with one age, but in practice, as the published census results show the numbers living between certain ages and not the number at each age, we have to calculate mean populations for these groups as well as for the total population, and we have afterwards to spread out the data so that rates of mortality can be found for every age.

It will seem at first sight that there is little difficulty in the first point, because we can calculate the mean population for each group independently, add up these mean populations and treat the total as the mean for the population as a whole. If we assume that the population in each age group is varying in arithmetical progression, then the sum of the separate mean populations obtained for the different groups will be equal to the mean population found from the total number living at the two census dates. If, however, we work on the basis of a population increasing in geometrical progression, we shall find that unless the rate of increase in every group is the same, we shall not necessarily obtain this equality. This result is only to be expected when the nature of the assumption made is considered, and does not, in itself, condemn the method of taking the geometrical mean population of each age group separately. It has, however, been considered by some a defect in the method, and in the latest Population Tables in this country the assumption of a geometrical progression has been made with regard to the total population only, the mean population for the different age groups being obtained by assuming that the proportion of the population in any group to the total varies in arithmetical progression.
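The following Python sketch illustrates the point with two invented age groups. It rests on one possible reading of the device, stated here as an assumption: the total is taken to move in geometrical progression as $N r^{t}$, each group's share of the total is taken to move in a straight line between the two censuses, and the group's mean is the average of share times total over the period. The function names and the census figures are hypothetical.

```python
import math


def geometric_mean(p0, p1):
    """Mean of a population moving from p0 to p1 in geometrical progression."""
    r = p1 / p0
    return float(p0) if r == 1 else p0 * (r - 1) / math.log(r)


def group_means_by_proportion(first_census, second_census):
    """Mean population per age group, assuming the total grows geometrically
    while each group's share of the total changes linearly over the period."""
    n0, n1 = sum(first_census), sum(second_census)
    r = n1 / n0
    if r == 1:
        # Constant total: the mean of a linearly changing share is its midpoint.
        return [(a + b) / 2 for a, b in zip(first_census, second_census)]
    log_r = math.log(r)
    mean_rt = (r - 1) / log_r                     # integral of r**t over [0, 1]
    mean_t_rt = r / log_r - (r - 1) / log_r ** 2  # integral of t * r**t over [0, 1]
    means = []
    for a, b in zip(first_census, second_census):
        share0, share1 = a / n0, b / n1
        means.append(n0 * (share0 * mean_rt + (share1 - share0) * mean_t_rt))
    return means


# Hypothetical census figures for two age groups (not from the text).
first = [6000, 4000]
second = [9000, 11000]

total_mean = geometric_mean(sum(first), sum(second))
separate = [geometric_mean(a, b) for a, b in zip(first, second)]
by_share = group_means_by_proportion(first, second)

print(round(total_mean), round(sum(separate)), round(sum(by_share)))
```

Because the shares always add up to unity, the group means obtained from the shares add up to the mean of the total, whereas the separately calculated geometrical means do not when the groups grow at different rates.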
It can be seen that by this device the objection is overcome; and further, that if we work on the ratios in this manner we shall always get equality between the sum of the mean grouped populations and the mean of the whole population, no matter what assumption we make as to the manner in which the population as a whole is varying.

Having thus calculated the mean population for the groups of ages given in the census returns, we have now to spread out the results in some way or other so as to obtain deaths and populations for each age on which to base our rates of mortality. Table XVII gives the facts for a small portion of a certain experience.
TABLE XVII
The data are shown graphically on a small scale in Figs. A and B. The rectangular blocks are proportional to the populations and deaths, and the lines running through them enable us to distribute the figures over the ages in the groups. As we are merely distributing we must see that the line drawn does not lessen the size of any block; in other words, the population read off from Fig. A for any one of the original groups will be the same whether we use the curved line or the straight one, but the population for any smaller age group will differ.

FIG. A. POPULATIONS. (Horizontal axis: age, 30 to 100.)

FIG. B. DEATHS. (Horizontal axis: age.)

The objections to the method are (1) that it is quite possible to draw many lines satisfying the necessary condition, and each line gives a different distribution and different values for the rates of mortality; (2) that it is quite likely that the figures obtained will not run very smoothly, and may require some adjustment afterwards if the results are to be useful in practical calculations; (3) that it is not easy to read off the areas from the curve with sufficient accuracy unless the drawing is on a very large scale. The method is, however, so simple that it has often been used, sometimes with considerable success. The Carlisle table was calculated on these lines by Joshua Milne in 1815, and the method is sometimes called by his name.

Another method is to assume that the rate of mortality deduced from the group is the rate for the central age of the group and to obtain other values by interpolation. This method is short and easy, but it is objectionable because it introduces systematic errors. These errors arise because the mortality increases with the age, and the rate of mortality for the group does not therefore give the rate for the central age of the group.

The method generally adopted is to interpolate by means of one or other of various algebraical methods that are available between the figures given for populations and deaths, and so produce figures that can be used for individual ages. Two alternatives are available in the application of the formula chosen. We may either split up the population in each age group to find the number at each age, or we can find the total population over certain ages by summation of the grouped figures, interpolate to find the population over each age, and then, by differencing the results thus obtained, arrive immediately at the number living at each age.
Thus, in the example given in Table XVII we can find the population at age 52 by dividing 5012 into 10 parts in some manner, or by summing the figures in column 2 to obtain the total population over 50, over 60, over 70, etc., interpolating to find the total population over 51, over 52, over 53, etc., and taking the difference between the total population over 52 and the total population over 53. The latter method is more convenient and, with the modification described below, is usually employed. It might also be adopted with the graphical method referred to above, but its utility is discounted by the immense scale on which the work has to be done to enable the calculator to read off from the graph the additional figures involved. On the other hand, it would simplify the reading from the graph, because we should read off the heights of the curve at each age instead of calculating the areas.

We may save a little more work yet in arranging the function for interpolation. The reader will remember that the rate of mortality or chance of dying in a year is

$$\frac{\text{Deaths}}{\text{Population} + \tfrac{1}{2}\,\text{deaths}};$$

the chance of living a year is therefore the difference between this and unity, namely

$$\frac{\text{Population} - \tfrac{1}{2}\,\text{deaths}}{\text{Population} + \tfrac{1}{2}\,\text{deaths}}.$$

One advantage of this second form over the first is that the numerator and denominator are more alike, and the same method of interpolation can therefore more properly be used for the two functions; another advantage is that a certain amount of arithmetic is saved because we only calculate "Population + ½ deaths" and "Population − ½ deaths" for a few age groups, and the interpolation and differencing then give the numerator and denominator for finding the probability of living a year without any further calculations. If we work with the deaths and populations we have afterwards to calculate the "Population + ½ deaths" for each age, which entails over 100 calculations.
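The summation, interpolation and differencing just described can be sketched in Python as follows. Several things here are assumptions made for illustration only: the grouped populations and deaths are invented, the function names are hypothetical, and a single polynomial through all the cumulated totals (Lagrange's form) stands in for whichever algebraical method an investigator might actually choose.

```python
def lagrange(xs, ys, x):
    """Value at x of the polynomial passing through the points (xs, ys)."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        term = yj
        for m, xm in enumerate(xs):
            if m != j:
                term *= (x - xm) / (xj - xm)
        total += term
    return total


def spread_by_age(group_starts, group_totals, width=10):
    """Distribute grouped totals over single ages by cumulating from the
    oldest group, interpolating the totals 'aged x and over', and differencing."""
    knots = group_starts + [group_starts[-1] + width]
    over = [sum(group_totals[i:]) for i in range(len(group_totals))] + [0.0]
    ages = range(knots[0], knots[-1] + 1)
    cumulative = {x: lagrange(knots, over, x) for x in ages}
    return {x: cumulative[x] - cumulative[x + 1] for x in range(knots[0], knots[-1])}


# Invented ten-year age groups starting at 30, 40, ..., 70 (not Table XVII).
starts = [30, 40, 50, 60, 70]
population = [8200, 6500, 5000, 3100, 1400]
deaths = [70, 95, 154, 160, 170]

numerator = [p - d / 2 for p, d in zip(population, deaths)]    # Population - 1/2 deaths
denominator = [p + d / 2 for p, d in zip(population, deaths)]  # Population + 1/2 deaths

living = spread_by_age(starts, numerator)
exposed = spread_by_age(starts, denominator)

# Probability of living a year at each age, with no further calculation needed.
p_x = {x: living[x] / exposed[x] for x in living}
print(round(sum(exposed[x] for x in range(50, 60)), 6), denominator[2])
```

Because the interpolating polynomial passes through every cumulated total, the single-age figures add back to the original group totals, which is the condition that no block be altered in size. As the text goes on to note, practical work uses blended or specially constructed interpolation formulas instead of one polynomial through all the terms.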
Even the process of interpolation is not altogether easy, as we have to complete a long series of values, and it would be impracticable to use all the given terms for this work. Interpolations are therefore made from various selections of terms, and the final results are artificially blended, or, preferably, the ordinary methods of interpolation are discarded and a method is used which, though a little artificial, gives continuity without the trouble of any independent blending process.

The above methods give good results for the main portion of a mortality table, but they are unsuitable for the first few years of life. During infancy and childhood the rate of mortality changes very rapidly, and it is found that the ages of children are frequently stated inaccurately, so that it is practically impossible to adopt satisfactorily any method of spreading the facts recorded for the first groups of ages. The exposed to risk have consequently to be calculated from the recorded births and deaths.

We will assume that the deaths at each age are available for the ten years 1901 to 1910, and that they can be obtained for years prior to 1901 if required. Of the deaths at age 0, i.e. those occurring before age 1, among children born in the year 1900, some will have occurred in 1900 and the remainder in 1901; and of the deaths at the same age among children born in 1901, some will have occurred in that year and the rest in 1902. Approximately, then, the deaths occurring in 1901 at age 0 must have been one-half of the total deaths under age 1 among children born in 1900 and 1901, or we may say that the exposed to risk corresponding to the deaths at age 0 in 1901 will be one-half of the births in the two years 1900 and 1901.
Similarly with other years: so that we conclude that the deaths under one year of age for the ten years 1901 to 1910 arose out of an exposed to risk made up of half the births in 1900, all the births in 1901 to 1909 inclusive, and half the births in 1910.

Now let us turn to age 1. Among the children born in any year, some will die before reaching age 1, and we must therefore deduct these in finding the exposed to risk for that age. Otherwise the method for finding the exposed to risk at age 1 is the same as that for age 0, except that the births must be taken for one year earlier. Consequently we reach the following as the exposed to risk at age 1, corresponding to the deaths at age 1 during 1901 to 1910: one-half of the births in 1899, all the births in 1900 to 1908 inclusive, and one-half of the births in 1909, less the deaths under age 1 in 1900 to 1909. When we come to the next age, deductions must be made for the deaths of two ages, and so on. The result may be set out as follows:

Exposed to risk corresponding to deaths during 1901 to 1910

Age 0. ½ births in 1900 + births in 1901 to 1909 + ½ births in 1910.
Age 1. ½ births in 1899 + births in 1900 to 1908 + ½ births in 1909 − deaths at age 0 in 1900 to 1909.
Age 2. ½ births in 1898 + births in 1899 to 1907 + ½ births in 1908 − deaths at age 0 in 1899 to 1908 − deaths at age 1 in 1900 to 1909.
Age 3. ½ births in 1897 + births in 1898 to 1906 + ½ births in 1907 − deaths at age 0 in 1898 to 1907 − deaths at age 1 in 1899 to 1908 − deaths at age 2 in 1900 to 1909.
Age 4. ½ births in 1896 + births in 1897 to 1905 + ½ births in 1906 − deaths at age 0 in 1897 to 1906 − deaths at age 1 in 1898 to 1907 − deaths at age 2 in 1899 to 1908 − deaths at age 3 in 1900 to 1909.

The exposed to risk found in this way never agrees exactly with the population given by the census.
This is partly because the exposed to risk gives the number at an exact age, while the population gives the number for all ages; in other words, the exposed is, as we have already seen, equal to the population with half the deaths added. We must therefore in the first place compare the total exposed to risk at these five ages, less half the deaths, with ten times the mean population found from the census figures. They will be unlikely to agree owing to the population not being stationary, emigration, immigration, etc. Since the deaths and population must correspond, however, the figures found above have to be modified to bring them into line, and this is done by increasing or decreasing each exposed to risk in the same proportion, so that their total after adjustment will be the same as the mean population under age 5 in the census returns increased by half the deaths. The deaths during the census period at each age divided by the exposed to risk so modified will give the rates of mortality.

This completes the calculation of the rates of mortality from census data, and we may conclude the chapter by summarising the method adopted:

1. Set out the population from two censuses and the deaths for the period between the two censuses in the age groups available.
2. Calculate the mean population for the total population.
3. Calculate the mean population for each age group, so that the mean population for the total population found in (2) is reproduced.
4. Spread out the mean population and the deaths so as to give particulars for each age.
5. Calculate the rate of mortality at each age by assuming that it is equal to the deaths divided by mean population plus half the deaths.
6. For the early ages, instead of using (4) and (5), calculate the exposed to risk by tracing the children born, and adjust the results to make them agree with the mean population. The rate of mortality is the deaths divided by adjusted exposed to risk.

An alternative method is sometimes used with satisfactory results when the population is not subject to violent fluctuations, and consists of working on a single census, and the deaths for, say, three years (see earlier in this chapter).
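The early-age calculation in step 6 can be sketched in Python as follows. The function names, the dictionary layout of the inputs and all the figures are hypothetical. The adjustment target is one reading of the text, stated here as an assumption: ten times the census mean population under age 5, increased by half the deaths of the ten years.

```python
def exposed_to_risk(age, births, deaths, first_year=1901, last_year=1910):
    """Exposed to risk at `age` for deaths in first_year..last_year.

    births[year] is the number of births in that year; deaths[(a, year)] is
    the number of deaths at age a in that year (both hypothetical inputs).
    """
    start = first_year - age - 1  # e.g. 1900 for age 0, 1899 for age 1
    end = last_year - age         # e.g. 1910 for age 0, 1909 for age 1
    exposed = births[start] / 2 + births[end] / 2
    exposed += sum(births[y] for y in range(start + 1, end))
    # Deduct the deaths at each younger age among the same children.
    for a in range(age):
        exposed -= sum(deaths[(a, y)] for y in range(start + a + 1, end + a + 1))
    return exposed


def adjust_to_census(exposed_by_age, deaths_by_age, mean_population, years=10):
    """Scale every exposed to risk in the same proportion so that the total
    equals `years` times the mean population plus half the deaths."""
    target = years * mean_population + sum(deaths_by_age) / 2
    factor = target / sum(exposed_by_age)
    return [e * factor for e in exposed_by_age]


# Invented data: 1000 births a year and constant yearly deaths at ages 0 to 4.
births = {year: 1000 for year in range(1896, 1911)}
yearly_deaths = [100, 30, 15, 10, 8]
deaths = {(a, y): yearly_deaths[a] for a in range(5) for y in range(1896, 1911)}

exposed = [exposed_to_risk(a, births, deaths) for a in range(5)]
period_deaths = [10 * d for d in yearly_deaths]  # deaths during 1901 to 1910

adjusted = adjust_to_census(exposed, period_deaths, mean_population=4400)
rates = [d / e for d, e in zip(period_deaths, adjusted)]
print(exposed)  # [10000.0, 9000.0, 8700.0, 8550.0, 8450.0]
```

With these invented figures the unadjusted exposed to risk follow the table of births and deductions given earlier, and the adjustment scales them all by the same factor before the rates are taken.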