You are reading a page from The Construction of Mortality and Sickness Tables, A Primer, W. Paline Elderton, Richard C. Fippard (1914)
Part of the American Term Life Insurance History Project

                CHAPTER VI
 MORTALITY TABLES FROM CENSUSES AND
              DEATH REGISTERS
In the preceding chapters we have dealt with the
treatment of data from which we can obtain the
history of each individual case, and the approxima-
tions resorted to for the sake of convenience may be
relied upon to give very accurate results.
  
We may now consider briefly the problem of
obtaining the rate of mortality from census records
and mortality returns where it is impracticable to
observe individual cases.  We must still keep clearly
before our minds that our final object is to obtain
the rates of mortality at each age, and as we cannot
obtain exactly the elements in our formula we must
endeavour to find a suitable basis for approximation.
For this purpose we may go back to our definition in
Chapter I and consider the numerical example in
which we imagined 10,000 persons aged 30, each of
whom was kept in sight until he attained age 31 or
until death, if it occurred before 31.   We concluded
that if there were 104 deaths the rate of mortality
at age 30 was .0104.  If these deaths occurred
regularly during the year we could dispense with the
enumeration of  the  people at the outset, as by
counting those surviving to age 31 or to any age
between 30 and 31, and adding the deaths after age
30 up to that time we should obtain the number
commencing at age 30.  Let us assume that the
figures related  to  the calendar year 1912, and see
what would have been the position of the community
of 10,000 persons on 1st July when 6 months of the
year under consideration had elapsed.  Fifty-two
persons would have died, and there would remain a
population of 9948 persons aged 30½ years.  A
census taken on that date would give this figure, and
we could deduce the exposed to risk by adding 6
months' deaths, i.e. 9948 + 52 = 10,000, and the rate
of mortality would be 104 ÷ 10,000 = .0104 as
before.  This shows us that if we are given the
number of persons aged 30½ alive on 1st July in
any year, and the number of deaths during the same
year at age 30 last birthday, we can find the rate of
mortality.
  
In practice, of course, it is quite impossible to
trace a number of people, all born on 1st January
in the same year, but if we take the population aged
30 last birthday on 1st July and the deaths aged 30
last birthday during the year, we can say that on the
average the birthday will be 1st January both for
population and deaths, and our results will be very
close to the  truth.   Similarly,  we  cannot  find  a
stationary population, but we may assume that the
population in the  middle of the year will be the
average population living during the year; in other
words, we shall arrive at approximately the same
figure if we take the result of a census on 1st July
as  if  we  traced  the  members  of  the  population
individually and made allowance for the time they
were members of the community the mortality of
which we are investigating.  By taking the population
on 1st July instead of 1st January we should thus
in ordinary circumstances overcome the difficulty of
finding a number of persons born on the same day,
and we should also make allowance for variations in
the population during the year due to migration and
increase or decrease consequent on a changing birth-
rate in former years.
  
Put generally, then, the rate of mortality at any
age is equal to the number of deaths in one year
divided by the population at the middle of the year
increased by half these deaths, the people included in
each count being those who attained the age in
question on their last birthday.
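
The rule just stated can be written out as a short
calculation.  The sketch below is illustrative only: the
function name rate_of_mortality is an assumption of this
example, and the figures are those of the 10,000 persons
aged 30 considered above.

```python
def rate_of_mortality(deaths, midyear_population):
    """Deaths in the year divided by the mid-year population
    increased by half those deaths."""
    exposed_to_risk = midyear_population + deaths / 2
    return deaths / exposed_to_risk


# Figures from the example in the text: 9948 persons alive on
# 1st July and 104 deaths during the year at age 30 last birthday.
q_30 = rate_of_mortality(deaths=104, midyear_population=9948)
print(round(q_30, 4))  # 0.0104
```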
  
The method we have outlined, although making
some allowance for variations, assumes that there has
been no violent fluctuation in the number of people
in the community, or that variations before and after
the middle of the year when the census is taken
counterbalance.   In such circumstances the method
will give good results, but it is awkward to apply in
practice, because censuses are not made yearly on 1st
July, and it is preferable to average the deaths over
a number of years to avoid fluctuations due to
epidemics, etc.  Provided the population is not
subject to erratic variations, these difficulties can,
however, be overcome to a very great extent by using
one census and the average number of deaths for three
years.  This
simple method often gives satisfactory results, but an
alternative method, which has been more commonly
used of recent years, is to take the mean population
based on two censuses for the population in the
formula for the rate of mortality, and the deaths used
are those occurring during the period between the two
censuses.  The reader will see at once that by using
more than one census it is easier to decide whether
the population is changing, and to what extent.  An
extreme case will show the danger of using only one
census: Consider, for instance, a newly populated
district in some out-of-the-way part of the world which
was unpopulated up to July in a certain year, when
immigration started, and 100 people settled in the
district, and before the end of the year two of them
died.   If  a  single  census  had  been taken  at  the
middle of the year, it would appear that there was
no population, and yet two deaths would have been
recorded!  This is, of course, a ridiculously extreme
case, but the use of two censuses and the deaths for
the intervening period would save one from a bad
error even here.
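
The single-census method, and the way it breaks down in
the extreme case just described, may be sketched as
follows.  The function name and the three yearly death
counts are hypothetical, chosen only for illustration;
the text does not say which three years should be taken.

```python
def rate_from_single_census(census_population, yearly_deaths):
    """Rate of mortality from one census and the average of the
    deaths recorded in several years."""
    average_deaths = sum(yearly_deaths) / len(yearly_deaths)
    return average_deaths / (census_population + average_deaths / 2)


# Hypothetical figures: a census of 9948 and three years' deaths.
steady = rate_from_single_census(9948, [98, 104, 110])

# The extreme case of the text: the census finds nobody, yet two
# deaths are recorded in the year.
absurd = rate_from_single_census(0, [2])

print(round(steady, 4))  # 0.0104
print(absurd)            # 2.0
```

A rate of 2 is impossible, since a rate of mortality cannot
exceed unity; the single census has simply missed the
population from which the deaths arose.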
  Our next problem, then, is to find the mean
population during the period between the dates of
the two censuses, and we must first see exactly what
we understand by this term.  It is simply the average
of  all the populations  there have been  during the
period.   If we want to find the batting average of
a cricketer we add together all the runs he has made
and divide the result by the number of completed
innings.   In the same way we should get our mean
population by adding together all the populations and
dividing by the number of enumerations.  Unfortun-
ately, however, while a cricket innings is complete in
itself, and  is divided by well-marked intervals from
other innings, a population is in a state of constant
change and varies from day to day on account of
deaths and removals.  In order to obtain an exact
result, therefore, we  should  have  to  take a census
every moment, whereas in actual practice the census
—at least in this country—is taken only once in ten
years!  We are thus faced  with the difficulty of
deciding from the results of two censuses, separated
by an interval of ten years, what has been the popula-
tion at different times within the ten years, or, which
is practically the same  thing, in what  manner  the
population has varied.
  
It  will be  seen  at  once  how awkward this is:
because it would be very difficult to guess a mean
population correctly from these two censuses alone.
Some idea might perhaps be obtained from the deaths,
because the deaths depend on the population; but in
practice the helpfulness of the deaths is discounted
to a large extent by their fluctuations, which disguise
variations due merely to the changes in the population.
  
The following diagram may help to make the
position clearer:—


    
A D and B C represent the population of a community
in 1901 and 1911 respectively, and we want to find
the average population during the period.  Another
way of setting the problem is to say that we know
A D and B C and we want to know the area of the
whole block A B C D, but do not know how the line A B
ought to be drawn.  We could draw it in so many
ways.  We could make it a straight line; or a curve,
running either above or below the straight line; or
we could make it twist about in all sorts of ways.
A few of the many possible solutions are shown in
the diagram by dotted lines.  Think how different
the area is in almost every case, and how unlikely
we should be to guess the right distribution of
population which is shown  by the complete line.
The problem would be very much easier if we were
given E F, the population in 1906, but unfortunately
this is not available.
  
The result of this difficulty has been that people
have "theorised," and the theory was set out that
the population increased in geometrical progression.
This theory seems to rest on the idea that population
begets population, and that therefore if the population
is  100  this year and double this number next year,
it  will again  double  itself  the  next year,  and so on.
The reader must remember that this is simply
assumption, but it is convenient in so far as it enables
us to find the population at any moment, and so obtain
the sum of all the populations without much trouble.
In some cases where it can be tested it seems not
unreasonable.  It can be very far out in other cases,
and a decrease in marriage-rate, postponement of age of
marriage, a changing death-rate, or many other causes,
could upset the geometrical progression.  This can be
shown by taking a somewhat extreme case.  There
are certain populations which are now decreasing,
although at one time they were increasing.  At some
time between say 1903 and  1913 the population
changed from an increasing one to a decreasing one.
The population was in fact like the top line of our
figure, and the geometrical progression which is below
the  straight line  gives a worse  result than the
straight   line   which  represents  an  arithmetical
progression.
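
The point can be illustrated numerically.  In the sketch
below the population curve is entirely hypothetical (it
rises and then falls between the two censuses), and the
geometrical mean is computed from the formula given in the
following paragraph, written with the natural logarithm.
The "true" mean is approximated by averaging a large
number of equally spaced enumerations.

```python
import math


def arithmetical_mean(first, second):
    return (first + second) / 2


def geometrical_mean(first, second):
    # P (r - 1) / log_e r, where r is the ratio of the two censuses.
    r = second / first
    return first * (r - 1) / math.log(r)


def population(t):
    """Hypothetical population that rises and then falls; t runs
    from 0 (first census) to 1 (second census)."""
    return 10000 + 8000 * t - 6000 * t ** 2


# Average of many equally spaced enumerations between the censuses.
n = 10000
true_mean = sum(population((k + 0.5) / n) for k in range(n)) / n

first, second = population(0), population(1)
print(round(true_mean))                         # 12000
print(round(arithmetical_mean(first, second)))  # 11000
print(round(geometrical_mean(first, second)))   # 10970
```

Both assumptions understate the mean here, but the
geometrical progression, lying below the straight line,
understates it by rather more.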
  
A geometrical rate of increase is, however, frequently
assumed, and the consequent mean population is

$$P \cdot \frac{r-1}{\log r} \times (.434\ldots),$$

where $P$ is the population at the first census, $rP$
the population at the second census, and the logarithm
is the common logarithm to base 10.¹
  
If,  however,  instead  of  assuming  a  geometrical
progression it is assumed that the population was
increasing in arithmetical progression, the mean is
one-half the sum of  the populations at the two
censuses, and in  cases in which  the assumption  is
correct  this  is  the  same as the population at the
middle of the term.
  
¹ The formula is obtained by the application of the integral
calculus.  On the given assumptions the population at any time
$t$, measured as a fraction of the ten years, is $Pr^{t}$, and the
mean population during the ten years is therefore
$\int_0^1 Pr^{t}\,dt = P\,(r-1)/\log_e r$.

In order to test the effect of these different
assumptions let us consider the case of a population
of 10,000 on 1st January 1901 which had increased to
20,000 by 1st January 1911.  The mean population
would be 14,428 on the assumption of a geometrical
progression and  15,000  on the assumption of an
arithmetical progression, and if  we  further assume
that  2250  deaths  occurred  during  the  ten  years
1901-1910 the corresponding rates of mortality
would be .0155 and .0149 respectively, showing a
difference of 4 per cent.  In calculating these rates
we must take one-tenth of the 2250 deaths, because
these deaths relate to ten years.  This is an extreme
example, of course, but it shows that the assump-
tions made may have considerable influence on our
results.
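
The figures of this example can be checked with a few
lines of code.  This is a sketch only: the variable names
are assumptions, and the rates are computed by adding half
the yearly deaths to the mean population, as in the rule
stated earlier in the chapter.  Exact calculation gives a
geometrical mean of about 14,427; the 14,428 quoted above
presumably reflects the use of rounded logarithms.

```python
import math

first_census = 10000    # 1st January 1901
second_census = 20000   # 1st January 1911
deaths_in_ten_years = 2250

r = second_census / first_census
geometrical_mean = first_census * (r - 1) / math.log(r)
arithmetical_mean = (first_census + second_census) / 2

# One-tenth of the deaths, because they relate to ten years.
yearly_deaths = deaths_in_ten_years / 10


def rate(mean_population):
    return yearly_deaths / (mean_population + yearly_deaths / 2)


print(round(geometrical_mean))             # 14427
print(round(arithmetical_mean))            # 15000
print(round(rate(geometrical_mean), 4))    # 0.0155
print(round(rate(arithmetical_mean), 4))   # 0.0149
```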
  
So far we have assumed that we are dealing only
with one age, but in practice, as the published census
results show the numbers living between certain ages
and not the number at each age, we have to calculate
mean populations for these groups as well as for the
total population, and we have afterwards  to  spread
out the data so that rates of mortality can be found
for every age.   It will seem at first sight that there
is little difficulty  in  the  first  point, because  we  can
calculate the mean population for each group
independently; add up these mean populations and
treat the total as the mean for the population as a
whole.
  
If  we assume  that the  population  in  each age
group is varying in arithmetical progression, then
the sum of the separate mean populations obtained
for the different groups will be equal to the mean
population found from the total number living at the
two census dates.
  If, however, we  work  on the basis of a  population
increasing in geometrical progression, we shall find
that unless the rate of increase in every group is the
same, we shall not necessarily obtain this equality.
This result is only to be expected when the
nature of the assumption made is considered, and
does not, in itself, condemn the method of taking the
geometrical  mean population  of  each  age  group
separately.   It has, however, been considered by some
a defect in the method, and in the latest Population
Tables in this country the assumption of a geometrical
progression has been made with regard to the total
population only, the mean population for the different
age groups being obtained by assuming that the pro-
portion of the population in any group to the total varies
in arithmetical progression.   It can be seen that by
this device the objection is  overcome; and further,
that if we work on the ratios in this manner we shall
always get equality between the sum of the mean
grouped populations and the mean of  the whole
population, no matter what assumption we make as
to the manner in which the population as a whole
is varying.
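
The three statements above can be tried on a small
example.  The two age groups and their numbers are
hypothetical.  The treatment of the ratios is one simple
reading of the device, assumed here for illustration: the
mean proportion of each group is taken as half the sum of
its proportions at the two censuses and is applied to the
mean total population.

```python
import math


def arithmetical_mean(first, second):
    return (first + second) / 2


def geometrical_mean(first, second):
    r = second / first
    if r == 1:
        return float(first)
    return first * (r - 1) / math.log(r)


# Hypothetical age groups at the two censuses.
first = {"under 40": 4000, "40 and over": 6000}
second = {"under 40": 4400, "40 and over": 9600}
total_first, total_second = sum(first.values()), sum(second.values())

# Arithmetical progression: group means add up to the total mean.
arith_groups = sum(arithmetical_mean(first[g], second[g]) for g in first)
arith_total = arithmetical_mean(total_first, total_second)

# Geometrical progression group by group: the sum differs.
geom_groups = sum(geometrical_mean(first[g], second[g]) for g in first)
geom_total = geometrical_mean(total_first, total_second)

# Device of the text (one reading, assumed here): mean proportion of
# each group, taken arithmetically, applied to the mean total.
by_ratio = {
    g: arithmetical_mean(first[g] / total_first, second[g] / total_second)
    * geom_total
    for g in first
}

print(arith_groups, arith_total)
print(round(geom_groups), round(geom_total))
print(round(sum(by_ratio.values())))
```

Because the proportions of the groups always add up to
unity, the last total agrees with the mean of the whole
population whatever mean is used for the total.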
  
Having thus calculated the mean population for
the groups of ages given in the census returns, we have
now to spread out the results in some way or other
so as to obtain deaths and populations for each age on
which to base our rates of mortality.
  
Table XVII gives the facts for a small portion of
a certain experience.

TABLE XVII

  
[FIG. A. POPULATIONS.  Horizontal axis: age, 30 to 100.]
[FIG. B. DEATHS.  Horizontal axis: age, 30 to 90.]

The data are shown graphically on a small scale
in Figs. A and B.  The rectangular blocks are proportional
to the populations and deaths, and the lines
running through them enable us to distribute the
figures over the ages in the groups.  As we are
merely distributing we must see that the line drawn
does not lessen the size of any block; in other words,
the population read off from figure A for any one of
the original groups will be the same whether we use
the curved line or the straight one, but the population
for any smaller age group will differ.  The objections
to the method are (1) that it is quite possible to draw
many lines satisfying the necessary condition, and
each line gives a different distribution and different
values for the rates of mortality; (2) that it is quite
likely that the figures obtained will not run very
smoothly, and may require some adjustment afterwards
if the results are to be useful in practical calculations;
(3) that it is not easy to read off the areas
from the curve with sufficient accuracy unless the
drawing is on a very large scale.  The method is,
however, so simple that it has often been used;
sometimes with considerable success.  The Carlisle
table was calculated on these lines by Joshua Milne
in 1815, and the method is sometimes called by his
name.
  
Another method is to assume that the rate of
mortality deduced from the group is the rate for the
central age of the group and to obtain other values
by interpolation.  This method is short and easy, but
it  is  objectionable  because  it  introduces  systematic
errors.    These  errors  arise  because  the  mortality
increases with the age, and the rate of mortality for
the group does not therefore give the rate for the
central age of the group.
  
The method generally adopted is to interpolate by
means of one or other of various algebraical methods
that are available  between  the figures given for
populations and deaths, and so produce figures that
can be used for individual ages.
  
Two alternatives are available in the application of
the formula chosen.  We may either split up the
population in each age group to find the number at
each age, or we can find the total population over
certain ages by summation of the grouped figures,
interpolate to find the population over each age, and
then, by differencing the results thus obtained, arrive
immediately at the number living at each age.  Thus,
in the example given in Table XVII we can find the
population at age 52 by dividing 5012 into 10 parts in
some manner, or by summing the figures in column
2 to obtain the total population over 50, over 60,
over 70, etc., interpolating to find the total population
over 51, over 52, over 53, etc., and taking the differ-
ence between the total population over 52 and the
total  population  over  53.   The  latter  method  is
more convenient and, with the modification described
below, is usually employed.   It might also be adopted
with the graphical method referred to above, but its
utility is discounted by the immense scale on which
the work has to be done to enable the calculator to
read off from the graph the additional figures in-
volved.   On the other hand, it would simplify the
reading from the graph, because we should read off
the heights of the curve at each age instead of calcu-
lating the areas.
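
The second alternative (summing, interpolating, and
differencing) can be sketched as follows.  Several things
here are assumptions made for illustration: the text does
not name an interpolation formula at this point, so a
cubic through four neighbouring totals is used; and every
grouped figure except the 5012 for ages 50 to 59 is
hypothetical.

```python
def lagrange(xs, ys, x):
    """Value at x of the polynomial through the points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Ten-year age groups; only the 5012 is taken from the text.
group_start = [30, 40, 50, 60, 70, 80, 90]
group_population = [7410, 6295, 5012, 3470, 1780, 510, 40]

# Total population over each group's starting age, by summation.
over = {}
running = 0
for age, pop in zip(reversed(group_start), reversed(group_population)):
    running += pop
    over[age] = running

# Interpolate the totals over 50, 51, ..., 60 from four known totals.
knots = [40, 50, 60, 70]
values = [over[a] for a in knots]
over_age = {x: lagrange(knots, values, x) for x in range(50, 61)}

# Differencing gives the number living at each age.
at_age = {x: over_age[x] - over_age[x + 1] for x in range(50, 60)}

print(round(at_age[52]))
print(round(sum(at_age.values())))  # 5012
```

Since the interpolated curve passes through the known
totals, the ten single ages necessarily add up to the
original group, which is the condition required of any
method of distribution.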
  
We may save a little more work yet in arranging
the function for interpolation.  The reader will
remember that the rate of mortality or chance of dying
in a year is

$$\frac{\text{Deaths}}{\text{Population} + \tfrac{1}{2}\,\text{deaths}},$$

the chance of living a year is therefore the difference
between this and unity, namely

$$\frac{\text{Population} - \tfrac{1}{2}\,\text{deaths}}{\text{Population} + \tfrac{1}{2}\,\text{deaths}}.$$
  One advantage of this second form over the first
is  that  the  numerator  and  denominator  are  more
alike,  and  the  same  method  of  interpolation  can
therefore more properly be used for the two functions;
another advantage is that a certain amount of
arithmetic is saved because we only calculate
"Population + ½ deaths" and "Population − ½ deaths" for
a few age groups, and the interpolation and differencing
then give the numerator and denominator for
finding the probability of living a year without any
further calculations.  If we work with the deaths
and populations we have afterwards to calculate the
"Population + ½ deaths" for each age, which entails
over 100 calculations.
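
The step from the first form to the second is a single
subtraction.  It is written out below with $P$ for the
population and $D$ for the deaths (symbols adopted here
only for brevity), and checked on the figures for age 30
used at the beginning of the chapter.

```latex
\[
1 - \frac{D}{P + \tfrac12 D}
  = \frac{(P + \tfrac12 D) - D}{P + \tfrac12 D}
  = \frac{P - \tfrac12 D}{P + \tfrac12 D},
\qquad
\frac{9948 - 52}{9948 + 52} = \frac{9896}{10000} = .9896 .
\]
```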
  
Even the process of interpolation is not altogether
easy, as we have to complete a long series of values,
and it would be impracticable to use all the given
terms for this work.  Interpolations are therefore
made from various selections of terms, and the final
results  are  artificially  blended,  or,  preferably,  the
ordinary methods of interpolation are discarded and
a method is used which, though a little artificial, gives
continuity without the trouble of any independent
blending process.
  
The above methods give good results for the main
portion of a mortality table, but they are unsuitable
for the first few years of life.   During infancy and
childhood the rate of mortality changes very rapidly,
and it is found that the ages of children are frequently
stated inaccurately, so that it is practically impossible
to adopt satisfactorily any method of spreading the
facts recorded, for the first groups of ages.   The ex-
posed to risk have consequently to be calculated from
the recorded births and deaths.  We will assume
that the deaths at each age are available for the
ten years  1901  to  1910, and that they can be
obtained for years prior to 1901 if required.   Of the
deaths at age 0, i.e. those occurring before age 1,
among children born in the year 1900, some will
have occurred in 1900 and the remainder in 1901;
and of the deaths at the same age among children
born in 1901, some will have occurred in that year
and the rest in 1902.   Approximately, then, the
deaths occurring in 1901 at age 0 must have been
one-half of the total deaths under age 1  among
children born in 1900 and 1901, or we may say that
the exposed to risk corresponding to the deaths at
age 0 in 1901 will be one-half of the births in the
two years 1900 and 1901.
  
Similarly with other years: so that we conclude
that the deaths under one year of age for the ten
years 1901 to 1910 arose out of an exposed to risk
made up of half the births in 1900, all the births in
1901 to 1909 inclusive, and half the births in 1910.
  
Now let us turn to age 1. Among the children
born in any year, some will die before reaching age 1,
and we must therefore deduct these in finding the
exposed to risk for that age.  Otherwise the method
for finding the exposed to risk at age 1 is the same
as that for age 0, except that the births must be taken
for one year earlier.    Consequently  we reach  the
following as the exposed to risk at age 1, correspond-
ing to the deaths at age 1 during 1901 to 1910:
one-half of the births in 1899, all the births in 1900
to 1908 inclusive, and one-half of the births in 1909,
less the deaths under age 1 in 1900 to 1909.
  
When we come to the next age, deductions must
be made for the deaths of two ages, and so on.  The
result may be set out as follows:—
    
  Exposed to risk corresponding to deaths during 1901
                          to 1910
Age.
 0.      ½ births in 1900 + births in 1901 to 1909
           + ½ births in 1910.
 1.      ½ births in 1899 + births in 1900 to 1908
           + ½ births in 1909 - deaths at age 0 in
           1900 to 1909.
 2.      ½ births in 1898 + births in 1899 to 1907
           + ½ births in 1908 - deaths at age 0 in
           1899 to 1908 - deaths at age 1 in 1900
           to 1909.
 3.      ½ births in 1897 + births in 1898 to 1906
           + ½ births in 1907 - deaths at age 0 in
           1898 to 1907 - deaths at age 1 in 1899
           to 1908 - deaths at age 2 in 1900 to
           1909.
 4.      ½ births in 1896 + births in 1897 to 1905
           + ½ births in 1906 - deaths at age 0 in
           1897 to 1906 - deaths at age 1 in 1898
           to 1907 - deaths at age 2 in 1899 to
           1908 - deaths at age 3 in 1900 to 1909.
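
The pattern of the schedule can be expressed as a single
rule and applied to any of the five ages.  In the sketch
below the function name, the layout of the data, and the
birth and death figures are all hypothetical; only the
rule itself is taken from the schedule above.

```python
def exposed_to_risk(age, births, deaths, first_year=1901, last_year=1910):
    """Exposed to risk at `age` for deaths in first_year..last_year.

    births[year] is the number of births in that year;
    deaths[(k, year)] is the number of deaths at age k in that year.
    """
    start = first_year - 1 - age   # births of this year count one-half
    end = last_year - age          # births of this year count one-half
    exposed = births[start] / 2 + births[end] / 2
    exposed += sum(births[y] for y in range(start + 1, end))
    # Deduct ten years of deaths at each younger age k; the period
    # moves back one year for each step further below `age`.
    for k in range(age):
        shift = age - 1 - k
        years = range(first_year - 1 - shift, last_year - shift)
        exposed -= sum(deaths[(k, y)] for y in years)
    return exposed


# Hypothetical data: 1000 births a year, and a fixed number of
# deaths a year at each of the ages 0 to 3.
births = {year: 1000 for year in range(1896, 1911)}
deaths_per_year = {0: 100, 1: 30, 2: 15, 3: 10}
deaths = {(k, year): d
          for k, d in deaths_per_year.items()
          for year in range(1896, 1911)}

for age in range(5):
    print(age, exposed_to_risk(age, births, deaths))
```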
  The exposed to risk found in this way never agrees
exactly with  the population given by  the census.
This is partly because the exposed to risk gives the
number at an exact age, while the population gives the
number for all ages—in other words, the exposed is,
as we have already seen, equal to the population with
half the deaths added.   We must  therefore in  the
first place compare the total exposed to risk at these
five ages, less half the deaths, with ten times the mean
population found from the census figures.  They will
be unlikely to agree owing to the population not
being stationary, emigration, immigration, etc.  Since
the deaths and population must correspond, however,
the figures found above have to be modified to bring
them into line, and this is done by increasing or de-
creasing each exposed to risk in the same proportion,
so that their total after adjustment will be the same
as the mean population under age 5 in the census
returns increased by half the deaths.  The deaths
during the census period at each age divided by the
exposed to risk so modified will give the rates of
mortality.
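
The adjustment may be sketched as follows.  All the
figures are hypothetical, and the function name is an
assumption.  The target total is taken as ten times the
mean population under age 5, increased by half the deaths
of the ten years; the multiple of ten follows the
comparison described above, since the exposed to risk
relates to ten years of deaths.

```python
def adjusted_rates(exposed, deaths, mean_population_under_5, years=10):
    """Scale each exposed to risk by a common factor so that the
    total equals `years` times the census mean population plus half
    the deaths, then divide the deaths by the adjusted exposed."""
    target_total = years * mean_population_under_5 + sum(deaths) / 2
    factor = target_total / sum(exposed)
    adjusted = [e * factor for e in exposed]
    rates = [d / e for d, e in zip(deaths, adjusted)]
    return factor, adjusted, rates


# Hypothetical figures for ages 0 to 4 over the ten years.
exposed = [10000, 9000, 8700, 8550, 8450]
deaths = [1000, 300, 150, 100, 80]
mean_population_under_5 = 4500

factor, adjusted, rates = adjusted_rates(
    exposed, deaths, mean_population_under_5)

print(round(factor, 4))
print([round(q, 4) for q in rates])
```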
  
This completes the calculation of the rates of
mortality from census data, and we may conclude the
chapter by summarising the method adopted:—
     
     1. Set out the population from two censuses
           and the deaths for the period between the
           two censuses in the age groups available.
     2. Calculate the mean population for the total
           population.
     3. Calculate the mean population for each age
           group, so that the mean population for
           the total population found in (2) is
           reproduced.
     4. Spread out the mean population and the
           deaths so as to give particulars for each
           age.
     5. Calculate the rate of mortality at each age
           by assuming that it is equal to the deaths
           divided by mean population plus half the
           deaths.
     6. For the early ages, instead of using (4) and
           (5), calculate the exposed to risk by
           tracing the children born, and adjust the
           results to make them agree with the
           mean population.  The rate of mortality
           is the deaths divided by adjusted exposed
           to risk.
  An alternative method is sometimes used with
satisfactory results when the population is not subject
to violent fluctuations, and consists of working on a
single census, and the deaths for, say, three years,
as described earlier in this chapter.