You are reading a page from An Essay on Probabilities and their Application to Life Contingencies and Insurance Offices, Augustus de Morgan (1838)
Part of the American Term Life Insurance History Project
ON ERRORS OF OBSERVATION.    129
have not been accurate, it is still reasonably probable that the average of their guesses would be nearly true ; the limits of error would certainly be larger. It is the object of the present chapter to show how the theory of probabilities must be applied to the detection of the most probable result, when various observations are discordant with each other.
Error, as used in this part of the subject, merely means discordance of which the cause is unknown. In the different branches of physics, in their application to the arts, &c. &c., that which we signify by the preceding word may arise from various causes, the chief of which are here enumerated.
  1. From some law of nature not known to the observer. Thus before the discovery of the aberration of light, all the small yearly changes which the places of stars receive from that cause, only appeared in the form of embarrassing differences between the observations of different months. Those who used astronomical instruments might suspect the existence of some unknown motion in the heavenly bodies; they might think it extremely improbable that their improvements in the art of observing should permit purely casual discrepancies of so large an amount as those which occurred : but still, so long as no account of the magnitude of these possible results of law could be given, those who observed could in no manner distinguish them from the imperfections of the instruments or of the human senses. But had it been shown that these discrepancies were always the same at the same time of the year, for any one star, they would then have ceased to be errors, and would have become the objects of prediction, as soon as one year's observations had been registered. The physical cause might or might not have been subsequently discovered, without altering the state of the question : the certainty of a phenomenon is all that is required to remove it from the domain of probability.
  2. From the personal constitution of the observer ; by which is meant, not that general facility of misperception which is common to all the human race, but that particular habit or temperament which causes some to differ from persons in general in their method of perceiving. Thus it will frequently happen that when two observers note the time of a phenomenon, by the same watch, one will always see the event, or imagine he sees it, before the other does the same. This personal error, which is seldom large, is beginning to receive the attention of observers. It is not perceptible as long as the natural data of a science remain imperfectly known, being mixed up and lost in errors of greater magnitude; but it produces discoverable effects so soon as the science approaches towards accuracy.
  3. From fixed sources of error peculiar either to the species of apparatus employed, or to the individual instrument with which the observations are made. This answers precisely to the personal error of the observer in its effects : it matters nothing whether the clock be one second too fast, or the observer, in the result of the observation.
  4. From the imperfection of human senses and instruments. To note a measurable phenomenon without any error at all, would require sight and touch by which every magnitude, however small, could be perceived and correctly estimated. Such senses belong to no one, and the degree of approach towards perfection not only varies with the observer, but is different at different times with the same observer. Many errors to which instruments are subject ought in strictness to be classed under the first head ; if, for instance, an astronomical circle gradually change its form, or undergo daily expansion and contraction by variations of temperature, the diversity of results which such a piece of brass will shew are certainly subject to laws, and might be predicted, if we possessed sufficient knowledge of the constitution of the metal, and the laws which regulate the effect of pressure, temperature, moisture, &c. upon it. But so long as such laws are unknown, and the variations do not follow any distinguishable rule, their effect upon general results differs in nothing perceptible from that of the observer's own errors, with which they are mixed up in the particular results of observation.
Before any trials are made, that is, before any thing is known of the character of the observer, of the instrument he uses, or the perceptibility of the phenomenon, we can have no reason to suppose that any one observation is more likely to exceed the truth than to fall short of it. When any observation is greater than the reality, the error is called positive; when less, negative. The hypothesis, therefore, of an equal presumption for positive and negative errors, is one with which we must commence ; and it follows from the supposition that the average is the most probable result of a number of discordant observations. The sum of all the observations will be affected by the balance of all the errors, but will be without error itself if the amount of the positive errors be equal to that of the negative ones. This last supposition, though not probable in itself, is nevertheless more probable than any other, and the odds are very much in favour of its being nearly true. Now whatever may be the error of the sum of observations, say 100 in number, the average, or the hundredth part of the sum, contains only the hundredth part of that error; and the presumption that such an average is very near indeed to the truth, greatly exceeds the probability in favour of any one of the observations.
But before we proceed further, it becomes necessary to ask what laws of error can be absolutely determined, and shown in the nature of things to exist? And first, what do we mean by a law of error?
Let A B be a length to be measured* or estimated, subject to error of observers and instruments ; and let the greatest possible errors be B K (negative), and B L (positive) ; so that the result of measurement may be any thing between A K and A L. Let K B and B L be equal, and suppose positive and negative errors equally likely.
* The second figure is an enlargement of part of the first.
The probability of any one measurement giving exactly any predicted result, say A N, must be inappreciably small ; since A N is only one out of an infinite number of possible cases. But take any point V, however near to N, and the chance that the result of an observation shall lie between AN and AV is capable of being imagined to be finite, though small. Whatever the law of errors may be, let a curve KCL be so drawn that the chance of something between AN and AV shall be the proportion which the area N P Q V is of the whole area KCL. Or if we call the whole area 1, the area R P N M, for example, is that fraction which expresses the chance of a result lying between AM and AN. The symmetry of the curve on the two sides of CB is an expression of the hypothesis of positive and negative errors being equally likely : and the approach of C K and CL towards the axis is equally an expression of the supposition that large errors are not so likely as small ones. If we wish to express that errors of all magnitudes are very nearly equally likely, we draw such a curve as the upper dotted line : while, if we wish to express that the presumptions are very strong for each measure being very nearly true, we describe the lower dotted line. We can thus figure to the eye a representation of the law of errors, be it what it may ; and the description of a law of error is that of a curve.
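The description of a law of error by a curve may be sketched numerically. What follows is only an illustration: the triangular curve, the limits -1 and +1, and the names `curve` and `area_under` are assumptions made for the example and are not taken from the text. The chance of a result falling between two points is computed as the proportion which the area standing over that interval bears to the whole area.

```python
def curve(x):
    """Hypothetical law of error: a triangle on [-1, 1] with its summit at 0."""
    return max(0.0, 1.0 - abs(x))


def area_under(f, a, b, steps=2000):
    """Trapezoidal approximation to the area under f between a and b."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


# Whole area under the curve, from the greatest negative to the greatest positive error.
whole = area_under(curve, -1.0, 1.0)

# Chance of an error between 0.2 and 0.3, as a fraction of the whole area.
chance = area_under(curve, 0.2, 0.3) / whole

# Symmetry of the curve: the mirrored interval has the same chance.
mirror = area_under(curve, -0.3, -0.2) / whole

print(chance, mirror)
```

A flatter curve would spread the same unit of area more evenly over the interval, and a sharper one would gather it near the middle, which is the distinction the two dotted lines are meant to convey.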
Generally speaking, it is impossible to commit an error of more than a certain magnitude : but this circumstance is one which embarrasses the mathematical treatment of the subject exceedingly. It is practically the same thing to consider any error, however great, as possible, but errors of more than a certain magnitude as extremely improbable. If, for instance, a case should arise in which an error of more than an inch is impossible, let it be agreed to consider such an error as not impossible, but so improbable that there shall not be an even chance of its happening once in a million of times. The curve which exhibits the law of error must then be of the following kind, never meeting the axis, but continually approaching towards it, so that the whole of its area from and after L, is incomparably small by the side of the area C B L.

The curve here drawn is something like that in p. 17., and we might suspect from the utility of the latter and of the tables derived from it, that it should play an important part in the present subject. This we shall presently see : in the mean while I proceed to explain the sense in which I use the terms average balance, average error, mean risk of error and probable error, to which I direct the reader's particular attention. When we consider the errors of different kinds as balancing each other, it follows that positive and negative errors being equally likely, the balance of all the errors will be trivial in a very great number of observations. The average of all the errors will be extremely small: and, in the long run, nothing. In this instance then, it is said that the average balance of error is nothing. But if positive and negative errors were not equally likely, the more probable class would, in the long run, predominate, and the average balance of error would be of a definite magnitude, positive when positive error is more likely than negative, and vice versa.
The preceding has nothing to do with the average of the absolute magnitudes of errors, considered without reference to the distinction of positive and negative. For example, whether the greatest error may be a mile or an inch, it is equally true that the long run will establish a compensation, when positive and negative errors are equally likely. But in the former case, the average magnitude of the errors which occur will, cæteris paribus, as much exceed that in the latter, as a mile does an inch. This average magnitude of errors, independently of sign, is an important element of the whole question, because a tolerably probable estimation of its value can be found from the observations. Suppose, for instance, that fifteen observations or estimations gave the following results.*
 722   1311    967   1309
 933   1089   1344    858
1033    972   1250   1029
 917   1294    744
The average of these is 1051, and assuming this as the true result, the errors are,
 329    260     84    258
 118     38    293    193
  18     79    199     22
 134    243    307
* These are not numbers written at hazard, but actual results of estimation, on a subject which it is not here necessary to explain.
The average of the errors is 172 ; subject of course to the supposition that the average of the observations is nearly true. But the error of the last assumption must be considerable before it can much affect the present result.
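The arithmetic of this instance can be checked with a short sketch. The variable names are chosen for the illustration only; the fifteen values are those of the table above, and the average is rounded to the nearest whole number, as the text appears to do.

```python
# The fifteen observations or estimations given in the text.
observations = [722, 1311, 967, 1309,
                933, 1089, 1344, 858,
                1033, 972, 1250, 1029,
                917, 1294, 744]

# Average of the observations, rounded to a whole number.
average = round(sum(observations) / len(observations))

# Errors taken without reference to sign, the average being assumed true.
errors = [abs(x - average) for x in observations]

# Average magnitude of the errors, independently of sign.
average_error = sum(errors) / len(errors)

print(average, errors, round(average_error))
```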
The average of all the errors, taken without reference to sign, will, in the present hypothesis, be the same thing in the long run as the average of positive error, or the average of negative error. The reason is, that the number of positive and negative errors will in the long run be equal, and also their sums. If, in a very large number of observations, there be s positive and s negative errors, and if the sum of each set be S, it is evident that the sth part of S (which is the average positive error, or the average negative error), is the same as the 2sth part of 2S (which is the average error without reference to sign). But it is customary to prefer to the average error another function of the errors, which may be called the mean risk of error, and which differs from the average in the following manner. Every error, positive or negative, is an increase or diminution of the final result ; just as every game won or lost at gambling is an increase or diminution of the stock of the player. On precisely the same principles as those explained in chapter V, I may consider an even chance of an error 2 as a thing to be compounded for by a certainty of an error 1. If, in a large number of observations 2s, there will come a sum of positive errors equal to S, and the same of negative errors, and if, as in taking the average, the sum of the errors be the only material point, it may be considered that every observation will have either a positive error or a negative error, of the value of the sth part of S. The results of such a supposition will, in the long run, and so far as the sum of errors is concerned, represent the actual case under consideration. If, therefore, a person could compound for positive errors alone, leaving the negative ones to chance, he must suppose every observation to have the half of S ÷ s, or S ÷ 2s of positive error,
combined with such a negative error as chance may yield. That is, he must suppose all the negative errors altered by the introduction of such an additional positive error, and each of the positive errors increased or reduced to S ÷ 2s. And the same if he would compound for negative errors only: while, to compound for both, he must suppose every observation affected both by a positive error of S ÷ 2s and by a negative error of the same amount. This latter case supposes every observation to be correct, which is the result in the long run. The use of this consideration is, to keep before the mind the average effect of positive error, not upon those observations which have positive error, but upon all the observations; and the same for negative errors. Suppose I have an instrument which makes positive and negative errors in equal numbers and to equal amounts, in the long run ; and suppose that it is in my power totally to destroy positive error, leaving the chance of negative error as before. What is the effect upon the whole system of errors, positive and negative, one with another? What must I do with all the errors to reproduce the same amount of absolute error as before ? I must affect every observation with one half of the average amount of positive error, or one half the magnitude which the positive errors have, one with another. Or look at it in this way ; if I have to pay a shilling for every unit of positive error, for how much should another person take the risk off my hands, that is, for how much per observation, whether its error be positive or negative? If the average positive error were 1, I should have in the long run to pay at the rate of one shilling for every two observations, against which I should insure at the rate of sixpence per observation.
The mean risk of positive error, then, is the average positive error, when the errors of this kind are equally distributed over all the observations : and the same for negative error. When positive and negative error are equally likely, each is one half the average error, considered without reference to sign.
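The distinction between the average error and the mean risk may be illustrated on the fifteen observations given earlier. This is a sketch only: the names are hypothetical, and the value 1051 is assumed to be the truth, as in the text. The sum of the positive errors is spread over all the observations, and likewise the sum of the negative errors.

```python
# The fifteen observations of the text, with 1051 assumed to be the truth.
observations = [722, 1311, 967, 1309, 933, 1089, 1344, 858,
                1033, 972, 1250, 1029, 917, 1294, 744]
assumed_truth = 1051

signed_errors = [x - assumed_truth for x in observations]
n = len(signed_errors)

# Sums of the positive and of the negative errors, each taken as a magnitude.
positive_sum = sum(e for e in signed_errors if e > 0)
negative_sum = -sum(e for e in signed_errors if e < 0)

# Mean risk: each sum distributed over ALL the observations,
# not merely over those observations which have that sort of error.
mean_risk_positive = positive_sum / n
mean_risk_negative = negative_sum / n

# Average error without reference to sign.
average_error = (positive_sum + negative_sum) / n

print(mean_risk_positive, mean_risk_negative, average_error / 2)
```

In so small a set the two mean risks are not exactly equal, but each is close to one half of the average error, which is what the long run is supposed to establish.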
By the probable error I mean that amount of error which is such, that there is an even chance for exceeding or falling short of it. Thus if it be 1 to 1 that the error shall lie between 0 and 10, and of course the same that it shall exceed 10, then 10 is called the probable error. For any number greater than 10, the chances are (no matter how little) in favour of the error being within that number ; for any thing less than 10, the chances are against the error falling within that amount.
PROBLEM. The number of observations being n, and positive and negative errors being equally likely, required the probability that the average of the n observations lies within a given quantity k of the truth ; or, M being the average, that the truth lies within M + k and M — k.
RULE. (By Table I.) Take the average of the observations, find all the errors upon the supposition that the average is the true result, add together the squares of the errors, and divide the square of the number of observations by twice the sum of the squares of the errors. Call the result * the weight of the average. Multiply k by the square root of the weight; let the result be t ; then the H answering to t in table I. is the probability required.
RULE. (By Table II.) Find the weight as in the last rule, and divide 62 by 130 times the square root of the weight. The result is the probable error of the average. Divide k by the probable error, and let the quotient be t; then the K answering to t in table II. is the probability required.
EXAMPLE. In the preceding instance, what is the probability that the average 1051 lies within 50 of the truth? The squares of the errors are, 108241, 13924, 324, 17956, 67600, 1444, 6241, 59049, 7056, 85849, 39601, 94249, 66564, 37249, 484, the sum of which is 605831 ; and twice this is 1211662. Divide 225, the square of 15, by 1211662, which gives .0001856953, the weight of the average. The square root of the weight is .013627, which multiplied by 50 gives .6814, the value of t : that of H is then .66. So that it is more than 3 to 2 that the true result of the preceding very discordant observations lies between 1001 and 1101.
* The reason of the appellation will be afterwards explained.
To use the second table, multiply .013627 by 130 which gives 1.77151, by which divide 62, giving 35 very nearly. This is the probable error, so that it is an even chance the result lies between 1051 − 35 and 1051 + 35, or 1016 and 1086. Divide 50 by 35 giving 1.43 the value of t ; for which in table II., K is .66, as before.
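The two rules can be followed step by step in a short sketch. Two assumptions are made here which the text does not state: that the H of table I. is the ordinary error function, and that the K of table II. is the same function evaluated at .476936 times t. Both assumptions reproduce the figures quoted in the example, but the tables themselves are not reproduced on this page.

```python
import math

# The fifteen observations of the text, and the given quantity k.
observations = [722, 1311, 967, 1309, 933, 1089, 1344, 858,
                1033, 972, 1250, 1029, 917, 1294, 744]
k = 50

n = len(observations)
average = round(sum(observations) / n)

# Sum of the squares of the errors, the average being supposed true.
sum_of_squares = sum((x - average) ** 2 for x in observations)

# Weight of the average: square of the number of observations,
# divided by twice the sum of the squares of the errors.
weight = n ** 2 / (2 * sum_of_squares)

# Rule by Table I: t is k times the square root of the weight.
t_first = k * math.sqrt(weight)
H = math.erf(t_first)  # assumption: table I tabulates the error function

# Rule by Table II: probable error of the average, then t = k / probable error.
probable_error = 62 / (130 * math.sqrt(weight))
t_second = k / probable_error
K = math.erf(0.476936 * t_second)  # assumption: table II is the same function rescaled

print(sum_of_squares, weight, t_first, H, probable_error, t_second, K)
```

The two routes differ only in the unit in which k is measured, so H and K should agree to the accuracy of the constants 62 and 130.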
I now proceed to explain the meaning of the term weight, as used above. When an observer has made various observations, one or more of which he thinks superior to the rest, as to the favourableness of the circumstances under which they were made, it follows that the good observations should tell more in the formation of the most probable result than the indifferent ones. If for example, a remarkably good trial give 10 and an indifferent one 11, it is not reasonable to say that 10½ is the most probable result. If the first observation be remarkably good, it may seem not unfair to give it the force of four observations, or to let the number 10 have the weight which would result from four observations giving 10, 10, 10, 10, and a fifth giving 11. On this supposition the average is the fifth part of 51, or 10⅕, instead of 10½. This was called giving the observations 10 and 11 weights of 4 and 1, and the method of finding an average is this : multiply every observation by its weight and divide the sum of the products by the sum of the weights. Such a method was adopted before the theory of probabilities was applied to the subject, as a direction of common sense. When that theory came into use, it was found that the square of the number of observations divided by twice the sum of the squares of the reputed errors (the average being reputed correct) ought to stand in the place of the weight in the preceding rule, whenever different averages are to be combined together to form one general average. If, for instance, one average of 100 observations gave 10 and another of 50 gave 11, and if the squares of 100 and 50 respectively divided by twice the sums of the squares of the errors gave 1.5 and 1.1, the most probable average of these averages would not be 10', but the product of 10 and 1.5 increased by that of 11 and 1.1, and divided by the sum of 1.5 and 1.1, which gives 10.4. Hence the term weight is now applied to the quotient above described.
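The method of weights described above amounts to a few lines. The function name `weighted_average` is chosen for the illustration; the numbers are those of the two instances in the text.

```python
def weighted_average(values, weights):
    """Multiply every value by its weight, and divide the sum of the
    products by the sum of the weights."""
    products = sum(v * w for v, w in zip(values, weights))
    return products / sum(weights)


# A remarkably good trial giving 10 (weight 4) and an indifferent one giving 11 (weight 1).
common_sense = weighted_average([10, 11], [4, 1])

# Two averages, 10 and 11, whose weights by the rule of squares are 1.5 and 1.1.
combined = weighted_average([10, 11], [1.5, 1.1])

print(common_sense, combined)
```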
When the law of error is of the kind figured in p. 17., the mean risk of either sort of error, the probable error, and the weight of a single observation, are connected by very simple relations, as follows:
  1. The mean risk is 200 divided by 709 times the square root of the weight; more nearly, .2820953 divided by the square root of the weight.
  2. The probable error is 62 divided by 130 times the square root of the weight ; more nearly, .476936 divided by the square root of the weight.
  3. The weight is 113 divided by 1420 times the square of the mean risk.
  4. The probable error is 1 17,, of the mean risk ; more nearly 1.690694 of the mean risk.
  5. The weight is 5 divided by 22 times the square of the probable error ; more nearly, .227468 divided by the square of the probable error.
  6. The mean risk is j!' of the probable error; more nearly, .591473 multiplied by the probable error.
We can thus find the remaining two, when either of the three is given. Of the three, I apprehend that the probable error refers to the most instructive notion ; but the mean risk and the specific weight enter more usefully into formulae of calculation. The average error, being twice the mean risk, is readily determined when wanted.
Many persons confuse the average and the probable error in their own minds : that is, they imagine it to be as likely that any error should exceed the average as fall short of it. That such cannot be the case is evident from the following consideration.
The average error depends upon the magnitude of the error, as well as upon the proportions in which errors of different magnitudes enter; the probable error depends only upon the latter. If, then, small errors enter in larger numbers than great ones, the probable error is rendered less than it would otherwise be. In determining the probable error, the error 100 entering once, counts no more than the error 1 entering once. But in the average error, the error 100, entering once, counts 100 times as much as the error 1 entering once. Consequently the former must be less than the latter. But whether the probable error exceed or fall short of the mean risk (half the average error), must depend on the law of error. In the present case the former considerably exceeds the latter.
It may be asked whether the preceding results are always strictly true. Granting that the probability of an error diminishes with its magnitude, and that positive and negative errors are equally likely (which are the only hypotheses of the preceding question), does it necessarily follow, whatever may be the law of the diminution of probability, that the mean risk of error will, in the long run, be J of that error which is as often exceeded as not? The absolute answer to this is, that the assertion is not strictly true, except upon further suppositions as to the law of error.
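The decimal constants in the relations connecting the mean risk, the probable error and the weight can be recovered by a short computation. The sketch assumes, what the text does not say on this page, that the law of error in question has the form sqrt(w/pi) * exp(-w x^2), w being the weight, so that the chance of an error within e is erf(e * sqrt(w)). The function and variable names are hypothetical.

```python
import math


def probable_error_constant():
    """Solve erf(rho) = 1/2 by bisection; the probable error is rho / sqrt(weight)."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if math.erf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Probable error times the square root of the weight.
rho = probable_error_constant()

# Mean risk times the square root of the weight, under the assumed law: 1 / (2 sqrt(pi)).
mean_risk_constant = 1 / (2 * math.sqrt(math.pi))

# Ratios between the probable error and the mean risk.
probable_over_mean_risk = rho / mean_risk_constant
mean_risk_over_probable = mean_risk_constant / rho

# Weight expressed through the mean risk and through the probable error.
weight_from_mean_risk = mean_risk_constant ** 2   # divided by the square of the mean risk
weight_from_probable = rho ** 2                   # divided by the square of the probable error

print(rho, mean_risk_constant, probable_over_mean_risk,
      mean_risk_over_probable, weight_from_mean_risk, weight_from_probable)
```

Under a different law of error the same two quantities would bear a different ratio to each other, which is the point of the question raised above.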
The manner in which this inconsistency is explained, depends upon whether the person asking the question be supposed to be an inquirer seeking methods of disciplining his judgment, or an experimental philosopher requiring only a sufficient practical rule for the treatment of a set of observations.
To one who is looking for sound principles, I observe that he does not want in this matter the exposition of the consequences of any one law of facility of error, but an account of the general character of those laws to which common sense and daily experience assure him that his faculties and means of observation are subject. The facts of which he stands assured are, that the probability of error does not diminish very rapidly at first, but that as the error we consider grows larger, its probability does diminish very rapidly, and becomes insensibly small for errors of a certain magnitude and upwards. No curve of comparison, drawn in the manner described in p. 132., will be a true representation of what we know on this subject, unless it have the general form, of which the following varieties are instances. Now though the preceding results are not strictly true for every curve which has such a form, yet there is a class of curves of this form, some variety or other of which will approach tolerably close to any line which can be drawn to resemble one of those in the figure. In every one of this standard class of curves, all the preceding relations are strictly true, and therefore are nearly true for all the curves which resemble any one of the standard class. Thus though the average error may not always be 14 of the probable error, yet the former is always some fraction of the latter, not differing very greatly from +'4. There is another reason for the adoption of this law of error as a standard, for which the reader may consult the fourth appendix to this work.
The experimenter, looking for a method of treating observations which shall produce trustworthy results, well knows that it matters nothing whether a method be true or false, if demonstration can be given that the consequences of the method are true. That falsehood necessarily produces falsehood is a fallacy, pardonable enough in everything but mathematics. True reasoning on true hypotheses must necessarily produce true results ; false reasoning, or false principles, or both, may, and most probably will, lead to false consequences, but may lead to the direct reverse. In every part of knowledge, except mathematics, error must be carefully avoided, because there is no method of distinguishing between the cases in which it leads to truth, and the contrary cases. But in the exact sciences, the knowledge of the consequences of falsehood and of those of truth are equally exact : and it is possible to introduce an erroneous addition to the conditions of a problem, to trace the consequences of such error, and to annihilate them at any part of the process. It is possible also to substitute for truth an erroneous supposition, in such manner that the effect of successive lapses of this kind shall be compensatory of each other, or so that the more often the error is repeated, the nearer is the result to the truth. The preceding case affords an instance: let the law of error be what it may (provided only that positive and negative errors are equally likely and that of two errors the larger is always the less probable), and let a moderately large number of observations be in question, and it follows that the results of the real law, and those of the preceding supposition, are nearly identical. Let the number of observations be still larger, and the resemblance is still nearer, and so on without limit. And this is true, even when the law of error, as regards a single observation, or two or three observations, varies to a large amount from that which is used above.
Consequently, for a tolerable number of observations, it is absolutely indifferent whether the real law of error be known, or whether the nearest variety of the class under consideration be substituted for it.
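The assertion that the real law of error matters little for a tolerable number of observations can be tried by simulation. Everything in this sketch is an assumption made for illustration: the triangular law of error on -1 to +1, the number of observations, the limit k, the seed, and the use of the error function for the standard law. The weight of the average is taken as n divided by twice the variance of a single error, which is what the rule of squares tends to in the long run.

```python
import math
import random


def simulated_chance(n, k, trials, seed=1838):
    """Fraction of simulated sets of n errors whose average lies within k of the truth.
    Each error is drawn from a triangular law on [-1, 1] with its summit at 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        average = sum(rng.triangular(-1.0, 1.0, 0.0) for _ in range(n)) / n
        if abs(average) < k:
            hits += 1
    return hits / trials


n, k = 30, 0.1

# Variance of a single error under the assumed triangular law.
variance_single = 1 / 6

# Long-run weight of the average of n such observations.
weight_of_average = n / (2 * variance_single)

# Chance given by the standard law (assumed here to be the error function).
standard_law_chance = math.erf(k * math.sqrt(weight_of_average))

# Chance found by direct trial under the triangular law.
empirical_chance = simulated_chance(n, k, trials=10000)

print(standard_law_chance, empirical_chance)
```

The two chances should differ only by the fluctuation of the trial itself, although a single triangular error is far from following the standard law.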
Having then asserted, as a result of investigation, the existence of a standard law of facility of error which not only represents or resembles the impressions which unassisted reason would form a priori, but the results of which are more than sufficient mathematical approximations to truth, whatever (with some easily admissible limitations) the law of error may be, I proceed to describe it more particularly, calling it in future the standard law of facility of error. The sole datum necessary for its specific application, is either of the three, the weight of an observation, the mean risk of error, and the probable error, any two of which may be deduced from the third by the rules in p. 137.
PROBLEM. Given either of the three data, required —(A) the chance that the error of any one observation shall lie between e positive and e negative, or that the observation shall give something between e too much, and e too little—(B) the chance that the preceding shall not happen—(C) the chance that the error shall be positive, but not exceeding e; or that it shall be negative, not exceeding e.
RULE. Multiply e by the square root of the weight, and let the product be t ; then (A) is the H corresponding to t in table I., and (B) is the remainder of (A) subtracted from unity; each of the chances called (C) is one half of (A).
EXAMPLE. The mean risk of error is 10 ; required the chance of the error lying between + 15 and − 15 ; that is, between 15 too much, and 15 too little. Since the square root of the weight is 200 divided by 709 times the mean risk of error 10, or 20/709, and since 15 times this result is 300/709 or .42 ; the probability required is .45 ; or 11 to 9 against the event. The probability of a positive error less than 15 is .225, and the same for a negative error within the same limits.
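The rule and example above may be checked by a short computation. What follows is only a sketch, and it rests on an assumption not stated in the text: that the H of table I. is the function now called the error function, here taken from Python's `math.erf`. The function names are invented for the illustration.

```python
import math

def sqrt_weight_from_mean_risk(mean_risk):
    # Rule used in the example: sqrt(weight) = 200 / (709 * mean risk).
    # The ratio 200/709 is close to 1 / (2 * sqrt(pi)).
    return 200.0 / (709.0 * mean_risk)

def chance_within(e, sqrt_weight):
    # Assumption: H(t) of table I. is erf(t), with t = e * sqrt(weight).
    t = e * sqrt_weight
    return math.erf(t)

sqrt_w = sqrt_weight_from_mean_risk(10)
chance_a = chance_within(15, sqrt_w)  # (A): error between -15 and +15
chance_b = 1.0 - chance_a             # (B): error outside those limits
chance_c = chance_a / 2.0             # (C): positive error not exceeding 15

print(round(chance_a, 3), round(chance_b, 3), round(chance_c, 3))
```

The three values printed correspond to the cases (A), (B), and (C) of the problem, and may be compared with the figures of the example.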
When all the individual observations are made under the same circumstances, so as to have the same weight,
the common average is the most probable truth, and its weight is as many times the weight of one observation as there are observations in all: or, the average of n observations each of which has the weight w, is entitled to the same confidence as one observation made under circumstances which give it a weight nw. Of this we have an example in p. 137., and I now give another, detached from the method there given of finding the weight.
EXAMPLE. The weight of each observation being 18, what is the probability that the average of 50 observations lies within .05 of the truth ? The weight of this average is 18 × 50 or 900, the square root of which is 30. Now .05 × 30 is 1.5, which, being t, H is .966, so that the probability required is about 28 to 1 in favour of the event. The mean risk of error of such an average is 200/709 divided by 30, or less than .01 ; by which it is meant that if repeated sets of 50 observations each were made, the errors of these sets, neglecting their signs, would not average so much as .01 × 2 or .02 each.
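The same example may be worked mechanically. This is a sketch only, again assuming that H is the modern error function (`math.erf`), and using the ratio 200/709 of p. 137 to pass from the weight to the mean risk; the names are hypothetical.

```python
import math

def chance_average_within(e, weight_each, n):
    # The weight of an average of n equally good observations
    # is n times the weight of one of them.
    weight_of_average = weight_each * n
    t = e * math.sqrt(weight_of_average)
    # Assumption: H(t) of table I. is erf(t).
    return math.erf(t)

def mean_risk_from_weight(weight):
    # Mean risk = (200/709) / sqrt(weight), the converse of the rule
    # that sqrt(weight) = 200 / (709 * mean risk).
    return (200.0 / 709.0) / math.sqrt(weight)

p = chance_average_within(0.05, 18, 50)
odds_in_favour = p / (1.0 - p)
risk_of_average = mean_risk_from_weight(18 * 50)

print(round(p, 3), round(odds_in_favour, 1), round(risk_of_average, 4))
```

Twice the last quantity is the average error, signs neglected, to be expected of such sets of 50 observations.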
Let us now suppose that positive and negative errors are not equally likely. Hitherto, the absolute truth has been the most likely ; that is, though the probability of any one observation giving mathematical truth was infinitely small, yet so was that of any given error being exactly attained, and the infinitely small probability of the first case was greater than that of the second.* Let us now suppose that errors equal to or near to P are more probable than any other, and so that, x being the truth, any observation is equally likely to exceed or to fall short of (not x) but x + P. This is equivalent to describing a curve of the following figure in place of that in p. 132. Here A B is to be
* To compare the proportions of these indefinitely small probabilities, say of absolute truth and of either the error + e, or that of − e exactly, take H' from table I., corresponding to 0, and to e multiplied by the square root of c. Thus, c being 1, the probabilities of an error 0 and an error 4 are as 1.1 to .02, or as 55 to 1.

measured, BK is P, and the chance of any error falling between 0 and B N is such a fraction of unity as the area BPN is of the whole area of the curve. The case before us is precisely that of an observer with a personal error equal to P, in addition to casualties. If we imagine a very large number of observations, made under circumstances equally favourable to positive and negative error, and if the error P be added to or taken away from each of the results, according as P is a positive or a negative error, we shall then have such a succession of results as might be looked for on the present hypothesis. For instance, let the quantity P be 10, then whatever may be the prospect of having the error 1, when positive and negative errors are equally likely, we have the same chance for the error 11, in the present case.
For simplification, I shall adopt the algebraical method of signifying positive and negative errors. Thus + 3 means 3 too much, or 3 added to the truth ; − 4 means 4 too little. Thus + 4 − 5 is − 1 ; meaning that four too much from one cause, and 5 too little from another, gives on the whole 1 too little. Again, we consider − 2 less than − 1, since the effect of the former is to lessen the result more than would be done by the latter. To show that the supposition now before us is equivalent to that of an observer with a personal error, or an instrument with an individual error, imagine an instrument wrongly graduated, so that for every reading we ought to read 10 less : thus for 125 we ought to read 115. In other respects let the instrument be equally likely to give positive and negative departures from truth. If then an observation give 176, we know immediately that the truth is 176 − 10
+ a casual error, the effect of which disappears in a large number of observations. The whole error then is 10 + the casual error, 10 + x and 10 — x being equally likely. This is precisely the hypothesis in question, in the case in which P = 10.
The average of observations, in the case before us, does not necessarily give an approximation to the truth. Calling the quantity P a fixed error, meaning by a fixed error one which is as likely to be exceeded as not, we see, in the following theorem, a justification of the term : the most probable result of a large number of observations is the truth, increased or diminished by the fixed error, according as it is positive or negative. This error may either be fixed in the instrument, fixed in the observer, or in different degrees in both.
Let the phenomenon to be observed be perfectly unknown, except by what the instrument tells us. Then it is totally impossible to discover the amount of this error ; which, nevertheless, must be assumed to exist, unless the contrary be shewn. For example, an instrument wrongly graduated throughout could never tell the truth, either in individual or average results. But it is obvious that such an apparatus, incapable as it is of telling absolute truths, might nevertheless detect such results as are obtained by measuring the differences of other results. Thus a clock which goes truly, but is set too fast or too slow, will serve to find the time elapsed between two events, though it will not show the real time of either. Instruments which, on account of some permanent error affecting all their results, can only be used to determine differences, are called differential instruments.
All instruments, as well as observers, are subject more or less to this species of error ; how then is it possible to depend upon the results of any observations ? The answer to this question will require some detail. Since perfect exactness cannot be attained, either on the part of the instrument, or of the observer, we can only call either good, when positive errors are as likely
as negative. The average of a large number of observations will in such a case be extremely near the truth, and provided this condition can be fulfilled, the absolute amount of the tendency to error is comparatively unimportant. Let two observers, A and B, have instruments the average error of the first of which is double of that of the second. A given number of observations made by A is not as likely to be within a given amount of the truth in the average as the same number made by B. But the former may more than make up the difference, by taking a larger number of observations. The rule is, that the square roots of the numbers of observations must be in proportion to the average errors of the instruments. That is, if A's instrument have an average error double of that of B, he must make four times B's number of observations before he can place the same reliance upon his own observations which he ought to do upon those of B. And the same is true if for average error we read mean risk, or probable error. But if the weights of the observations be known, the numbers of observations (and not their square roots), must be inversely as the weights.
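The rule for the numbers of observations may be put into a few lines. The sketch below merely restates the two forms of the rule given above; the function names and the number 25 are chosen for illustration and are not from the text.

```python
def observations_needed(n_b, error_a, error_b):
    # The square roots of the numbers of observations must be in
    # proportion to the average errors: sqrt(n_a)/sqrt(n_b) = error_a/error_b.
    return n_b * (error_a / error_b) ** 2

def observations_needed_from_weights(n_b, weight_a, weight_b):
    # When weights are known, the numbers themselves (not their square
    # roots) are inversely as the weights: n_a / n_b = weight_b / weight_a.
    return n_b * weight_b / weight_a

# A's instrument has double the average error of B's; B makes 25 observations.
n_a = observations_needed(25, error_a=2.0, error_b=1.0)

# Doubling the error corresponds to one fourth of the weight.
n_a_by_weight = observations_needed_from_weights(25, weight_a=1.0, weight_b=4.0)

print(n_a, n_a_by_weight)
```

Both forms give the same number, since the weight varies inversely as the square of the average error.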
When, however, there is a fixed error in the instrument, independently of casual errors, or of such as are as likely to be positive as negative, there are two modes of proceeding. The average of the observations will now be too great or too small, according as the fixed error is positive or negative.
1. If the truth can be found by any other means, in any one instance, a large number of observations, such as would be made if the truth in that one instance were the object of inquiry, will serve to detect the fixed error, with a high degree of probability that the result shall be correct. If a result should be 29 and the average of 100 observations give 28, then it must be presumed that instead of errors + x and − x being equally likely, − 1 + x and − 1 − x are equally likely, or there is a fixed error of − 1. If A be the true result, and if P be the fixed error, then A + P is
the result which the instrument would give, in the long run. In p. 137. is shewn the method of determining the chance that in s observations, the average should lie within e of the final average, that is, within e of A + P. This ascertains the chance that the final average just mentioned lies between A + P + e and A + P — e. Let R be the result shown by the instrument, the truth A being otherwise known. Then if R lies between A + P — e and A + P + e, it follows that P lies between R — A + e and R — A — e, and the chance of the first is that of the second.
EXAMPLE. The truth being known to be 30, and the average of 20 observations giving 31, what is the chance that there is in the instrument a fixed error lying between 1 + ¼ and 1 − ¼, or between 1¼ and ¾.
The weight of the observations must first be found, which is done by summing the squares of the errors, taking the average given by the instrument as true, precisely in the manner used in p. 137. Suppose this weight to be 10; then e or ¼, multiplied by the square root of 10, or 3.162, is .79, to which in table I., the value of H is .74. It is therefore about 3 to 1 that the instrument, in the long run, would give a result between 31 + ¼ and 31 − ¼ ; that is, that there is a fixed error in the instrument lying between 1 + ¼ and 1 − ¼.
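The computation of this example may be sketched as follows. Two assumptions are made and should be borne in mind: that H is the modern error function (`math.erf`), and that the limit e is one quarter, which is the value agreeing with the product .79 obtained from the square root 3.162. The weight 10 is used directly, as in the example.

```python
import math

truth = 30.0     # known beforehand by other means
average = 31.0   # average of the 20 observations
weight = 10.0    # supposed weight, used directly as in the example
e = 0.25         # assumed half-width of the limits about the fixed error

# The apparent fixed error is the excess of the average above the truth.
fixed_error = average - truth

# Chance that the long-run result lies within e of the observed average,
# which is also the chance that the fixed error lies within e of 1.
t = e * math.sqrt(weight)
chance = math.erf(t)   # assumption: H(t) is erf(t)

lower, upper = fixed_error - e, fixed_error + e
print(fixed_error, (lower, upper), round(t, 2), round(chance, 2))
```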
This first method, then, of ascertaining the fixed error of a set of observations, supposes that there are cases in which the result is known beforehand, so that the instrument may first be read by the aid of phenomena, instead of phenomena by the instrument. The first observation is that of the error of the latter, found by comparing its indications with the known truth ; the second, the observation of unknown phenomena, follows : accurate results being obtained, not by altering the instrument, but by applying the correction to the observations which the preceding class of observations has rendered necessary.
A reader unused to astronomical works, on opening
a book on the practical part of the science, might imagine that no part of the subject pretended even to ordinary accuracy. Nothing appears to be done which is unaffected by serious error; and it seems as if a little more attention to the fabrication of instruments would render nine tenths of what has been written altogether useless. This appearance is the victory of the head over the hands ; the means of detecting the errors of instruments are much more powerful than those of correcting them. It is also the victory of astronomy over the other physical sciences, on our knowledge of which the manufacture of utensils depends : we know more of the laws which regulate the changes of the heavens than we do of those on which the stability and fluctuations of instruments depend. Nor does the semblance above mentioned entirely spring from unavoidable error: for it is frequently the most convenient plan to allow an error to subsist which might be corrected at once, but which may be more easily corrected at another stage of the process. It is also sometimes useful to allow an error to remain of a larger magnitude than is physically necessary, if by that means another risk of error may be avoided. For instance, if the correction be either an addition or a subtraction, sometimes one and sometimes the other, the most practised calculator will be very liable to confound the two. This may be remedied by allowing to the instrument a fixed error, either additive or subtractive, of such a magnitude that casual fluctuations will never alter its name. The correction, therefore, will always require the same process, and the risk of error arising from taking the wrong method will be avoided.
2. The next plan of eliminating the fixed errors of an instrument is by giving it such a construction that an observation can be made in two different ways, in which the fixed error must necessarily have different signs, and must be of the same amount in both cases. This is in reality a method of making the positive and negative errors of the same amount in
the long run. Suppose, for example, that (as in the transit-instrument) the correctness of the observations depends (partly) upon the line of sight of a telescope being always exactly perpendicular to the axis upon which the telescope turns. Such exact perpendicularity is a mathematical fiction, which was never yet realised : the telescope will incline more or less to the right or to the left. But if the telescope be fixed to its axis, and if the axis itself rest on pivots, from which it can be taken off and the position of the instrument reversed, it is obvious that such a reversal of the ends of the axis will alter the error of the instrument, throwing the line of sight as much to the left as it was before to the right, or vice versâ. The average of a large number of observations will now present no signs of fixed error, arising from this cause at least : provided that the numbers of observations made in the two different positions be equal. The chance of the average of s observations in each position lying within a given degree of nearness to the truth is precisely that of twice s observations made with no fixed error, and the same tendency to positive and negative casualties as before. When the result has been obtained by combination of the different sets, the fixed error of the instrument may be ascertained by comparing the combined average with the separate averages of each set. If the observations be numerous, and the reversal of the method of observing introduce no new errors, then the combined average will be an arithmetical mean (or nearly so) between the other averages, and the difference between the former and either of the latter will be the fixed error required.
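The effect of reversal may be seen in a small numerical sketch. All the figures below (the truth 100, the fixed error .7, and the casual errors) are hypothetical, chosen only so that the casual errors of each set balance exactly; real observations would balance only nearly.

```python
truth = 100.0
fixed_error = 0.7   # hypothetical; changes sign when the instrument is reversed

# Hypothetical casual errors, equal in number in the two positions.
casual_direct = [0.2, -0.1, 0.05, -0.15]
casual_reversed = [-0.05, 0.1, -0.2, 0.15]

direct = [truth + fixed_error + c for c in casual_direct]
reversed_ = [truth - fixed_error + c for c in casual_reversed]

def average(values):
    return sum(values) / len(values)

avg_direct = average(direct)
avg_reversed = average(reversed_)

# With equal numbers in each position the fixed error cancels.
combined = average(direct + reversed_)

# The fixed error is the difference between the combined average
# and the separate average of either set.
recovered = avg_direct - combined

print(round(combined, 6), round(recovered, 6))
```

If the numbers of observations in the two positions were unequal, the fixed error would cancel only in part, which is the reason of the proviso in the text.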
Independently of one or other of these two methods, the only result directly furnished by the observations (except the average affected by the fixed error) is their weight, which is obtained precisely as in p. 137. It will be seen that either of the preceding methods introduces entirely new elements ; in the first we have previous truths for comparison with subsequent observations; in the second, an adaptation of the instrument which reproduces an equal likelihood of positive and negative error, by making the fixed error itself as often positive as negative. What we have called a fixed error is in fact a part of the phenomenon, styled an error because it is not a part of the result we wish to observe. The errors which a simple application of our theory removes are those of which no account whatever can be given, and of which nothing can be previously known. Such is not the case when positive error is more likely than negative, or vice versa: for this very circumstance is itself a phenomenon, which must arise from some unvarying cause.
Having stated that it is indifferent, in a mathematical point of view, whether the law of the facility of error above explained be true or not, because any law whatever which falls within the widest permission of common sense leads to the same results as above explained, when the number of observations is considerable, I will now point out, in some simple cases, how different laws of error are to be reduced to the preceding.
  1. Let all errors, positive and negative, between + E and − E be equally probable, and all others impossible. Treat large numbers of such observations as in pages 137. and 144., on the supposition that the weight is 3 divided by twice the square of E.
  2. Let the probability of error decrease uniformly as the magnitude increases, the greatest possible errors being + E and − E ; which implies that the chance of an error lying between − x and + x is the product of x and the excess of 2 E above x, divided by the square of E. For instance, if E be 10, and this law of facility prevail, the chance of an error lying between − 2 and + 2, is the product of 2 and 18 divided by 10 times 10, or .36. In this case a large number of observations must be treated as if the weight of each observation were 3 divided by the square of E.
  3. If the weight of the observations be considered to be 5 divided by twice E², the preceding methods will show the chances of a large number of observations upon a supposition intermediate between the two last, and coinciding nearly with the first when the errors are small, and with the second when they are considerable.
I now proceed to the method of combining the results of observation, and deducing the mean risks of error. Suppose, for example, that A and B are two results of a large number of observations, of which the product is required. Nothing can be more erroneous than to suppose that the mean risk of this product will be the product of the mean risks of its factors. By the mean risk here is meant the same thing as before : imagine the product of A × B formed in every possible way from the single results of the several observations. Each error will be as likely to be positive as negative, if the errors of the original observations be the same. Take the average of all the several errors, neglecting their signs, and one half of this average will be (in the long run) what is called the mean risk, explained in the same manner as in p. 135.
The risk of the result will be modified by the manner in which the operation makes one quantity affect the error of the other. Suppose, for example, that observation gives A = 100, and B = 150. If these were certainly true, the required result would be accurately 100 × 150, or 15,000. But if 100 be wrong to any amount, the product will be wrong by 150 times that amount, on that account only : while, if B be wrong, its error will be multiplied 100 times in the result ; besides which, there will be an additional error, the product of the two errors of A and B. On the other hand, if observation give A too large and B too small, the opposite errors may either compensate each other exactly, and give the product precisely what it ought to be, or may make some approach towards this compensation. The product then may be rendered much more erroneous than the observations, or much less so ; both of which possible cases are considered, in all their extent, in the investigations which give the following results. In all of them, except the first, the mean risks are supposed to be small.
  1. To find the mean risk of the sum or difference of any number of quantities determined by observation, add together the squares of all their mean risks, and extract the square root of the result. Thus, if the mean risks of two quantities be 3 and 4, that of their sum or difference is the square root of 9 + 16, that is, 5. If the mean risks be all equal, the rule may be simplified into that of multiplying the mean risk of one by the square root of the number of quantities. Thus the mean risk of the sum of 100 observed quantities of equal risk is 10 times that of one of them.
  2. EXAMPLE. Given the mean risks of A, B, and C, namely, 1, 2, and 3, required that of 10 A + 9 B − 4 C. Here every error which can happen in A is made tenfold in 10 A, and the mean risk of 10 A is 10 × 1 or 10. Similarly the mean risks of 9 B and 4 C, are 9 × 2 and 4 × 3 or 18 and 12. The squares of 10, 18, and 12, added together, give 568, the square root of which is 23.8, the mean risk required.
It may seem strange at first sight that, ceteris paribus, the mean risk of a sum and difference should be the same. But a little consideration will show that, positive and negative errors being equally likely, the errors of a difference may be as large as those of a sum : and that no combination of errors can affect a sum, without an equal probability of another equally probable combination affecting the difference in the same way.
  3. To find the mean risk of the product of any number of quantities A, B, C, &c. Take the fraction which each mean risk is of its quantity : add the squares of these fractions, and multiply the square root of the result by the product itself. This rule is only to be trusted when the mean risks are small. Let A = 100, B = 150, and let the mean risks of A and B be 1 and 2. Then 1 is .01 of 100, and 2 is .0133 of 150. The squares of .01 and .0133 are .0001 and .00017689, the sum of which is .00027689, the square root of which is .0167. This multiplied by 100 × 150, or 15,000, gives 250.5, which is the mean risk of the product 15,000.
  4. To find the mean risk of a fraction, or of the quotient of a division, multiply each term (numerator and denominator, or dividend and divisor) by the mean risk of the other, add the squares of these products and extract the square root of the sum : divide this by the square of the denominator or divisor ; the result is the mean risk required. But if the fraction be very small, it is sufficient to divide the mean risk of the numerator by the denominator ; while if the fraction be very great, it is sufficient to multiply the fraction by the risk of the denominator, and to divide the result by the denominator.
The preceding will serve as specimens of the manner in which complicated results of operation can have those probabilities investigated which depend upon the probabilities of error in their constituent parts. It would be impossible to lay before a reader unacquainted with the differential calculus, any such digest of rules as would enable him to treat all cases with facility. Any one of the mean risks obtained above will serve to determine, as in p. 139., the weight of the result, from which its law of error may be investigated, as in p. 143.
It appears that the chances of error may be considerably multiplied in the course of the operations to which the results of observation are subjected.
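The four rules for the mean risk of a sum, a multiple, a product, and a quotient may be collected into a short sketch. The function names are invented for the purpose; the product and quotient rules, as the text warns, are approximations to be trusted only when the mean risks are small.

```python
import math

def risk_of_sum(risks):
    # Sum or difference: square root of the sum of the squares.
    return math.sqrt(sum(r * r for r in risks))

def risk_of_linear(coefficients, risks):
    # Each risk is first multiplied by the size of its coefficient;
    # the sign of the coefficient makes no difference.
    return risk_of_sum([abs(c) * r for c, r in zip(coefficients, risks)])

def risk_of_product(values, risks):
    # Take the fraction each risk is of its quantity, combine the
    # fractions as for a sum, and multiply by the product itself.
    fraction = math.sqrt(sum((r / v) ** 2 for v, r in zip(values, risks)))
    return fraction * abs(math.prod(values))

def risk_of_quotient(num, den, risk_num, risk_den):
    # Multiply each term by the risk of the other, combine, and
    # divide by the square of the denominator.
    return math.sqrt((num * risk_den) ** 2 + (den * risk_num) ** 2) / den ** 2

print(risk_of_sum([3, 4]))
print(round(risk_of_linear([10, 9, -4], [1, 2, 3]), 1))
print(round(risk_of_product([100, 150], [1, 2]), 1))
print(round(risk_of_quotient(100, 150, 1, 2), 4))
```

Carried out without rounding the intermediate fractions, the product rule gives a mean risk of about 250 for the product 15,000, which agrees with the figure in the text within the rounding there used.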
It must, therefore, be the object of an inquirer not only to make good observations, but also to select such methods of observing, and such methods of treating the observations (the latter generally depending upon the former), as will render the final error the least possible. The considerations necessary for this purpose form a great part of the application of mathematics to the sciences of observation : in which it frequently happens that good methods of observing are rendered useless by the multiplication of error which the methods consequent upon them involve : and conversely, that formulae good in other respects, are inadmissible from the tendency to error in the observations which they require. And it has happened before now that mistakes of serious amount have arisen from the use of mathematical methods in which the errors of the observations are much multiplied.
I could hardly close such a chapter as the present without some mention of the celebrated method of least squares, on which the astronomy of the last thirty years has depended for much of the increase of accuracy which has been its characteristic. But as any development of this very interesting subject is impracticable without recourse to mathematical symbols and reasoning, I content myself with a description of one particular case, which is of very frequent occurrence.
Suppose a number of results to be obtained by observation, from which a consequence is to be drawn by mathematical reasoning. If the observations were all correct, the consequence deduced from any one would be the same as that from any other ; but owing to the errors of the observations, such agreement is of course unattainable. It is, therefore, a question what method of combining the several results should be adopted : and mathematical analysis shows that the object is attained by choosing such an intermediate result as shall make the sum of the squares of the errors the least possible.
It might seem as if, positive and negative errors being equally probable, the average of results is the most probable truth ; and this is the case when the observation is itself made directly upon the result which is required, or when there is only one datum into which the uncertainty of observation is introduced. But even in such a case we have no right to say that the average is preferred to the result of the method of least squares ; for the former is then a particular case of the latter. Let three observations give 9, 11, and 16, the average of which is 12. The errors, taking this average as the truth, are 3, 1 and 4, the sum of the squares of which is 26. This is the least possible sum of the squares. To try this, assume 11 as the most probable truth : the errors are then 2, 0, and 5, the sum of the squares of which is 29 : assume 13, and the errors are 4, 2, and 3, the sum of the squares of which is 29. To avoid introducing fractions, I have only assumed whole numbers, but if I take 12.1 or 11.9, I find the sum of the squares of the errors to be in the first case 26.03, and in the second 26.03. So that when one result only is in question, a direction to take the average is equivalent to a direction to make the sum of the squares of the errors the least possible.
Let us now suppose two results of observation, say that we wish to know the fraction which A is of B, where both A and B are subject to errors, the positive and negative being equally likely. Suppose, for example, that we ask for the proportion of the population of a country which is buried in a year. Statistical returns will furnish the population of each year, and the burials, both subject to errors. There are now obviously two ways of taking an average ; I may either divide the average burials by the average population, or find the proportion which the first is of the second in each year, and take the average of the proportions.
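The trial made above with the three observations 9, 11, and 16 is easily repeated for any assumed value of the truth. The sketch below does no more than this; the function name is invented for the illustration.

```python
def sum_of_squares(observations, assumed_truth):
    # Sum of the squares of the errors, taking assumed_truth as the truth.
    return sum((x - assumed_truth) ** 2 for x in observations)

observations = [9, 11, 16]
average = sum(observations) / len(observations)

# Try the average and the neighbouring values named in the text.
trials = {guess: sum_of_squares(observations, guess)
          for guess in (11, 11.9, 12, 12.1, 13)}

for guess, total in trials.items():
    print(guess, round(total, 2))
```

Every assumed value other than the average gives a larger sum, so that for a single result the two directions come to the same thing.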
One not versed in mathematics would suppose that these must give the same results, but any simple instance will show the contrary. The average dividend and the average divisor do not give the average quotient. For instance, let dividends be 12, 13, and 17, and divisors 20, 22, and 30. The fractions 12/20, 13/22, and 17/30 are .6, .591, and .567, the average of which is .586. The average dividend is 14, the average denominator 24, and 14/24 is not .586, but .583. The results nearly coincide, but so do the data, for which reason the most probable result may be very nearly found, or of two results which differ very little, one may be much more probable than another. When observations give magnitudes so nearly coinciding as .6, .591, and .567, it is worth while to examine the relative probabilities of methods which give results so nearly equivalent as .586 and .583. Which of the two preceding methods is most entitled to confidence ? Analysis points out that this question is useless, because there is a third method which is more safe than any other. The method of least squares in the case before us leads to the following rule : When both the numerator and denominator of a fraction are to be determined by observation, and various corresponding observations of both are made, multiply each numerator and denominator by the denominator, and divide the sum of the numerators so formed by the sum of the denominators. Thus in the preceding instance, it is (12 × 20 + 13 × 22 + 17 × 30) divided by (20 × 20 + 22 × 22 + 30 × 30), or 1036/1784, or .581, which is more probable than either .586 or .583.
If the mean risks of all the observations be the same, the mean risk of the preceding result is found by adding 1 to the square of the result obtained (.581) dividing by the denominator which produced it (1784), extracting the square root of the quotient, and multiplying the mean risk of each observation by this square root. Thus .581 × .581 is .337561, which divided by 1784 gives .000189, the square root of which is .014.
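The three ways of deducing the fraction from the dividends 12, 13, 17 and the divisors 20, 22, 30 may be set side by side. This is a sketch of the arithmetic only, with names invented for the illustration; it does not attempt the mean risk of the result.

```python
numerators = [12, 13, 17]
denominators = [20, 22, 30]

# First way: the average of the separate quotients.
average_of_quotients = sum(a / b for a, b in zip(numerators, denominators)) / 3

# Second way: the average dividend divided by the average divisor.
quotient_of_averages = sum(numerators) / sum(denominators)

# Rule from the method of least squares: multiply each numerator and each
# denominator by its own denominator, then divide the sums.
sum_numerators = sum(a * b for a, b in zip(numerators, denominators))
sum_denominators = sum(b * b for b in denominators)
least_squares = sum_numerators / sum_denominators

print(round(average_of_quotients, 3),
      round(quotient_of_averages, 3),
      round(least_squares, 3))
```

The rule gives most influence to the observations with the largest denominators, which is why its result differs from both of the simple averages.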
The mean risk of the result, therefore, is less than one seventieth part of that of each of the observations.
The method of least squares is an extension of that of taking an average, or rather it indicates the most probable average in cases in which, by reason of more results of observation than one being involved, an infinite number of different averages exists. It is not