Reality of REAL Arithmetic's Really Wrong Results By Skip Knoble, - April 2005 Translating a formula, which is what a compiler does, is different from programming a formula, testing and debugging it as a model, and verifying that all plausible results are correct. Computer displayed or printed output, no matter how pretty or sophisticated, cannot challenge or correct wrong answers resulting from faulty numerical computations. Only skilled humans can do this. This is not only an academic challenge, but an ethical one as well. At the bottom line, results of numerical computations can and do affect people. Are not people responsible for people? A PC (IEEE Floating-point model) Example Some computer computations can break down very quickly and drastically. (Worse yet are ones which break down more subtly!) program Example1 ! !=== Example of numeric breakdown; ! For example try: A=10000., X=0.00001 ! and see P computed as: -68.7 in Single Precision ! or: -119.2. in Double Precision. ! Note that P is algebraically equal to 1.0 for all A and X<>0. ! ! Penn State Information Technology Services ! Graduate Education and Research Services (GEaRS) ! Skip Knoble - 1981 integer, parameter :: PRECISION = selected_real_kind(6) ! integer, parameter :: PRECISION = selected_real_kind(15) ! real(kind=PRECISION) :: P,A,X DO PRINT *, "Evaluating P=((A+X)**2 - A**2 - 2*A*X)/X**2" PRINT *, "Please enter A and X (or Zeros to Quit):" READ *, A,X IF (A == 0 .OR. X == 0) EXIT P=((A+X)**2 - A**2 - 2.0*A*X)/X**2 print *, "P is algebraically equal to 1." print *, "P=",P," A=",A," X=",X END DO print *, "** Example 1 Ended **" end program Example1 Some Results (the algebraic result is, of course, 1.0). Read A=100. X=0.01 and the computed result, P = 5.3 in Single Precision or 1.0(correct) in Double Precision. For A=10000., X=0.00001 the computed result, P = -2.0E+09 in Single Precision and -49.5 in Double Precision. Here is the actual output for the above Double Precision case: Evaluating P=((A+X)**2 - A**2 - 2*A*X)/X**2 Please enter A and X (or Zeros to Quit): 10000 0.00001 P is algebraically equal to 1. P= -49.5056157705101330 A= 1.0000000000000000E+04 X= 1.0000000000000001E-05 Interesting enough, the bigger the "spread" of A and X in general the bigger the "wrong" answer. However a correct result is computed even with a very large spread between A and X if the rolls of A and X are reversed (X large, A, small): Evaluating P=((A+X)**2 - A**2 - 2*A*X)/X**2 Please enter A and X (or Zeros to Quit): .00001 10000 P is algebraically equal to 1. P= 1.0000000 A= 9.9999997E-06 X= 1.0000000E+04 The problem here is subtracting nearly equal quantities which are formed by converting decimal numbers that have infinite binary expansions; when subtracting nearly equal quantities, the most significant digits cancel during the subtraction, thus shifting the "noise" toward the most significant digits. Then dividing by X**2 is like multiplying the "noise" by a very large number (scaling it up). "Really wrong results" here illustrate, with very little arithmetic, so-called "significance loss". Example of infinite binary expansion: 1/10th in binary is: 1.00000000000 divided by 1010 = 0.000110011001100110011... Why Does Significance Loss Happen? a) There is a finite set of values that a floating-point number, X can take on. This is because the number of bits to represent a floating-point number is finite (and relatively small). The ranges of floating-point numbers is also finite. For the PC, please see: http://stevehollasch.com/cgindex/coding/ieeefloat.html b) Floating-point numbers do not necessarily have an inverse under multiplication; that is, there are X's in the system for which no Y exists in the system such that X*Y = 1. In general the number of floating-point whole numbers which do not have inverses under multiplication is much larger than those which do. c) Arithmetic operations are not necessarily closed on floating-point number systems. That is, there are X;s in the system and Y's in the system for which X+Y, X-Y, X*Y, and/or X/Y do not belong to the system. For exmple X=900001 and Y=0.825 base 10 both belong to the PC single precision floating-point system, but neither their sum, difference, product or ratio belong to this system. d) The Associative laws and distributive law of algebra do not necessarily hold in floating-point number systems. That is, in all floating-poing number systems there are many X, Y, Z's belonging to the system such that: (X+Y)+Z <> X+(Y+Z), (X*Y)*Z <> X*(Y*Z), and X*(Y+Z) <> X*Y + X*Z. These laws of algebra fail to hold because results may be shifted past the capacity of a floating-point register. This is sometimes called "significance loss". Recommendations to Help Avoid Numerical Computational Difficulties. 1) Use Proven (refereed or commercial) software that is robust (give the correct result, or if not, an error message). For example, programming language libraries, IMSL, NAG, and the free "PACK" (Linpack, Minpack) available on http://wwww.netlib.org). Or use higher level quality software like: Minitab, MatLab, Mathematica. Also avoid taking numerical code from textbooks and magazines unless they are refereed or presented by a proven expert in numerical computations. 2) Study (know) the possible domain of values that each program input quantity will take on and test each algorithm thoroughly, especially on the limits of that domain. (This of course is an art in itself). 3) Use at least Double Precision Arithmetic. (In more critical cases use multiple or quadruple precision arithmetic; Fortran 90 programmers, see http://www.nersc.gov/~dhbailey/ and http://www.ozemail.com.au/~milleraj/ respectively. Packages like Matlab and Mathematica use at least Double Precision, sometimes higher precision. 4) For a given mantissa precision (computer) do not input or output numbers in ASCII (or EBCDIC) format with more decimal digits than can be handled by the conversion mapping algorithms for that precision (computer) unless you document that that's what you are doing. 5) Use relative comparisons in place of absolute comparisons of real quantities. (Also note that you can also use "fuzzy" comparisons. See http://ftp.aset.psu.edu/pub/ger/fortran/hdk/eps.f90" 6) Avoid computations which are algebraically undefined (or singular) or which develop results whose magnitudes approach or exceed machine limits.E.g., Blindly setting results of underflows to zero can yield "Really Wrong Results". 7) Avoid mixing INTEGER and REAL quantities without knowing how they will be compiled. E.g., X=X+I**(-J) does not increment REAL typed X except when INTEGER typed I=1 or INTEGER typed J=0 and I < 0. 8) Do not program a numerical model beyond your competence to either avoid significance loss or monitor it. Be especially careful when less than a pre-specified number of mantissa digits participate in a sum or when all but a few mantissa digits cancel during a subtraction. You are responsible for disasters caused by other people depending on programs you write! Further Reading: http://www.personal.psu.edu/hdk/fortran.html#Floating-Point Herman D. (Skip) Knoble, hdk at psu dot edu