Reality of REAL Arithmetic's Really Wrong Results
By Skip Knoble - April 2005
Translating a formula, which is what a compiler does, is different from
programming a formula, testing and debugging it as a model, and verifying
that all plausible results are correct. Computer-displayed or printed
output, no matter how pretty or sophisticated, cannot challenge or correct
wrong answers resulting from faulty numerical computations. Only skilled
humans can do this. This is not only an academic challenge but an ethical
one as well. The bottom line: results of numerical computations can and
do affect people. Are not people responsible for people?
A PC (IEEE Floating-point model) Example
Some computer computations can break down very quickly
and drastically. (Worse yet are ones which break down more subtly!)
program Example1
!
!=== Example of numeric breakdown.
!    For example, try: A=10000., X=0.00001
!    and see P computed as: -2.0E+09 in Single Precision
!    or: -49.5 in Double Precision (on one PC; exact values vary by system).
!    Note that P is algebraically equal to 1.0 for all A and X /= 0.
!
!    Penn State Information Technology Services
!    Graduate Education and Research Services (GEaRS)
!    Skip Knoble - 1981
   integer, parameter :: PRECISION = selected_real_kind(6)    ! Single Precision
   ! integer, parameter :: PRECISION = selected_real_kind(15) ! Double Precision
!
   real(kind=PRECISION) :: P, A, X
   DO
      PRINT *, "Evaluating P=((A+X)**2 - A**2 - 2*A*X)/X**2"
      PRINT *, "Please enter A and X (or Zeros to Quit):"
      READ *, A, X
      IF (A == 0.0 .OR. X == 0.0) EXIT
      P = ((A+X)**2 - A**2 - 2.0*A*X)/X**2
      print *, "P is algebraically equal to 1."
      print *, "P=", P, " A=", A, " X=", X
   END DO
   print *, "** Example 1 Ended **"
end program Example1
Some results (the algebraic result is, of course, 1.0):
For A=100., X=0.01 the computed result is P = 5.3 in Single Precision,
or 1.0 (correct) in Double Precision.
For A=10000., X=0.00001 the computed result is P = -2.0E+09 in Single Precision
and -49.5 in Double Precision.
Here is the actual output for the above Double Precision case:
Evaluating P=((A+X)**2 - A**2 - 2*A*X)/X**2
Please enter A and X (or Zeros to Quit):
10000 0.00001
P is algebraically equal to 1.
P= -49.5056157705101330 A= 1.0000000000000000E+04 X= 1.0000000000000001E-05
Interestingly enough, the bigger the "spread" between A and X, in general, the
bigger the "wrong" answer. However, a correct result is computed even with a
very large spread between A and X if the roles of A and X are reversed (X
large, A small):
Evaluating P=((A+X)**2 - A**2 - 2*A*X)/X**2
Please enter A and X (or Zeros to Quit):
.00001 10000
P is algebraically equal to 1.
P= 1.0000000 A= 9.9999997E-06 X= 1.0000000E+04
The problem here is subtracting nearly equal quantities which are formed by
converting decimal numbers that have infinite binary expansions; when
subtracting nearly equal quantities, the most significant digits cancel
during the subtraction, thus shifting the "noise" toward the most
significant digits. Then dividing by X**2 is like multiplying the "noise"
by a very large number (scaling it up). "Really wrong results" here
illustrate, with very little arithmetic, so-called "significance loss".
Example of an infinite binary expansion: 1/10 in binary is:
1 (binary) divided by 1010 (binary) = 0.000110011001100110011...
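The breakdown above can be reproduced outside Fortran as well. The following Python sketch (an illustration, not the author's code) evaluates the same formula; Python floats are IEEE double precision, so it mirrors the Double Precision runs shown above:

```python
def p(a, x):
    """Evaluate P = ((A+X)**2 - A**2 - 2*A*X) / X**2, algebraically 1 for X /= 0."""
    return ((a + x) ** 2 - a ** 2 - 2.0 * a * x) / x ** 2

# Large A with tiny X: the leading digits of the numerator cancel, and the
# surviving rounding "noise" is scaled up by the division by X**2.
bad = p(10000.0, 0.00001)      # far from the algebraic answer of 1.0

# Roles reversed: X**2 dominates the numerator, so there is no damaging
# cancellation and the computed result is 1.0 to full precision.
good = p(0.00001, 10000.0)
```

The same formula, the same IEEE arithmetic: only the relative magnitudes of the operands decide whether cancellation destroys the answer.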
Why Does Significance Loss Happen?
a) There is a finite set of values that a floating-point number X can take on.
This is because the number of bits used to represent a floating-point number is
finite (and relatively small). The range of floating-point numbers is also finite.
For the PC, please see: http://stevehollasch.com/cgindex/coding/ieeefloat.html
b) Floating-point numbers do not necessarily have an inverse under multiplication;
that is, there are X's in the system for which no Y exists in the system such that
X*Y = 1. In general, the number of floating-point whole numbers which do not have
inverses under multiplication is much larger than the number which do.
c) Arithmetic operations are not necessarily closed on floating-point number
systems. That is, there are X's in the system and Y's in the system for which
X+Y, X-Y, X*Y, and/or X/Y do not belong to the system. For example, X=900001 and
Y=0.825 base 10 both belong to the PC single precision floating-point system,
but none of their sum, difference, product, or quotient belongs to this system.
d) The associative and distributive laws of algebra do not necessarily
hold in floating-point number systems. That is, in all floating-point number
systems there are many X, Y, Z's belonging to the system such that:
(X+Y)+Z <> X+(Y+Z), (X*Y)*Z <> X*(Y*Z), and X*(Y+Z) <> X*Y + X*Z. These
laws of algebra fail to hold because results may be shifted past the capacity
of a floating-point register. This is sometimes called "significance loss".
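Points c) and d) take only a few lines to demonstrate. This Python sketch (an illustration, not from the original article; Python floats are IEEE doubles) shows non-closure and the failure of associativity:

```python
# (c) Non-closure: 0.1, 0.2, and 0.3 each round to nearby doubles, and the
#     computed sum 0.1 + 0.2 is not the double nearest to 0.3.
assert 0.1 + 0.2 != 0.3

# (d) Addition is not associative.  Near 1.0e16 the spacing between adjacent
#     doubles is 2.0, so each separate +1.0 rounds away, while the pre-summed
#     1.0 + 1.0 = 2.0 is exactly one spacing and survives.
x, y, z = 1.0e16, 1.0, 1.0
left = (x + y) + z     # stays at 1.0e16
right = x + (y + z)    # becomes 1.0e16 + 2.0
assert left != right
```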
Recommendations to Help Avoid Numerical Computational Difficulties.
1) Use proven (refereed or commercial) software that is robust
(gives the correct result or, if not, an error message). For example,
programming language libraries, IMSL, NAG, and the free "PACK" libraries
(Linpack, Minpack) available on http://www.netlib.org. Or use higher-level
quality software like Minitab, Matlab, or Mathematica. Also avoid
taking numerical code from textbooks and magazines unless it is
refereed or presented by a proven expert in numerical computations.
2) Study (know) the possible domain of values that each program input
quantity will take on and test each algorithm thoroughly, especially on
the limits of that domain. (This of course is an art in itself).
3) Use at least Double Precision arithmetic. (In more critical cases use
multiple- or quadruple-precision arithmetic; Fortran 90 programmers, see
http://www.nersc.gov/~dhbailey/ and
http://www.ozemail.com.au/~milleraj/ respectively.) Packages like
Matlab and Mathematica use at least Double Precision, sometimes
higher precision.
4) For a given mantissa precision (computer), do not input or output numbers
in ASCII (or EBCDIC) format with more decimal digits than can be handled by the
conversion mapping algorithms for that precision (computer), unless you
document that that is what you are doing.
5) Use relative comparisons in place of absolute comparisons of real
quantities. (Also note that you can use "fuzzy" comparisons; see
http://ftp.aset.psu.edu/pub/ger/fortran/hdk/eps.f90.)
6) Avoid computations which are algebraically undefined (or singular) or which
develop results whose magnitudes approach or exceed machine limits. E.g.,
blindly setting results of underflows to zero can yield "Really Wrong Results".
7) Avoid mixing INTEGER and REAL quantities without knowing how they will
be compiled. E.g., X=X+I**(-J) does not change REAL-typed X whenever J > 0 and
|I| > 1, because I**(-J) is evaluated in INTEGER arithmetic and truncates to zero.
8) Do not program a numerical model beyond your competence to either avoid
significance loss or monitor it. Be especially careful when less than a pre-specified
number of mantissa digits participate in a sum or when all but a few mantissa
digits cancel during a subtraction. You are responsible for disasters caused
by other people depending on programs you write!
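Two of the recommendations above can be sketched briefly in Python (an illustration under stated assumptions, not part of the original article): recommendation 3 taken to its limit, by recomputing the suspect formula in exact rational arithmetic, and recommendation 5, with a hypothetical nearly_equal helper that compares relatively rather than absolutely:

```python
from fractions import Fraction

# Recommendation 3: recompute the article's formula in exact rational
# arithmetic.  The algebraic answer, 1, comes back exactly.
A, X = Fraction(10000), Fraction(1, 100000)
P = ((A + X) ** 2 - A ** 2 - 2 * A * X) / X ** 2
assert P == 1

# Recommendation 5: a relative comparison scales the tolerance with the
# magnitudes being compared, unlike a fixed absolute tolerance.
def nearly_equal(a, b, rel=1e-12):
    return abs(a - b) <= rel * max(abs(a), abs(b))

assert nearly_equal(1.0e16, 1.0e16 + 2.0)   # 2.0 is one spacing at this scale
assert not nearly_equal(1.0e-16, 2.0e-16)   # relatively far apart down here
```

Exact rationals are far too slow for production numerical work, but they are a cheap way to cross-check a small, suspect computation during testing.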
Further Reading: http://www.personal.psu.edu/hdk/fortran.html#Floating-Point
Herman D. (Skip) Knoble, hdk at psu dot edu