DIGITS -- the significant digits of floating point numbers

The environment variable DIGITS determines the number of significant decimal digits in floating point numbers. The default value is DIGITS = 10.

Calls:
DIGITS
DIGITS := n

Parameters:
n - a positive integer smaller than 2^31.

Related functions:
float, Pref::floatFormat, Pref::trailingZeroes

Floating point numbers are created by applying the function float to exact numbers or numerical expressions. Elementary objects are approximated by the resulting floats with a relative precision of 10^(-DIGITS), i.e., the first DIGITS decimal digits are correct. Cf. example 1.

In arithmetical operations with floating point numbers, only the first DIGITS decimal digits are taken into account. The numerical error propagates and may grow in the course of computations. Cf. example 2.

If a floating point number is entered directly (e.g., x := 1.234), a number with at least DIGITS internal decimal digits is created. Note, however, that a conversion error may occur, because the internal representation is binary.

If a real float is entered with more than DIGITS digits, the internal representation stores the extra digits. However, they are not taken into account in arithmetical operations unless DIGITS is increased accordingly. Cf. example 3.
In particular, complex floating point numbers are created by adding the real and imaginary parts. This addition truncates extra decimal places in the real and imaginary parts.

DIGITS may be changed at any time during a computation. If DIGITS is decreased, only the leading digits of existing floating point numbers are taken into account in the following arithmetical operations. If DIGITS is increased, existing floating point numbers are internally padded with trailing binary zeroes. Cf. example 4.

Depending on the value of DIGITS, certain functions such as the trigonometric functions may reject floats as too inaccurate and stop with an error. Cf. example 5.

Depending on the value of DIGITS, only significant digits of floating point numbers are displayed on the screen. The preferences Pref::floatFormat and Pref::trailingZeroes can be used to modify the screen output. Cf. example 4.
At least one digit after the decimal point is displayed; if it is insignificant, it is replaced by zero. Cf. example 6.

Internally, floating point numbers are stored with additional guard digits. For example, for DIGITS = 10, the function float converts exact numbers to floats with about 19 decimal digits. The number of guard digits depends on DIGITS. For example, for all DIGITS from 10 through 19, the same internal representation of about 19 decimal digits is used. In particular, there is no guard digit for DIGITS = 19. Cf. examples 4 and 7.

Environment variables such as DIGITS are global variables. Upon return from a procedure that changes DIGITS, the new value is valid outside the context of the procedure as well! Use save DIGITS to restrict the modified value of DIGITS to the procedure. Cf. example 8.

The default value of DIGITS is 10; DIGITS has this value after starting or resetting the system via reset. The command delete DIGITS also restores the default value.

See the help page of float for further information.

Example 1. We convert some exact numbers and numerical expressions to floating point approximations:
>> DIGITS := 10: float(PI), float(1/7), float(sqrt(2) + exp(3)), float(exp(-20))
3.141592654, 0.1428571429, 21.49975049, 0.000000002061153622
>> DIGITS := 20: float(PI), float(1/7), float(sqrt(2) + exp(3)), float(exp(-20))
3.1415926535897932385, 0.14285714285714285714, 21.49975048556076279, 0.000000002061153622438557828
>> delete DIGITS:

Example 2. We illustrate error propagation in numerical computations. The following rational number approximates exp(2) to 17 decimal digits:
>> r := 738905609893065023/100000000000000000:
The following float call converts exp(2) and r to floating point approximations. The approximation errors propagate and are amplified in the following numerical expression:
>> DIGITS := 10: float(10^20*(r - exp(2)))
320.0
None of the digits in this result is correct. A better result is obtained by increasing DIGITS:
>> DIGITS := 20: float(10^20*(r - exp(2)))
276.95725394785404205
>> delete r, DIGITS:

Example 3. In the following, only 10 of the entered 30 digits are regarded as significant. The extra digits are nevertheless stored internally:
>> DIGITS := 10: a := 1.23456789666666666666666666666; b := 1.23456789444444444444444444444
1.234567897
1.234567895
We increase DIGITS. Because the internal representations of a and b are correct to 30 decimal places, the difference can be computed correctly to 20 decimal places:
>> DIGITS := 30: a - b
0.00000000222222222222222222222
>> delete a, b, DIGITS:

Example 4. We compute a floating point number with a precision of 10 digits. Internally, this number is stored with about 9 guard digits, i.e., with about 19 correct digits. Increasing DIGITS to 30, the correct guard digits become visible. The remaining 11 decimal digits are created by padding the internal representation with binary zeroes. In the output, the internal representation is converted into a decimal representation. This converts the trailing binary zeroes to 11 nontrivial decimal digits. With the call Pref::trailingZeroes(TRUE), trailing zeroes of the decimal representation become visible:
>> DIGITS := 10: a := float(1/9)
0.1111111111
>> Pref::trailingZeroes(TRUE): DIGITS := 30: a
0.111111111111111111109605274000
>> Pref::trailingZeroes(FALSE): delete a, DIGITS:

Example 5. For the float evaluation of the sine function, the argument is reduced to the standard interval [0, 2*PI]. For this reduction, the argument must be known to some digits after the decimal point. For small DIGITS, the digits after the decimal point are pure round-off if the argument is a large floating point number:
>> DIGITS := 10: sin(float(2*10^20))
0.9576594803
Increasing DIGITS to 50, the argument of the sine function has about 30 correct digits after the decimal point. The first 30 digits of the following result are reliable:
>> DIGITS := 50: sin(float(2*10^20))
-0.9859057707420871849896773829691365946134713391129
For very large floating point arguments, MuPAD's trigonometric functions produce errors if DIGITS is not large enough:
>> DIGITS := 10: sin(float(2*10^30))
Error: Loss of precision; during evaluation of 'sin'
>> DIGITS := 50: sin(float(2*10^30))
0.17950046751493908795061771243112520647287791588203
>> delete DIGITS:

Example 6. At least one digit after the decimal point is always displayed. In the following example, the number 3.9 is displayed as 3.0 to indicate that the digit 9 after the decimal point is not significant:
>> DIGITS := 1: float(PI), 3.9, -3.2
3.0, 3.0, -3.0
>> delete DIGITS:

Example 7. We compute float(10^40*8/9) with various values of DIGITS. Rounding takes into account all guard digits, i.e., the resulting integer makes all guard digits visible:
>> for DIGITS in [9, 10, 11, 19, 20, 21, 28, 29, 30] do print("DIGITS" = DIGITS, round(float(10^40*8/9))) end_for:
"DIGITS" = 9, 8888888887243627086483687557525021917184
"DIGITS" = 10, 8888888888888888888303079319556646240256
"DIGITS" = 11, 8888888888888888888303079319556646240256
"DIGITS" = 19, 8888888888888888888303079319556646240256
"DIGITS" = 20, 8888888888888888888888888888827804909568
"DIGITS" = 21, 8888888888888888888888888888827804909568
"DIGITS" = 28, 8888888888888888888888888888827804909568
"DIGITS" = 29, 8888888888888888888888888888888888888864
"DIGITS" = 30, 8888888888888888888888888888888888888864
The results show that the internal representation coincides for values of DIGITS between 10 and 19. Increasing DIGITS to 20 leads to an extended internal representation which is constant through DIGITS = 28. From DIGITS = 29 on, a yet more extended internal representation is used, etc.

Example 8. The following procedure computes numerical approximations with a specified precision without changing DIGITS as a global variable. Internally, DIGITS is set to the desired precision and the float approximation is computed. Because of save DIGITS, the value of DIGITS is not changed outside the procedure:
>> Float := proc(x, digits) save DIGITS; begin DIGITS := digits: float(x); end_proc:
The float approximation of the following value x suffers from numerical cancellation. In particular, for DIGITS = 9 no internal guard digits are available, and the value computed by float has only 3 correct leading digits. Float is used to approximate x with 30 digits. The result is displayed with only 9 digits because of the value DIGITS = 9 valid outside the procedure. However, all displayed digits are correct:
>> x := PI^7 - exp(80131/10000): DIGITS := 9: float(x), Float(x, 30)
0.0277910233, 0.0277894265
>> delete Float, x, DIGITS:
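For contrast, here is a minimal sketch of the same procedure without save DIGITS. The name LeakyFloat is hypothetical and not part of MuPAD. Going by the description of DIGITS as a global variable, the last statement of the second line should report the value 30 set inside the procedure, not the value 10 set before the call:
>> /* LeakyFloat: hypothetical helper for illustration; it omits save DIGITS on purpose */
>> LeakyFloat := proc(x, digits) begin DIGITS := digits: float(x); end_proc:
>> DIGITS := 10: LeakyFloat(PI, 30): DIGITS
>> delete LeakyFloat, DIGITS: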

If a float x has been created with high precision, and the computation is to continue at a lower precision, the easiest method to get rid of memory-consuming insignificant digits is x := x + 0.0.

Changes of DIGITS inside a procedure now are valid outside the procedure as well, unless it is implemented with save DIGITS.
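
The idiom x := x + 0.0 described above might be used as in the following minimal sketch. The precisions 100 and 10 and the choice of PI are assumptions made only for illustration. After DIGITS is lowered, the addition is carried out at the current precision, so the reassigned x should no longer carry the digits that have become insignificant:
>> /* illustrative values only: compute at high precision first */
>> DIGITS := 100: x := float(PI):
>> /* continue at lower precision and drop the extra digits */
>> DIGITS := 10: x := x + 0.0:
>> delete x, DIGITS: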