
2.5.3 Rounding

Floating point numbers approximate real numbers. Operations with floating point numbers approximate corresponding operations with real numbers. Consider the following addition operation:

math6274

Both tex2html_wrap_inline32409 and tex2html_wrap_inline32411 are members of tex2html_wrap_inline32413 ; tex2html_wrap_inline32415 is not.
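As a minimal sketch of this situation (not the operands of the equation above), the following Python snippet uses two hypothetical values, a = 1 and b = 2^-60. It assumes Python's `float` is an IEEE 754 double, and uses exact rational arithmetic to check which of the values involved are floating point numbers.

```python
from fractions import Fraction

# Hypothetical operands chosen for illustration; both are exactly
# representable as IEEE 754 doubles.
a = 1.0
b = 2.0 ** -60

# Fraction(float) yields the exact rational value a float represents.
exact_sum = Fraction(a) + Fraction(b)   # the implied real result
float_sum = a + b                       # the rounded floating point result


def is_representable(q: Fraction) -> bool:
    """Return True if the rational q is exactly equal to some double."""
    return Fraction(float(q)) == q


print(is_representable(Fraction(a)), is_representable(Fraction(b)))
print(is_representable(exact_sum))
print(float_sum == a)
```

Both operands pass the check while their exact sum does not, so the addition has to return some nearby floating point number instead.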

When the implied real result of a floating point operation is not a floating point number, the result is rounded to a floating point number. The most common form of rounding is "rounding to nearest", where the result is rounded to the nearest floating point number. Using such rounding, the previous example would result in:

math6290

Another form of rounding is "upward rounding", where the result is rounded up to the next larger floating point number. If the result is positive, it is rounded away from zero; if the result is negative, it is rounded towards zero. Using such rounding, the previous example would result in:

math6302

Another form of rounding is "downward rounding", where the result is rounded down to the next smaller floating point number. If the result is positive, it is rounded towards zero; if the result is negative, it is rounded away from zero. Using such rounding, the previous example would result in:

math6313
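The two directed roundings can be emulated for addition without changing the hardware rounding mode. The sketch below is one possible approach, not the method used by any particular library: it compares the round-to-nearest sum against the exact rational sum and steps to the neighbouring float when needed. The function name `add_down_up` is hypothetical, and `math.nextafter` requires Python 3.9 or later.

```python
import math
from fractions import Fraction


def add_down_up(a: float, b: float) -> tuple[float, float]:
    """Return (downward-rounded, upward-rounded) results of a + b.

    Assumes finite inputs and a sum that does not overflow.
    """
    nearest = a + b                       # round-to-nearest result
    exact = Fraction(a) + Fraction(b)     # implied real result
    if Fraction(nearest) == exact:
        # The real result is a floating point number: no rounding needed.
        return nearest, nearest
    if Fraction(nearest) < exact:
        # Nearest rounded down, so the upward result is the next float up.
        return nearest, math.nextafter(nearest, math.inf)
    # Nearest rounded up, so the downward result is the next float down.
    return math.nextafter(nearest, -math.inf), nearest


down, up = add_down_up(0.1, 0.2)
print(down, up)
```

When the real result is not representable, the two values returned are adjacent floating point numbers that enclose it.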

Numerical libraries provide three forms of rounding: rounding to nearest, upward rounding, and downward rounding. The default mode is rounding to nearest; when an explicit rounding mode is not specified, as in the earlier examples, rounding to nearest is assumed.
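As one illustration of a library exposing these three modes, Python's standard `decimal` module offers `ROUND_HALF_EVEN`, `ROUND_CEILING`, and `ROUND_FLOOR`. Note the assumptions: `decimal` works in base 10 rather than the binary formats discussed here, and the 3-digit precision, the operands, and the helper name `add_rounded` are chosen only for this sketch.

```python
from decimal import (
    Context,
    Decimal,
    ROUND_CEILING,
    ROUND_FLOOR,
    ROUND_HALF_EVEN,
)

# Hypothetical operands; their exact sum 1.23456 needs six digits.
x, y = Decimal("1.23"), Decimal("0.00456")


def add_rounded(rounding: str, a: Decimal, b: Decimal) -> Decimal:
    """Add a and b, rounding the result to 3 significant digits."""
    ctx = Context(prec=3, rounding=rounding)
    return ctx.add(a, b)


# Nearest, upward, and downward rounding, for a positive and a negative sum.
for mode in (ROUND_HALF_EVEN, ROUND_CEILING, ROUND_FLOOR):
    print(mode, add_rounded(mode, x, y), add_rounded(mode, -x, -y))
```

The positive sum moves away from zero only under upward rounding, and the negative sum moves away from zero only under downward rounding, matching the descriptions above.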

Although IEEE 754 requires that the results of the algebraic operators +, -, ×, ÷, and √ be correctly rounded (to the nearest floating point number under the default mode), other operators are not so favoured. The following example illustrates what can happen with operators whose results are not guaranteed to be accurate to within one ULP (Unit in the Last Place). With a sin implementation that is only guaranteed to be accurate to within 40 ULPs, the following may occur:

math6326

The actual value, tex2html_wrap_inline32439 , is bracketed by tex2html_wrap_inline32441 and tex2html_wrap_inline32443 . These brackets may be widely separated; with our example sine implementation they may differ by up to 80 ULPs. The result using "rounding to nearest" only guarantees that the true result will fall within the bracketed region.
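The width of such a bracket can be sketched by stepping through adjacent floating point numbers. In the snippet below, the 40 ULP bound mirrors the hypothetical sine implementation in the text; it is not a claim about the accuracy of Python's `math.sin`. The helper names are illustrative, counting adjacent floats is only one of several conventions for measuring ULPs, and `math.nextafter` requires Python 3.9 or later.

```python
import math


def step(y: float, n: int) -> float:
    """Move |n| adjacent floats away from y, upward if n > 0, else downward."""
    direction = math.inf if n > 0 else -math.inf
    for _ in range(abs(n)):
        y = math.nextafter(y, direction)
    return y


def error_bracket(y: float, max_ulps: int) -> tuple[float, float]:
    """Bracket around a result y whose error is at most max_ulps."""
    return step(y, -max_ulps), step(y, max_ulps)


def floats_between(lo: float, hi: float) -> int:
    """Count the steps between adjacent floats needed to go from lo to hi."""
    count = 0
    while lo < hi:
        lo = math.nextafter(lo, math.inf)
        count += 1
    return count


y = math.sin(1.0)
lo, hi = error_bracket(y, 40)
print(lo, hi, floats_between(lo, hi))
```

A bound of 40 ULPs on either side yields endpoints that are 80 floating point steps apart, which is the separation quoted above.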

Using real numbers directly in computations is currently infeasible. Floating point numbers are commonly used because of their computational advantages. Unfortunately, rounding causes the result returned to be inexact.


Jeff Tupper, March 1996