
2.5 Floating Point

 

Floating point numbers are commonly used to approximate real numbers. Floating point facilities are common in computer hardware so most floating point operations can be performed very quickly on computers.

There are many different floating point number systems [5, 49, 50, 35], although they are all very similar. A floating point number can be written as:

\[ a \times b^{c} \]

where a, b, and c are all in a finite subdomain of the integers.

All of the numbers in a particular floating point number system can be specified with a single choice of b. The set of floating point numbers with b=2 is denoted by tex2html_wrap_inline32323 . tex2html_wrap_inline32323 is the system of choice for computer implementations since a and c are usually stored in binary.
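As a small illustration of writing a number as an integer a times 2 raised to an integer c, the following Python sketch recovers such a pair for a machine float. It assumes the host's floats are IEEE 754 doubles with 53 significant bits; the helper name `decompose` is hypothetical and is not part of the original presentation.

```python
import math

def decompose(x):
    """Return integers (a, c) with x == a * 2**c exactly.

    Assumes x is finite and that Python floats are IEEE 754 doubles.
    """
    if x == 0.0:
        return 0, 0
    m, e = math.frexp(x)          # x == m * 2**e with 0.5 <= |m| < 1
    a = int(math.ldexp(m, 53))    # scale the fraction up to an integer
    c = e - 53
    # Strip trailing zero bits so that |a| is as small as possible.
    while a % 2 == 0:
        a //= 2
        c += 1
    return a, c

print(decompose(6.0))    # (3, 1), since 6 == 3 * 2**1
print(decompose(-0.5))   # (-1, -1), since -0.5 == -1 * 2**-1
```

The pair returned is not unique (a could be doubled while c is decremented); the sketch simply picks the odd a.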

Implementations usually represent a and c in a fixed number of bits. A common example is IEEE 754 [5] 64-bit double precision, where a is stored in 53 bits (fifty-two bits for the magnitude, one for the sign) while c is stored in 11 bits (using a biased binary representation). Such a system is compactly expressed as tex2html_wrap_inline32339; two exponent values are reserved to indicate non-normalized numbers. The floating point operations described below are required in IEEE 754 compliant numerical libraries.
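The bit layout just described can be inspected directly. The sketch below is only an illustration, assuming Python's `struct` module packs a float as an IEEE 754 double; the function name `double_fields` is hypothetical.

```python
import struct

def double_fields(x):
    """Split a double into its (sign, biased_exponent, fraction) bit fields."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                        # 1 bit
    biased_exponent = (bits >> 52) & 0x7FF   # 11 bits
    fraction = bits & ((1 << 52) - 1)        # 52 bits
    return sign, biased_exponent, fraction

print(double_fields(1.0))           # (0, 1023, 0): the exponent field holds the bias
print(double_fields(-2.0))          # (1, 1024, 0)
print(double_fields(0.0))           # (0, 0, 0): all-zeros exponent is reserved
print(double_fields(float("inf")))  # (0, 2047, 0): all-ones exponent is reserved
```

The last two lines show the two reserved exponent values, 0 and 2047, leaving 2046 exponent values for normalized numbers.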

Formally, the system tex2html_wrap_inline32341 includes all numbers which may be expressed as $a \times b^{c}$ and satisfy:

math6246

where a and c are integers. The subtraction present in the right conjunct shifts the "decimal place" so as to relate the exponent range with unity, rather than tex2html_wrap_inline32349.

Another view of the floating point numbers is to imagine the numbers of tex2html_wrap_inline32341 as being described by A base-b digits, multiplied by b raised to an exponent between m and M:

math6253

Both describe the same system of numbers. The former description builds upon the preceding number systems, while the latter gels with one's common experience of performing calculations. In the latter view the relation between tex2html_wrap_inline32363 and tex2html_wrap_inline32341 is clearer, as are other important floating point concepts, such as the distinction between normalized numbers, where tex2html_wrap_inline32367, and denormalized numbers, where tex2html_wrap_inline32369.
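The normalized/denormalized distinction can be observed on a typical machine. The following sketch assumes IEEE 754 doubles, in which a nonzero number with an all-zeros exponent field is denormalized; the helper `is_denormalized` is a hypothetical name introduced here for illustration.

```python
import struct
import sys

def is_denormalized(x):
    """True if x is a nonzero double whose biased exponent field is zero."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return x != 0.0 and (bits >> 52) & 0x7FF == 0

smallest_normal = sys.float_info.min         # 2**-1022 for IEEE 754 doubles
print(is_denormalized(1.0))                  # False
print(is_denormalized(smallest_normal))      # False
print(is_denormalized(smallest_normal / 2))  # True: below the normalized range
```

Halving the smallest normalized number is still exact here, but the result can no longer keep a nonzero leading digit, so it is stored in denormalized form.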

Throughout this presentation the exact details of the underlying floating point system will not be important, so tex2html_wrap_inline32371 will be used to denote any particular floating point system. The exact format used to store floating point numbers does not concern us. The meticulous reader is encouraged to read one of [x,y,z] for details omitted in this brief exposé of floating point. We use tex2html_wrap_inline32373 for numerical examples.


Jeff Tupper, March 1996