Extended precision

From Wikipedia, the free encyclopedia

The term "extended precision" refers to storage formats for floating-point numbers that take advantage of an opportunity lying outside the regular sequence of single, double and quadruple precision, that is, 32-bit, 64-bit and 128-bit formats occupying two, four or eight 16-bit storage words (or a similar accounting). By contrast, arbitrary-precision arithmetic refers to implementations of much larger numeric types, with a storage size that usually is not a power of two.

The most common extended-precision format is currently the 80-bit format first used in the Intel 8087 math coprocessor (and later in the Motorola 68881), which has since become ubiquitous on the x86 architecture. This 80-bit format conforms to the requirements of the IEEE floating-point standard for a double extended format and uses 64 bits for the significand, 15 bits for the exponent field and one bit for the sign of the significand. The exponent field is biased by 16383 (that is, exponent value = exponent field - 16383), and the exponent field value 32767 (all fifteen bits set to 1) is reserved to represent special states such as Infinity and Not a Number. For historical reasons, this format has no implicit (hidden) bit: the leading significand bit is stored explicitly, and the Intel 8087 used it to suppress the normalization of subnormal numbers in certain cases.
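The field layout described above can be illustrated with a minimal decoding sketch. The helper names pack_fields and decode_extended are hypothetical, the byte order is assumed to be little-endian as in x86 memory, and the handling of the reserved exponent value is deliberately simplified (the 8087's pseudo-encodings and the sign of zero are ignored).

```python
from fractions import Fraction

EXPONENT_BIAS = 16383        # offset given in the text
EXPONENT_RESERVED = 0x7FFF   # exponent field with all fifteen bits set


def pack_fields(sign: int, exponent_field: int, significand: int) -> bytes:
    """Assemble sign, exponent field and significand into 10 bytes.

    Hypothetical helper; little-endian byte order is an assumption.
    """
    if sign not in (0, 1):
        raise ValueError("sign must be 0 or 1")
    if not 0 <= exponent_field <= EXPONENT_RESERVED:
        raise ValueError("exponent field must fit in 15 bits")
    if not 0 <= significand < (1 << 64):
        raise ValueError("significand must fit in 64 bits")
    bits = (sign << 79) | (exponent_field << 64) | significand
    return bits.to_bytes(10, "little")


def decode_extended(raw: bytes):
    """Decode 10 bytes into an exact Fraction, or a float inf/nan."""
    if len(raw) != 10:
        raise ValueError("expected exactly 10 bytes")
    bits = int.from_bytes(raw, "little")
    significand = bits & ((1 << 64) - 1)
    exponent_field = (bits >> 64) & EXPONENT_RESERVED
    sign = -1 if bits >> 79 else 1

    if exponent_field == EXPONENT_RESERVED:
        # Simplification: only the low 63 bits decide between
        # Infinity and Not a Number; the explicit bit is ignored.
        if significand & ((1 << 63) - 1):
            return float("nan")
        return sign * float("inf")

    # A zero exponent field (zero and subnormals) uses the same
    # scale as an exponent field of 1.
    exponent = max(exponent_field, 1) - EXPONENT_BIAS
    # Bit 63 is the explicit integer bit, so the 64-bit significand
    # is an integer scaled by 2**-63. There is no hidden bit to add.
    return sign * significand * Fraction(2) ** (exponent - 63)


one = pack_fields(0, 16383, 1 << 63)
minus_two_and_a_half = pack_fields(1, 16384, 0xA000000000000000)
print(decode_extended(one))                   # 1
print(decode_extended(minus_two_and_a_half))  # -5/2
```

Exact rational arithmetic is used here so that the result does not depend on the host's own floating-point format, which typically has fewer significand bits than the value being decoded.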

The opportunity arises because, on the one hand, a number format should fit into a whole number of storage words yet must pack in both the significand (mantissa) and the exponent, and, on the other hand, the hardware performing the arithmetic will almost certainly also be used for integer arithmetic, where there is no exponent part. Thus the 64-bit double-precision format provides 53 bits of significand precision (52 stored bits plus one implicit leading bit), 11 bits for the exponent and one sign bit, while the same hardware also supports 64-bit integer arithmetic. With trivial additional circuitry it could therefore be generalised to perform floating-point arithmetic on a significand of 64 bits, and this additional precision might be useful. But a 64-bit significand leaves no room for the exponent and sign, so an additional 16-bit word of storage is taken, for a total of 80 bits or five words. Subtle differences in the behaviour of the arithmetic are a side effect in this case.
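The practical effect of widening the significand from 53 to 64 bits can be sketched with integer arithmetic alone. The function round_to_significand below is a hypothetical helper that rounds a non-negative integer to a given number of significant bits, assuming round-half-to-even as the rounding rule. It is an illustration of the precision difference, not a model of any particular hardware.

```python
def round_to_significand(n: int, bits: int) -> int:
    """Round a non-negative integer to `bits` significant bits.

    Ties are rounded to the even neighbour (an assumed rounding mode).
    """
    if n < 0 or bits < 1:
        raise ValueError("n must be non-negative and bits positive")
    excess = n.bit_length() - bits
    if excess <= 0:
        return n  # already fits, so it is represented exactly
    kept = n >> excess                    # the leading `bits` bits
    remainder = n & ((1 << excess) - 1)   # the bits that are dropped
    half = 1 << (excess - 1)
    if remainder > half or (remainder == half and kept & 1):
        kept += 1
    return kept << excess


n = 2**53 + 1                              # needs 54 significant bits
as_double = round_to_significand(n, 53)    # 53-bit significand
as_extended = round_to_significand(n, 64)  # 64-bit significand

print(as_double == 2**53)   # True: the low bit is lost
print(as_extended == n)     # True: the value fits exactly
```

On platforms where Python's float is a 64-bit double, int(float(n)) gives the same result as as_double here, which is one way to cross-check the sketch.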

80-bit floating-point hardware was introduced well after the development of C and is not supported on as many platforms as the smaller formats. As a result, although it is available in many cases, 32-bit and 64-bit floating point are more commonly used. (On the x86 architecture, most C compilers support 80-bit extended precision via the long double type.)

Older machines used a variety of formats. On the IBM 1130, extended precision referred to a floating-point format with a 32-bit significand (corresponding to the 32-bit two's complement integer arithmetic of the CPU), an extension of the normal 24-bit significand of its standard 32-bit floating-point format. Floating-point arithmetic operations were performed in software, and double precision was not supported at all. The extended format occupied three 16-bit words, with the extra space simply ignored.

See also