Fused multiply-add

From Wikipedia, the free encyclopedia

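In computing, a fused multiply-add (FMA) is a floating-point operation that computes the multiply-accumulate

FMA(A, B, C) = AB + C

with a single rounding: the product AB is formed exactly and only the final sum is rounded to a floating-point number, whereas a separate multiply followed by an add rounds twice.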
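The effect of the single rounding can be illustrated with a small Python sketch. The helper fma below is a hypothetical software model, not a hardware instruction: it assumes finite double-precision inputs and uses exact rational arithmetic so that only one rounding takes place. (Python 3.13 and later also provide math.fma; the model is used here so the sketch does not depend on that version.)

```python
from fractions import Fraction

def fma(a: float, b: float, c: float) -> float:
    """Software model of FMA for finite doubles: exact a*b + c, rounded once."""
    # Fraction arithmetic is exact; float() performs the single final rounding.
    return float(Fraction(a) * Fraction(b) + Fraction(c))

a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

# Exact product is 1 - 2**-54, which rounds to 1.0 as a double.
separate = a * b + c   # two roundings: the product becomes 1.0, so the sum is 0.0
fused = fma(a, b, c)   # one rounding: the exact result -2**-54 is representable

print(separate)
print(fused)
```

In this example the separately rounded computation loses the result entirely, while the fused one returns it exactly.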

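When implemented in a microprocessor, an FMA is typically faster than a multiply operation followed by an add. It also makes it possible to recover the low-order half of a product, that is, the part lost when the product is rounded. Two FMAs yield the rounded product H and the rounding error L, so that H + L equals AB exactly (barring overflow or underflow):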

  • H = FMA(A, B, 0.0)
  • L = FMA(A, B, −H)
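The two steps above can be sketched in Python as follows. The names fma and two_product are hypothetical; fma is a software model that assumes finite doubles and rounds the exact result once, and the check against exact rational arithmetic is only there to confirm that nothing is lost.

```python
from fractions import Fraction

def fma(a: float, b: float, c: float) -> float:
    """Software model of FMA for finite doubles: exact a*b + c, rounded once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

def two_product(a: float, b: float) -> tuple[float, float]:
    """Return (h, l): h is the rounded product, l the rounding error of h."""
    h = fma(a, b, 0.0)   # same value as the ordinary rounded product a * b
    l = fma(a, b, -h)    # exact a*b minus h; representable barring underflow
    return h, l

a, b = 0.1, 0.3
h, l = two_product(a, b)

# Compare against the exact product computed with rationals.
exact = Fraction(a) * Fraction(b)
print(Fraction(h) + Fraction(l) == exact)
```

The pair (h, l) represents the product to roughly twice the working precision, which is the basis of many extended-precision algorithms.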

The FMA instruction is implemented on the PowerPC and Itanium processor families. With such an instruction, a dedicated hardware divide or square root unit is not strictly necessary, since both operations can be implemented in software using FMA.
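As an illustration of how division can be built from FMA, the following Python sketch refines a reciprocal estimate with Newton-Raphson steps and then corrects the quotient once. It is a simplified sketch under stated assumptions, not the instruction sequence used on any particular processor: fma is a hypothetical software model for finite doubles, divide is a hypothetical name, the divisor is assumed finite and nonzero, the starting estimate (48/17 - 32/17 * m) is a choice made for this example, and no claim of correct rounding in every case is made.

```python
import math
from fractions import Fraction

def fma(a: float, b: float, c: float) -> float:
    """Software model of FMA for finite doubles: exact a*b + c, rounded once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

def divide(a: float, b: float, iterations: int = 5) -> float:
    """Approximate a / b using only multiplication and FMA (b finite, nonzero)."""
    m, e = math.frexp(abs(b))                 # abs(b) == m * 2**e, 0.5 <= m < 1
    y = math.ldexp(48 / 17 - 32 / 17 * m, -e) # linear estimate of 1/abs(b)
    y = math.copysign(y, b)                   # restore the sign of the divisor
    for _ in range(iterations):
        err = fma(-b, y, 1.0)                 # 1 - b*y, computed with one rounding
        y = fma(y, err, y)                    # y + y*err: error is roughly squared
    q = a * y                                 # first quotient estimate
    r = fma(-b, q, a)                         # residual a - b*q
    return fma(r, y, q)                       # corrected quotient q + r*y

print(divide(1.0, 3.0), 1.0 / 3.0)
print(divide(355.0, 113.0), 355.0 / 113.0)
```

The FMA matters in the residual steps: computing 1 - b*y or a - b*q with a separately rounded product would cancel away most of the information the iteration depends on.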

A fast FMA can speed up and improve the accuracy of many computations that involve the accumulation of products, such as dot products, matrix multiplication, and polynomial evaluation by Horner's rule.
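A dot product is a typical accumulation of products. The Python sketch below compares an ordinary loop with one that uses an FMA for each step; fma is a hypothetical software model for finite doubles, and dot_naive and dot_fma are names introduced only for this example.

```python
from fractions import Fraction

def fma(a: float, b: float, c: float) -> float:
    """Software model of FMA for finite doubles: exact a*b + c, rounded once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

def dot_naive(xs: list[float], ys: list[float]) -> float:
    """Dot product with a rounded multiply and a rounded add per term."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = acc + x * y
    return acc

def dot_fma(xs: list[float], ys: list[float]) -> float:
    """Dot product with one FMA, and so one rounding, per term."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = fma(x, y, acc)
    return acc

xs = [1.0, 1.0 + 2.0**-30]
ys = [-1.0, 1.0 - 2.0**-30]

# The exact dot product is -1 + (1 - 2**-60) = -2**-60.
print(dot_naive(xs, ys))  # the second product rounds to 1.0, giving 0.0
print(dot_fma(xs, ys))    # the fused step keeps the exact value -2**-60
```

The fused loop still rounds the running sum after every term, so it is not exact in general; it only removes the rounding of each individual product.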

The FMA operation was added to the IEEE 754 standard in its 2008 revision (IEEE 754-2008), which was developed under the name IEEE 754r.