Arithmetic shift

From Wikipedia, the free encyclopedia

A left arithmetic shift of a binary number by 1.  The empty position in the least significant bit is filled with a zero.  (See text for exceptions.)
Enlarge
A left arithmetic shift of a binary number by 1. The empty position in the least significant bit is filled with a zero. (See text for exceptions.)
A right arithmetic shift of a binary number by 1.  The empty position in the most significant bit is filled with a copy of the original MSB.
Enlarge
A right arithmetic shift of a binary number by 1. The empty position in the most significant bit is filled with a copy of the original MSB.
Arithmetic shift operators in various programming languages
Language Left Right
VHDL sla1 sra
Verilog <<< >>>2
Java <<3 >>
OpenVMS macro language @4

In computer programming, an arithmetic shift is a shift operator, sometimes known as a signed shift (albeit that it is not restricted to signed operands). For binary numbers it is is a bitwise operation that shifts all of the bits of its operand; every bit in the operand is simply moved a given number of bit positions, and the vacant bit-positions are filled in. What makes an arithmetic shift different from a logical shift is what the empty bit positions are filled with. Arithmetic shifts, in particular arithmetic right shifts, have several traps, and are often described erroneously as being equivalent to multiplication or division, a fact that has led to bugs in more than one compiler.

The formal definition of an arithmetic shift, from Federal Standard 1037C is that it is:

A shift, applied to the representation of a number in a fixed radix numeration system and in a fixed-point representation system, and in which only the characters representing the fixed-point part of the number are moved. An arithmetic shift is usually equivalent to multiplying the number by a positive or a negative integral power of the radix, except for the effect of any rounding; compare the logical shift with the arithmetic shift, especially in the case of floating-point representation.

An important word in the FS 1073C definition is "usually". Arithmetic left shifts are equivalent to multiplication by a (postive, integral) power of the radix (e.g. a multiplication by a power of 2 for binary numbers). Arithmetic left shifts are, with one exception, identical in effect to logical left shifts. The exception is the minor trap that arithmetic shifts may trigger arithmetic overflow whereas logical shifts do not. However, arithmetic right shifts are major traps for the unwary.

It is frequently stated that arithmetic right shifts are equivalent to division by a (positive, integral) power of the radix (e.g. a division by a power of 2 for binary numbers), and hence that division by a power of the radix can be optimized by implementing it as an arithmetic right shift. (A shifter is simpler than a divider. On many older processors, shift instructions would execute more quickly than division instructions.) Steele quotes a large number of 1960s and 1970s programming handbooks, manuals, and other specifications from companies and institutions such as DEC, IBM, Data General, and ANSI that make such statements. However, as Steele points out, they are all wrong.

Arithmetic right shifts are only equivalent to division by a power of the radix on an "N-1's-complement" machine (for radix "N"). Arithmetic shifts of binary numbers are only equivalent to division by a power of 2, when the one's complement representation of signed numbers is being used, for example.

This description has been erroneously brought over from the older one's complement architectures to newer two's complement architectures. But with two's complement binary number representations, arithmetic right shift is not equivalent to division by a power of 2. For negative numbers, the equivalence breaks down. The most trivial example of this is the arithmetic right shift of the number -1 (which is represented as all ones) in a two's complement representation, which yields -1.

The (1999) ISO standard for the C programming language defines the C language's right shift operator in terms of divisions by powers of 2. Because of the aforementioned non-equivalence, the standard explicitly excludes from that definition the right shifts of signed numbers that have negative values. It doesn't specify the behaviour of the right shift operator in such circumstances, but instead requires each individual C compiler to specify the behaviour of shifting negative values right.5

On an "N's-complement" architecture (for radix "N") an arithmetic shift is equivalent to a division that rounds towards negative infinity, not towards zero. Donald Knuth describes this in terms of a floor function.

[edit] Footnotes

  • Note 1: The VHDL arithmetic left shift operator is unusual. Instead of filling the LSB of the result with zero, it copies the original LSB into the new LSB. Whilst this is an exact mirror image of the arithmetic right shift, whereas the conventional definition of the operator is not, it is not the conventional definition of the operator, and is not equivalent to multiplication by a power of 2.
  • Note 2: The Verilog arithmetic right shift operator only actually performs an arithmetic shift if the first operand is signed. If the first operand is unsigned, the operator actually performs a logical right shift.
  • Note 3: Despite this being described as a "signed shift", this operator in Java is in fact a logical left shift operator. Java does not provide distinct arithmetic and logical left shift operators, and does not provide an arithmetic shift left capable of detecting integer overflow.
  • Note 4: In the OpenVMS macro language whether an arithmetic shift is a left or a right shift is determined by whether the second operand is positive or negative. This is unusual. In most programming languages the two directions have distinct operators, with the operator specifying the direction, and the second operand is implicitly positive. (Some languages, such as Verilog, require that negative values be converted to unsigned positive values. Some languages, such as C and C++, do not have defined behaviours if negative values are used.)
  • Note 5: The C standard was intended to not restrict the C language to either one's complement or two's complement architectures. In cases where the behaviours of one's complement and two's complement representations differ, such as this, the standard requires individual C compilers to document the actual behaviour of their target architectures.

[edit] References

This article contains material from the Federal Standard 1037C, which, as a work of the United States Government, is in the public domain.