Branch predication

Not to be confused with branch prediction.

Branch predication is a strategy in computer architecture design for mitigating the costs usually associated with conditional branches, particularly branches to short sections of code. It does this by allowing each instruction to conditionally either perform an operation or do nothing.[1]

Overview

Most computer programs contain conditional code, which will be executed only under specific conditions depending on factors that cannot be determined beforehand, for example depending on user input. As the majority of processors simply execute the next instruction in a sequence, the traditional solution is to insert branch instructions that allow a program to conditionally branch to a different section of code, thus changing the next step in the sequence. This was sufficient until designers began improving performance by implementing instruction pipelining, a method which is slowed down by branches. For a more thorough description of the problems which arose, and a popular solution, see branch predictor.

Luckily, one of the more common patterns of code that normally relies on branching has a more elegant solution. Consider the following pseudocode:[1]

if condition
    do this
else
    do that

On a system that uses conditional branching, this might translate to machine instructions looking similar to:[1]

    branch if condition to label 1
    do that
    branch to label 2
label 1:
    do this
label 2:
    ...

With branch predication, all possible branch paths are coded inline, but some instructions execute while others do not. The basic idea is that each instruction is associated with a predicate (the word here used similarly to its usage in predicate logic) and that the instruction will only be executed if the predicate is true. The machine code for the above example using branch predication might look something like this:[1]

(condition) do this
(not condition) do that

Note that besides eliminating branches, the predicated version needs less code in total, provided the architecture offers predicated instructions. This does not guarantee faster execution in general, but execution is faster when the do this and do that blocks of code are short enough.
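The difference between the two translations can be imitated in C. The following is a minimal sketch rather than real predicated machine code: the function names are hypothetical, and the "predicated" version only emulates the idea by evaluating both paths and letting a mask derived from the predicate decide which result takes effect. An optimizing compiler is free to translate either function with or without a branch.

```c
#include <stdint.h>

/* Branching form: control flow picks one path, the other is skipped. */
int32_t select_branching(int condition, int32_t this_value, int32_t that_value)
{
    if (condition)
        return this_value;   /* "do this" */
    else
        return that_value;   /* "do that" */
}

/* Emulated predicated form: no control flow in the source. Both values
 * are consumed, and the predicate decides which one has an effect. */
int32_t select_predicated(int condition, int32_t this_value, int32_t that_value)
{
    /* All ones when the predicate is true, all zeros when it is false. */
    uint32_t mask = (uint32_t)0 - (uint32_t)(condition != 0);

    /* (condition) keep this_value; (not condition) keep that_value. */
    return (int32_t)(((uint32_t)this_value & mask) |
                     ((uint32_t)that_value & ~mask));
}
```

Both functions return the same result for every input. The second one does a fixed amount of work regardless of the condition, which is the trade that predication makes.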

Typically, in order to claim a system has branch predication, most or all of the instructions must have this ability to execute conditionally based on a predicate.

Advantages

The main purpose of predication is to avoid jumps over very small sections of program code, increasing the effectiveness of pipelined execution and avoiding problems with the cache. It also has a number of more subtle benefits:

Disadvantages

Predication's primary drawback is increased encoding space. In typical implementations, every instruction reserves a bitfield for the predicate, specifying under what conditions that instruction should have an effect. When available memory is limited, as on embedded devices, this space cost can be prohibitive. However, some architectures such as Thumb-2 are able to avoid this issue (see below). Other detriments are the following:[2]

Predication is most effective when paths are balanced or when the longest path is the most frequently executed,[2] but determining such a path is very difficult at compile time, even in the presence of profiling information.
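The remark about balanced paths can be made concrete with a toy cost model. This is a sketch under loudly stated assumptions, not a model taken from the cited source: it assumes a machine wide enough to issue both predicated paths side by side, so that a predicated if/else always costs as much as its longer path, and it measures waste as the gap between that cost and the average length of the path that was actually needed. All function and parameter names are hypothetical.

```c
/* Toy cost model. Every number fed to it is an assumption, not a
 * measurement of any real processor. */

/* Cost of a predicated if/else, assuming both paths issue in parallel:
 * the longer path determines the cost. */
double predicated_cycles(double then_cycles, double else_cycles)
{
    return (then_cycles > else_cycles) ? then_cycles : else_cycles;
}

/* Average length of the path whose results are actually kept, where
 * p_then is the probability that the "then" path is taken. */
double useful_cycles(double p_then, double then_cycles, double else_cycles)
{
    return p_then * then_cycles + (1.0 - p_then) * else_cycles;
}

/* Cycles spent, on average, on work whose results are discarded. */
double wasted_cycles(double p_then, double then_cycles, double else_cycles)
{
    return predicated_cycles(then_cycles, else_cycles)
         - useful_cycles(p_then, then_cycles, else_cycles);
}
```

Under these assumptions, two balanced paths of 4 cycles each waste nothing. With an 8-cycle path and a 2-cycle path, the waste is 1.5 cycles when the long path is taken 75% of the time and 4.5 cycles when it is taken only 25% of the time.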

History

Predicated instructions were popular in European computer designs of the 1950s, including the Mailüfterl (1955), the Zuse Z22 (1955), the ZEBRA (1958), and the Electrologica X1 (1958). The IBM ACS-1 design of 1967 allocated a "skip" bit in its instruction formats, and the CDC Flexible Processor in 1976 allocated three conditional execution bits in its microinstruction formats.

In Intel's IA-64 architecture, almost every instruction is predicated. The predicates themselves are stored in special-purpose registers; one of the predicate registers is always true, so that unpredicated instructions are simply instructions predicated with the value true. The use of predication is essential in the IA-64 implementation of software pipelining because it avoids the need to write separate code for prologs and epilogs.

In x86-64 architectures from both Intel and AMD, branch predication may be performed through the use of conditional move instructions (the CMOVcc family, such as cmove): a source operand is moved to the destination operand only if the condition encoded in the instruction holds for the current state of the flags register.
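Small selections like the ones below are the kind of source code for which x86-64 compilers often choose a conditional move instead of a branch. This is a sketch with hypothetical function names; whether a conditional move is actually emitted depends on the compiler, its version, and the optimization level, and the assembly in the comment is an illustration, not captured compiler output.

```c
/* Returns the larger of a and b.
 *
 * One plausible branch-free translation (Intel syntax, a in edi and
 * b in esi, shown for illustration only):
 *
 *     cmp   edi, esi      ; compare a with b, setting the flags
 *     mov   eax, esi      ; result = b (mov leaves the flags intact)
 *     cmovg eax, edi      ; if a > b, result = a
 *     ret
 */
int max_of(int a, int b)
{
    return (a > b) ? a : b;
}

/* Clamps value into [low, high] with two selections and no if/else.
 * Assumes low <= high. */
int clamp_to_range(int value, int low, int high)
{
    int at_least_low = (value < low) ? low : value;
    return (at_least_low > high) ? high : at_least_low;
}
```

Because a conditional move only selects between values that are already computed, it covers short selections like these but not arbitrary conditional code.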

In the 32-bit ARM architecture, almost all instructions can be conditionally executed. Thirteen different predicates are available, each depending on the four flags Carry, Overflow, Zero, and Negative in some way. ARM's 16-bit Thumb instruction set has no branch predication, in order to save encoding space, but its successor Thumb-2 overcomes this problem using a special instruction which has no effect other than to supply predicates for up to four following instructions. The 64-bit version of the ARM architecture does not support branch predication.

References

  1. Rick Vinyard (2000-04-26). "Predication". cs.nmsu.edu. Retrieved 2014-04-22.
  2. Joseph A. Fisher, Paolo Faraboschi, Cliff Young (2004). Embedded Computing - A VLIW Approach to Architecture, Compilers, and Tools. Page 172.

This article is issued from Wikipedia - version of the Sunday, January 10, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.