Branch table

From Wikipedia, the free encyclopedia

In computer programming, a branch table (sometimes known as a jump table) is a term used to describe an efficient method of transferring program control (branching) to another part of a program using a table of branch instructions. The branch table construction is commonly used when programming in assembly language but may also be generated by a compiler.

A branch table consists of a serial list of unconditional branch instructions that is branched into using an offset created by multiplying a sequential index by the instruction length (the number of bytes in memory occupied by each branch instruction). It makes use of the fact that machine code instructions for branching have a fixed length and can be executed extremely efficiently by most hardware, and is most useful when dealing with raw data values that may be easily converted to sequential index values. Given such data, a branch table can be extremely efficient; it usually consists of the following steps: optionally validating the input data to ensure it is acceptable; transforming the data into an offset into the branch table, this usually involves multiplying or shifting it to take into account the instruction length; and branching to an address made up of the base of the table and the generated offset: this often involves an addition of the offset onto the program counter register.

Another method of implementing a branch table is with an array of addresses from which the required address is retrieved. This method can result in smaller code, and avoids the indirect jump. The method used to implement a branch table is usually based on the architecture of the processor on which the code is to be run.

Use of branch tables and other raw data encoding was common in the early days of computing when memory was expensive, CPUs were slower and compact data representation and efficient choice of alternatives were important. Nowadays, they are commonly used in embedded programming and operating system development. In many operating systems, both system calls and library functions may be referenced by an integer index into a branch table. This can improve compatibility with subsequent software versions: if the code of a function and the address of its entry point is changed, only the branch instruction in the branch table needs to be adjusted, application software compiled against the library or for the operating system does not need modification. In addition, calling functions by number (the index into the table), can be useful for some cases when programming. In addition, some computer architectures use branch tables for dispatching interrupts.

Advantages of branch tables include: algorithmic and code efficiency (data need only be encoded once and branch table code is usually compact), and the potential to attain high data compression ratios. For example, when compressing country names to country codes, a string such as "Central African Republic" can be compressed to a single index, resulting in large savings—particularly when the string appears many times. In addition, this same index can be used to access related data, reducing storage requirements further. Disadvantages include an extra level of indirection and restrictions in some programming languages.

[edit] Example

A simple example of branch table use in the 8-bit Microchip PIC assembly language is:

    movf    INDEX,W     ; move the index value into the W (working) register from the INDEX memory location
    addwf   PCL,F       ; add it onto the program counter register (PCL). each PIC instruction is one byte
                        ; so there is no need to perform any multiplication. most architectures will transform
                        ; the index in some way before adding it to the program counter

table                   ; the branch table begins here with this label
    goto    index_zero  ; each of these goto instructions is an unconditional branch to a different section
    goto    index_one   ; of code
    goto    index_two
    goto    index_three

index_zero
    ; code is added here to perform whatever action is required when INDEX was equal to zero
    return

index_one
...