Instruction scheduling
From Wikipedia, the free encyclopedia
In computer science, instruction scheduling is a compiler optimization used to improve instruction-level parallelism, which improves performance on machines with instruction pipelines. Put more simply, without changing the meaning of the code, it tries to:
- Avoid pipeline stalls by rearranging the order of instructions.
- Avoid illegal or semantically ambiguous operations (typically involving subtle instruction pipeline timing issues or non-interlocked resources).
Pipeline stalls can be caused by structural hazards (processor resource limits), data hazards (the output of one instruction is needed by another instruction), and control hazards (branching).
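As a rough illustration of how reordering avoids data-hazard stalls, the sketch below counts stall cycles for two orderings of the same four instructions. It assumes a deliberately simplified machine (single issue, in order, with a made-up load latency of 3 cycles); the `Instr` type, the `count_stalls` function, and the register names are hypothetical and do not model any real processor.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    text: str
    dest: Optional[str]      # register written, or None
    srcs: Tuple[str, ...]    # registers read
    latency: int             # assumed cycles from issue until dest may be read


def count_stalls(block):
    """Count stall cycles on a hypothetical single-issue, in-order pipeline."""
    ready = {}    # register -> first cycle in which its value may be read
    cycle = 0     # cycle in which the next instruction issues if nothing stalls
    stalls = 0
    for ins in block:
        # An instruction cannot issue before all of its operands are ready.
        issue = max([cycle] + [ready.get(r, 0) for r in ins.srcs])
        stalls += issue - cycle
        if ins.dest is not None:
            ready[ins.dest] = issue + ins.latency
        cycle = issue + 1
    return stalls


load_a = Instr("r1 = load a", "r1", (), 3)
add_a = Instr("r2 = r1 + r1", "r2", ("r1",), 1)
load_b = Instr("r3 = load b", "r3", (), 3)
add_b = Instr("r4 = r3 + r3", "r4", ("r3",), 1)

original = [load_a, add_a, load_b, add_b]
reordered = [load_a, load_b, add_a, add_b]   # both loads hoisted to the top

print(count_stalls(original), count_stalls(reordered))
```

Under these assumptions the original order stalls for 4 cycles and the reordered one for 1, because the second load fills part of the time the first add would otherwise spend waiting.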
Data hazards
Instruction scheduling is typically done on a single basic block. In order to determine whether rearranging the block's instructions in a certain way preserves the behavior of that block, we need the concept of a data dependency. Three types of dependencies constrain the order of instructions, and they also happen to be the three data hazards; a fourth relation, read after read, is listed for completeness:
- Read after Write (RAW or "True"): Instruction 1 writes a value used later by Instruction 2. Instruction 1 must come first, or Instruction 2 will read the old value instead of the new.
- Write after Read (WAR or "Anti"): Instruction 1 reads a location that is later overwritten by Instruction 2. Instruction 1 must come first, or it will read the new value instead of the old.
- Write after Write (WAW or "Output"): Two instructions both write the same location. They must occur in their original order.
- Read after Read (RAR or "Input"): Both instructions read the same location. Input dependence does not constrain the execution order of the two instructions, but it is useful in scalar replacement of array elements.
To make sure we respect these three types of dependencies, we construct a dependency graph, which is a directed graph where each vertex is an instruction and there is an edge from I1 to I2 if I1 must come before I2 due to a dependency. Then, any topological sort of this graph is a valid instruction schedule.
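The construction above can be sketched as follows. This is a minimal illustration, not a production scheduler: the `Instr` type and the function names are hypothetical, each instruction simply lists the locations it reads and writes, and the topological sort is Kahn's algorithm with an arbitrary choice among ready instructions (a real scheduler would choose by priority, for example by latency).

```python
from collections import defaultdict
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    text: str
    writes: Tuple[str, ...]
    reads: Tuple[str, ...]


def build_dependency_graph(block):
    """Return {i: {(j, kind), ...}}, meaning instruction i must precede j."""
    edges = defaultdict(set)
    for j, later in enumerate(block):
        for i in range(j):
            earlier = block[i]
            if set(earlier.writes) & set(later.reads):
                edges[i].add((j, "RAW"))
            if set(earlier.reads) & set(later.writes):
                edges[i].add((j, "WAR"))
            if set(earlier.writes) & set(later.writes):
                edges[i].add((j, "WAW"))
            # Read after read adds no edge: it does not constrain the order.
    return edges


def topological_order(n, edges):
    """Kahn's algorithm; any order it returns respects every edge."""
    succs = {i: {j for j, _ in edges.get(i, ())} for i in range(n)}
    indegree = [0] * n
    for i in range(n):
        for j in succs[i]:
            indegree[j] += 1
    ready = [i for i in range(n) if indegree[i] == 0]
    order = []
    while ready:
        i = ready.pop()          # arbitrary pick among ready instructions
        order.append(i)
        for j in succs[i]:
            indegree[j] -= 1
            if indegree[j] == 0:
                ready.append(j)
    return order


def is_valid_schedule(order, edges):
    """Check that every dependency edge points forward in the given order."""
    pos = {instr: k for k, instr in enumerate(order)}
    return all(pos[i] < pos[j] for i in edges for j, _ in edges[i])


block = [
    Instr("r1 = load a", ("r1",), ("a",)),
    Instr("r2 = r1 + 1", ("r2",), ("r1",)),
    Instr("r3 = load b", ("r3",), ("b",)),
    Instr("r4 = r3 + r2", ("r4",), ("r3", "r2")),
]
edges = build_dependency_graph(block)
order = topological_order(len(block), edges)
print([block[i].text for i in order])
```

The pairwise comparison is conservative: it may add edges that are already implied by others, but it never omits a required one, so every order it accepts keeps conflicting instructions in their original relative order.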
The phase order of instruction scheduling
Instruction scheduling may be done before register allocation, after it, or both. The advantage of scheduling before register allocation is that it exposes the maximum parallelism. The disadvantage is that it can lengthen the lifetimes of values, so that the register allocator needs more registers than are available. This causes spill/fill code to be introduced, which reduces the performance of the section of code in question.
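The register-pressure effect can be illustrated with a small sketch that computes the peak number of simultaneously live values for two orderings of the same code. The `max_live` function, the `(dest, srcs)` tuple encoding, and the virtual register names are assumptions made for this example; the liveness computation is the usual backward scan over a single basic block.

```python
def max_live(block, live_out=()):
    """Peak number of simultaneously live values in a basic block.

    block is a list of (dest, srcs) pairs; dest is None for instructions
    such as stores that define no register.
    """
    live = set(live_out)
    peak = len(live)
    for dest, srcs in reversed(block):
        live.discard(dest)     # the value is not live before its definition
        live.update(srcs)      # operands are live just before their use
        peak = max(peak, len(live))
    return peak


# Each value is computed and consumed before the next one is loaded.
sequential = [
    ("v1", ()),            # v1 = load a
    ("v2", ("v1",)),       # v2 = v1 + 1
    (None, ("v2",)),       # store v2
    ("v3", ()),            # v3 = load b
    ("v4", ("v3",)),       # v4 = v3 + 1
    (None, ("v4",)),       # store v4
]

# Both loads hoisted to the top to hide their latency.
hoisted = [
    ("v1", ()),
    ("v3", ()),
    ("v2", ("v1",)),
    ("v4", ("v3",)),
    (None, ("v2",)),
    (None, ("v4",)),
]

print(max_live(sequential), max_live(hoisted))
```

In this example the sequential order needs one register at a time, while the hoisted order keeps two values live at once; on a larger block the same effect can push the allocator past the number of physical registers.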
If the target architecture has instruction sequences that are potentially illegal in combination (due to a lack of instruction interlocks), the instructions must be scheduled after register allocation. This second scheduling pass will also improve the placement of the spill/fill code.
If scheduling is done only after register allocation, the false dependencies introduced by register allocation (write-after-read and write-after-write dependencies that arise when unrelated values share a physical register) will limit how far the scheduler can move instructions.
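To make the false dependencies concrete, the sketch below compares the same four instructions before allocation (virtual registers `v1` to `v4`) and after a hypothetical allocation that reuses `r1` for two unrelated values. The function names and the `(dest, srcs)` encoding are assumptions for illustration only.

```python
def dependencies(block):
    """Return a set of (i, j, kind) register dependencies for a basic block.

    block is a list of (dest, srcs) pairs, one per instruction.
    """
    deps = set()
    last_write = {}          # register -> index of its most recent writer
    reads_since_write = {}   # register -> readers since that write
    for j, (dest, srcs) in enumerate(block):
        for r in srcs:
            if r in last_write:
                deps.add((last_write[r], j, "RAW"))
            reads_since_write.setdefault(r, []).append(j)
        for i in reads_since_write.get(dest, []):
            if i != j:
                deps.add((i, j, "WAR"))
        if dest in last_write:
            deps.add((last_write[dest], j, "WAW"))
        last_write[dest] = j
        reads_since_write[dest] = []
    return deps


def false_dependencies(block):
    """WAR and WAW dependencies: caused by name reuse, not by data flow."""
    return {d for d in dependencies(block) if d[2] in ("WAR", "WAW")}


# Before register allocation: every value has its own virtual register.
virtual = [
    ("v1", ()),          # v1 = load a
    ("v2", ("v1",)),     # v2 = v1 + 1
    ("v3", ()),          # v3 = load b
    ("v4", ("v3",)),     # v4 = v3 + 2
]

# After a hypothetical allocation that reuses r1 for the second load.
physical = [
    ("r1", ()),          # r1 = load a
    ("r2", ("r1",)),     # r2 = r1 + 1
    ("r1", ()),          # r1 = load b   (reuses r1)
    ("r3", ("r1",)),     # r3 = r1 + 2
]

print(sorted(false_dependencies(virtual)))
print(sorted(false_dependencies(physical)))
```

With virtual registers the two load/add pairs are independent and can be interleaved freely. After the reuse of `r1`, the second load must wait for both the first load (WAW) and the first add (WAR), which forces the original order.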
Types of instruction scheduling
There are several types of instruction scheduling:
- Basic block scheduling: instructions cannot move across basic block boundaries.
- Global scheduling: instructions can move across basic block boundaries.
- Modulo scheduling: a technique for generating software pipelining, which is a form of instruction scheduling that interleaves instructions from different iterations of a loop.