Dead code elimination

In compiler theory, dead code elimination (also known as dead code removal, dead code stripping, or dead code strip) is a compiler optimization to remove code which does not affect the program results. Removing such code has two benefits: it shrinks program size, an important consideration in some contexts, and it allows the running program to avoid executing irrelevant operations, which reduces its running time. Dead code includes code that can never be executed (unreachable code), and code that only affects dead variables, that is, variables that are irrelevant to the program.

Examples

Consider the following example written in C.

 int foo(void)
 {
   int a = 24;
   int b = 25; /* Assignment to dead variable */
   int c;
   c = a << 2;
   return c;
   b = 24; /* Unreachable code */
   return 0;
 }

Simple analysis of the uses of values would show that the value of b after the first assignment is not used inside foo. Furthermore, b is declared as a local variable inside foo, so its value cannot be used outside foo. Thus, the variable b is dead and an optimizer can reclaim its storage space and eliminate its initialization.

Furthermore, because the first return statement is executed unconditionally, no feasible execution path reaches the second assignment to b. Thus, the assignment is unreachable and can be removed. If the procedure had a more complex control flow, such as a label after the return statement and a goto elsewhere in the procedure, then a feasible execution path might exist to the assignment to b.
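The effect of a label after the return statement can be sketched as follows. This is a minimal illustration, not part of the original example: the function name foo_with_goto, the parameter take_jump, and the label after_return are all hypothetical.

```c
/* Hypothetical variant of foo: the label makes the code after the
   first return reachable through the goto. */
int foo_with_goto(int take_jump)
{
    int b = 25;

    if (take_jump)
        goto after_return;   /* jumps past the first return */

    return b;

after_return:
    b = 24;                  /* reachable, so it is not unreachable code */
    return b;
}
```

Here the assignment b = 24 lies on a feasible path whenever take_jump is nonzero, so an optimizer must keep it.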

Also, even though some calculations are performed in the function, their values are not stored in locations accessible outside the scope of this function. Furthermore, because the function always returns the same constant value (24 << 2, which is 96), its body may be simplified to that value (this simplification is called constant folding).
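The two steps described above might leave foo looking roughly like the following. The function names foo_after_dce and foo_after_folding are hypothetical and exist only so that both stages can be shown side by side; a real optimizer transforms the function in place and may order its passes differently.

```c
/* After dead code elimination: the dead variable b and the
   unreachable statements after the first return are gone. */
int foo_after_dce(void)
{
    int a = 24;
    int c;
    c = a << 2;
    return c;
}

/* After the constant 24 is propagated into the shift and the
   expression 24 << 2 is folded at compile time. */
int foo_after_folding(void)
{
    return 96;
}
```

Both versions return the same value as the original foo, which is the property every such transformation must preserve.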

Most advanced compilers have options to activate dead code elimination, sometimes at varying levels. A lower level might remove only instructions that cannot be executed. A higher level might also avoid reserving space for unused variables. A still higher level might identify instructions or functions that serve no purpose and eliminate them.

A common use of dead code elimination is as an alternative to optional code inclusion via a preprocessor. Consider the following code.

 #include <stdio.h>
 
 int main(void) {
   int a = 5;
   int b = 6;
   int c;
   c = a * (b >> 1);
   if (0) {   /* DEBUG */
     printf("%d\n", c);
   }
   return c;
 }

Because the expression 0 will always evaluate to false, the code inside the if statement can never be executed, and dead code elimination would remove it entirely from the optimized program. This technique is common in debugging to optionally activate blocks of code; using an optimizer with dead code elimination eliminates the need for using a preprocessor to perform the same task.
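The two approaches can be compared in a short sketch. The names DEBUG, ENABLE_DEBUG_OUTPUT, compute, and compute_pp are assumptions made for this illustration and do not come from any particular code base.

```c
#include <stdio.h>

/* Hypothetical switch: an ordinary compile-time constant. */
enum { DEBUG = 0 };

/* Variant relying on dead code elimination. */
int compute(void)
{
    int a = 5;
    int b = 6;
    int c = a * (b >> 1);

    if (DEBUG) {
        /* Still parsed and type-checked, but removable as dead
           code because DEBUG is the constant 0. */
        printf("%d\n", c);
    }
    return c;
}

/* Variant relying on the preprocessor: the guarded line is dropped
   before the compiler proper ever sees it. */
int compute_pp(void)
{
    int a = 5;
    int b = 6;
    int c = a * (b >> 1);
#ifdef ENABLE_DEBUG_OUTPUT
    printf("%d\n", c);
#endif
    return c;
}
```

One practical difference is that the code inside if (DEBUG) is still checked by the compiler even when it is disabled, so it is less likely to go stale than code hidden behind a preprocessor conditional.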

In practice, much of the dead code that an optimizer finds is created by other transformations in the optimizer. For example, the classic techniques for operator strength reduction insert new computations into the code and render the older, more expensive computations dead.[1] Subsequent dead code elimination removes those calculations and completes the effect (without complicating the strength-reduction algorithm).
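The interaction with strength reduction can be illustrated with a hand-written sketch of the three stages. The function names and the loop itself are hypothetical; a real optimizer works on an intermediate representation rather than on C source.

```c
/* Before: a multiplication on every iteration. */
int sum_before(int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        int offset = i * 4;
        sum += offset;
    }
    return sum;
}

/* After strength reduction: a new variable t is updated by addition
   and replaces the use of offset. The multiplication is still
   present, but nothing depends on its result any more. */
int sum_reduced(int n)
{
    int sum = 0;
    int t = 0;                 /* always equal to i * 4 */
    for (int i = 0; i < n; i++) {
        int offset = i * 4;    /* now dead */
        (void)offset;          /* only silences compiler warnings */
        sum += t;
        t += 4;
    }
    return sum;
}

/* After dead code elimination: the multiplication is removed. */
int sum_after_dce(int n)
{
    int sum = 0;
    int t = 0;
    for (int i = 0; i < n; i++) {
        sum += t;
        t += 4;
    }
    return sum;
}
```

Leaving the cleanup to a separate pass means the strength-reduction step only has to introduce t and redirect the use, without tracking which older computations it has made redundant.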

Historically, dead code elimination was performed using information derived from data-flow analysis.[2] An algorithm based on static single assignment form appears in the original journal article on SSA form by Cytron et al.[3] Shillingsburg improved on the algorithm and developed a companion algorithm for removing useless control-flow operations.[4]
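The general idea behind SSA-based elimination can be sketched as a mark phase over a toy instruction list. This is a simplified illustration under stated assumptions, not the published algorithm: it handles straight-line code only and ignores control dependence, and the Instr type, the mark_live function, and the constants NO_OPERAND and MAX_INSTRS are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum { NO_OPERAND = -1, MAX_INSTRS = 64 };

/* Hypothetical SSA-like instruction: instruction i defines value i,
   and each operand is the index of the instruction that defines it. */
typedef struct {
    int src1;
    int src2;
    bool critical;  /* effect visible outside, e.g. a return or store */
} Instr;

/* Mark phase: sets live[i] for every instruction that some critical
   instruction depends on, directly or indirectly. Returns the number
   of live instructions, or 0 if n exceeds MAX_INSTRS. */
size_t mark_live(const Instr *code, size_t n, bool *live)
{
    size_t worklist[MAX_INSTRS];
    size_t top = 0, count = 0;

    if (n > MAX_INSTRS)
        return 0;

    /* Critical instructions are live by definition. */
    for (size_t i = 0; i < n; i++) {
        live[i] = code[i].critical;
        if (live[i])
            worklist[top++] = i;
    }

    /* Follow operands backwards; each instruction is pushed at most
       once, so the worklist never holds more than n entries. */
    while (top > 0) {
        const Instr *ins = &code[worklist[--top]];
        int ops[2] = { ins->src1, ins->src2 };
        count++;
        for (int k = 0; k < 2; k++) {
            int d = ops[k];
            if (d != NO_OPERAND && (size_t)d < n && !live[d]) {
                live[d] = true;
                worklist[top++] = (size_t)d;
            }
        }
    }
    return count;
}
```

A sweep phase would then delete every instruction whose live flag is false. Applied to an encoding of foo (a = 24, b = 25, c = a << 2, return c), only the definition of b remains unmarked.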

Dynamic dead code elimination

Dead code is normally considered dead unconditionally. It is therefore reasonable to attempt to remove it through dead code elimination at compile time.

In practice, however, code sections are often dead or unreachable only under certain conditions, which may not be known at compile time. Such conditions may be imposed by different runtime environments (for example, different versions of an operating system, or different sets and combinations of drivers or services loaded in a particular target environment). Each environment may require its own set of special cases in the code, and those special cases become conditionally dead code in every other environment. Also, the software (for example, a driver or resident service) may be configurable to include or exclude certain features depending on user preferences, rendering the unused code portions useless in a particular scenario. While modular software may be developed to load libraries dynamically on demand, in most cases it is not possible to load only the relevant routines from a particular library. Even where this is supported, a routine may still include code sections that are dead in a given scenario but could not be ruled out at compile time.
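Conditionally dead code of this kind might look like the following sketch. The struct config, its fields, the version threshold, and the function translate_key are hypothetical and chosen only for illustration.

```c
/* Hypothetical runtime configuration, known only once the program
   has started. */
struct config {
    int os_major;      /* version reported by the host system */
    int use_extended;  /* user preference */
};

int translate_key(const struct config *cfg, int scancode)
{
    int code = scancode;

    if (cfg->os_major < 5) {
        /* Special case for older systems: dead whenever
           os_major >= 5, but a compiler cannot know that. */
        code &= 0x7F;
    }
    if (cfg->use_extended) {
        /* Optional feature: dead whenever the user disables it. */
        code += 0x100;
    }
    return code;
}
```

A compiler has to keep both branches because either may be needed. Only at load time or run time, once cfg is fixed, does it become clear which of them can never execute in that particular installation.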

The techniques used to dynamically detect demand, identify and resolve dependencies, remove conditionally dead code, and recombine the remaining code at load or runtime are called dynamic dead code elimination.

Most programming languages, compilers, and operating systems offer little or no support beyond dynamic loading of libraries and late linking, so software that uses dynamic dead code elimination is very rare.[5][6]

References

  1. Allen, Frances; Cocke, John; Kennedy, Ken (1981). Reduction of Operator Strength. In Muchnick and Jones (eds.), Program Flow Analysis. Prentice-Hall.
  2. Kennedy, Ken (1981). A Survey of Data-flow Analysis Techniques. In Muchnick and Jones (eds.), Program Flow Analysis. Prentice-Hall.
  3. Cytron, Ron; Ferrante, Jeanne; Rosen, Barry; Wegman, Mark; Zadeck, Ken (1991). Efficiently Computing Static Single Assignment Form and the Control Dependence Graph. ACM TOPLAS 13(4).
  4. Cooper, Keith D.; Torczon, Linda (2003). Engineering a Compiler. Morgan Kaufmann. pp. 498ff.
  5. Paul, Matthias; Frinke, Axel C. (1997-10-13) [first published 1991], FreeKEYB - Enhanced DOS keyboard and console driver (User Manual) (v6.5 ed.) (NB. FreeKEYB is a Unicode-based dynamically configurable successor of K3PLUS supporting most keyboard layouts, code pages, and country codes. Utilizing an off-the-shelf macro assembler as well as a framework of automatic pre- and post-processing analysis tools to generate dependency and code morphing meta data to be embedded into the executable file alongside the binary code and a self-discarding, relaxing and relocating loader, the driver implements byte-level granular dynamic dead code elimination and relocation techniques at load-time as well as self-modifying code and reconfigurability at run-time to minimize its memory footprint close to the canonical form depending on the hardware, operating system, and driver configuration as well as the selected feature set and locale (about sixty configuration switches with hundreds of options for an almost unlimited number of possible combinations). K3PLUS was an extended keyboard driver for DOS widely distributed in Germany at its time, with adaptations to a handful of other European languages available. It supported a sub-set of features already, but did not implement dynamic dead code elimination.)
  6. Paul, Matthias; Frinke, Axel C. (2006-01-16), FreeKEYB - Advanced international DOS keyboard and console driver (User Manual) (v7 preliminary ed.)
