Use-define chain

From Wikipedia, the free encyclopedia

A Use-Definition Chain (UD Chain) is a data structure that consists of a use, U, of a variable and all the definitions, D, of that variable that can reach that use without any other intervening definitions. A definition can have many forms, but is generally taken to mean the assignment of some value to a variable (which is different from the use of the term that refers to the language construct involving a data type and allocating storage).

A counterpart of a UD Chain is a Definition-Use Chain (DU Chain), which consists of a definition, D, of a variable and all the uses, U, reachable from that definition without any other intervening definitions.

Both UD and DU chains are created by using a form of static code analysis known as data flow analysis. Knowing the use-def and def-use chains for a program or subprogram is a prerequisite for many compiler optimizations, including constant propagation and common subexpression elimination.

Contents

[edit] Purpose

Making the use-define or define-use chains is a step in liveness analysis, so that logical representations of all the variables can be identified and tracked through the code.

Consider the following snippet of code:

int x = 0;    /* A */
x = x + y;    /* B */
/* 1, some uses of x */
x = 35;       /* C */
/* 2, some more uses of x */

Notice that x is assigned a value at three points (marked A, B, and C). However, at the point marked "1", the use-def chain for x should indicate that its current value must have come from line B (and its value at line B must have come from line A). Contrariwise, at the point marked "2", the use-def chain for x indicates that its current value must have come from line C. Since the value of the x in block 2 does not depend on any definitions in block 1 or earlier, x might as well be a different variable there; practically speaking, it is a different variable — call it x2.

int x = 0;    /* A */
x = x + y;    /* B */
/* 1, some uses of x */
int x2 = 35;  /* C */
/* 2, some uses of x2 */

The process of splitting x into two separate variables is called live range splitting.

[edit] Setup

The list of statements determines a strong order among statements.

  • Statements are labeled using the following conventions: s(i), where i is an integer in [1,n]; and n is the number of statements in the basic block
  • Variables are identified in italic (e.g., v,u and t)
  • Every variable is assumed to have a definition in the context or scope. (In static single assignment form, use-define chains are explicit because each chain contains a single element.)

For a variable, such as v, its declaration is identified as V (italic capital letter), and for short, its declaration is identified as s(0). In general, a declaration of a variable can be in an outer scope (e.g., a global variable).

[edit] Definition of a Variable

When a variable, v, is on the LHS of an assignment statement, such as s(j), then s(j) is a definition of v. Every variable (v) has at least one definition by its declaration (V) (or initialization).

[edit] Use of a Variable

If variable, v, is on the RHS of statement s(j), there is a statement, s(i) with i < j and min(j-i), that it is a definition of v and it has a use at s(j) (or, in short, when a variable, v, is on the RHS of a statement s(j), then v has a use at statement s(j)).

[edit] Execution

Consider the sequential execution of the list of statements, s(i), and what can now be observed as the computation at statement, j:

  • A definition at statement s(i) with i < j is alive at j, if it has a use at a statement s(k) with k ≥ j. The set of alive definitions is defined at statement i as A(i) and the number of alive definitions is denoted as |A(i)|. (A(i) is a simple but powerful concept: theoretical and practical results in space complexity theory, access complexity (I/O complexity), register allocation and cache locality exploitation are based on A(i).)
  • A definition at statement s(i) kills all previous definitions (s(k) with k < i) for the same variables.

[edit] Method of building a use-def (or ud) chain

  1. Set definitions in statement s(0)
  2. For each i in [1,n], find live definitions that have use in statement s(i)
  3. Make a link among definitions and uses
  4. Set the statement s(i), as definition statement
  5. Kill previous definitions

With this algorithm, two things are accomplished:

  1. A directed acyclic graph (DAG) is created on the variable uses and definitions. The DAG specifies a data dependency among assignment statements, as well as a partial order (therefore parallelism among statements).
  2. When statement s(i) is reached, there is a list of live variable assignments. If only one assignment is live, for example, constant propagation might be used.