A Use-Definition Chain (UD Chain) is a data structure that consists of a use, U, of a variable, and all the definitions, D, of that variable that can reach that use without any other intervening definitions. A definition can have many forms, but is generally taken to mean the assignment of some value to a variable (which is different from the use of the term that refers to the language construct involving a data type and allocating storage).
A counterpart of a UD Chain is a Definition-Use Chain (DU Chain), which consists of a definition, D, of a variable and all the uses, U, reachable from that definition without any other intervening definitions.
Both UD and DU chains are created by using a form of static code analysis known as data flow analysis. Knowing the use-def and def-use chains for a program or subprogram is a prerequisite for many compiler optimizations, including constant propagation and common subexpression elimination.
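The relationship between the two kinds of chain can be sketched in a few lines of Java. This is only an illustration, not a compiler's actual representation: the class name ChainSketch, the method invert, and the use of plain string labels for program points are all assumptions made for this example. It shows that def-use chains are the inverse mapping of use-def chains.

```java
import java.util.*;

// Hypothetical sketch: uses and definitions are identified by string labels.
class ChainSketch {
    /**
     * Inverts use-def chains (use -> definitions that reach it)
     * into def-use chains (definition -> uses it reaches).
     */
    static Map<String, Set<String>> invert(Map<String, Set<String>> useDef) {
        Map<String, Set<String>> defUse = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : useDef.entrySet()) {
            for (String def : entry.getValue()) {
                // Record that this definition reaches the current use.
                defUse.computeIfAbsent(def, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return defUse;
    }
}
```

A real analysis would first have to compute the use-def map by data flow analysis; here it is simply taken as input.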
Making the use-define or define-use chains is a step in liveness analysis, so that logical representations of all the variables can be identified and tracked through the code.
Consider the following snippet of code:
int x = 0; /* A */
x = x + y; /* B */
/* 1, some uses of x */
x = 35; /* C */
/* 2, some more uses of x */
Notice that x is assigned a value at three points (marked A, B, and C). However, at the point marked "1", the use-def chain for x should indicate that its current value must have come from line B (and its value at line B must have come from line A). Contrariwise, at the point marked "2", the use-def chain for x indicates that its current value must have come from line C. Since the value of the x in block 2 does not depend on any definitions in block 1 or earlier, x might as well be a different variable there; practically speaking, it is a different variable, call it x2.
int x = 0; /* A */
x = x + y; /* B */
/* 1, some uses of x */
int x2 = 35; /* C */
/* 2, some uses of x2 */
The process of splitting x into two separate variables is called live range splitting. See also static single assignment form.
The list of statements determines a strict order among the statements.
For a variable v, its declaration is written V (italic capital letter) and, for short, is identified with statement s(0). In general, a declaration of a variable can be in an outer scope (e.g., a global variable).
When a variable v is on the LHS of an assignment statement s(j), then s(j) is a definition of v. Every variable v has at least one definition, given by its declaration V (or initialization).
If a variable v is on the RHS of statement s(j), then v has a use at s(j). The definition that reaches this use is the statement s(i) with i < j and minimal j - i that is a definition of v.
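For a straight-line list of statements, the rule "i < j with minimal j - i" can be sketched as a backward scan. The names StraightLineUseDef, Stmt, and reachingDefinition are hypothetical, the statement model is a simplifying assumption, and the sketch does not handle branches or loops. Indices are zero-based positions in the list.

```java
import java.util.*;

class StraightLineUseDef {
    /** Hypothetical model of a statement s(j): the variable it assigns (or null) and the variables it reads. */
    static final class Stmt {
        final String lhs;
        final List<String> rhs;
        Stmt(String lhs, String... rhs) {
            this.lhs = lhs;
            this.rhs = Arrays.asList(rhs);
        }
    }

    /**
     * Returns the index i of the definition of v that reaches a use at s(j):
     * the largest i < j whose LHS is v, or -1 if no such statement is in the list.
     */
    static int reachingDefinition(List<Stmt> stmts, int j, String v) {
        for (int i = j - 1; i >= 0; i--) {
            if (v.equals(stmts.get(i).lhs)) {
                return i; // nearest preceding definition, so j - i is minimal
            }
        }
        return -1;
    }
}
```

Applied to the earlier snippet (A, B, uses, C, more uses), a use of x after statement B resolves to B, and a use after statement C resolves to C.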
Consider the sequential execution of the list of statements, s(i), and what can now be observed as the computation at statement, j:
This example is based on a Java method for finding the greatest common divisor (gcd, named ggt in the code). It is not important to understand what function this code computes.
int ggt(int a, int b){
    int c = a;
    int d = b;
    if(c == 0)
        return d;
    while(d != 0){
        if(c > d)
            c = c - d;
        else
            d = d - c;
    }
    return c;
}
To find all def-use chains for variable d, do the following steps:
1. Search for the first time the variable is defined (write access).
In this case it is "d=b" (l.3)
2. Search for the first time the variable is read.
In this case it is "return d"
3. Write down this information in the following style:
[name of the variable you are creating a def-use-chain for, the concrete write access, the concrete read access]
In this case it is:
[d, d=b, return d]
Repeat these steps in the following style:
Combine each write access with each read access it can reach (but NOT the other way round)
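The result should be:
[d, d=b, return d]
[d, d=b, while(d!=0)]
[d, d=b, if(c>d)]
[d, d=b, c=c-d]
[d, d=b, d=d-c]
[d, d=d-c, while(d!=0)]
[d, d=d-c, if(c>d)]
[d, d=d-c, c=c-d]
[d, d=d-c, d=d-c]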
You have to take care if the variable is changed over time.
For example: from line 3 down to line 9, "d" is not redefined.
At line 10, "d" may be redefined, which is why this write access on "d" has to be combined with every read access that it can reach.
In this case, only the code from line 6 onward is relevant. Line 3, for example, cannot be reached again.
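The manual procedure can be cross-checked with a small reachability search over the control-flow graph of ggt. The sketch below is an illustration under assumptions rather than a general analysis: the graph is built by hand and keyed by the line numbers used in the text (line 3 is "d = b", line 10 is "d = d - c"), and the names DefUseChainsForD, SUCC, DEFS, USES, and usesReachedBy are hypothetical. Starting from a definition, it follows successors and stops at any line that redefines d.

```java
import java.util.*;

class DefUseChainsForD {
    // Hand-built control-flow graph of ggt: line number -> successor lines (an assumption of this sketch).
    static final Map<Integer, int[]> SUCC = new HashMap<>();
    // Lines that write d (d = b, d = d - c) and lines that read d.
    static final Set<Integer> DEFS = new TreeSet<>(Arrays.asList(3, 10));
    static final Set<Integer> USES = new TreeSet<>(Arrays.asList(5, 6, 7, 8, 10));

    static {
        SUCC.put(2, new int[]{3});
        SUCC.put(3, new int[]{4});
        SUCC.put(4, new int[]{5, 6});   // if(c == 0): return d, or fall through to the loop
        SUCC.put(5, new int[]{});       // return d
        SUCC.put(6, new int[]{7, 12});  // while(d != 0): loop body, or exit
        SUCC.put(7, new int[]{8, 10});  // if(c > d)
        SUCC.put(8, new int[]{6});      // c = c - d, back to the loop test
        SUCC.put(10, new int[]{6});     // d = d - c, back to the loop test
        SUCC.put(12, new int[]{});      // return c
    }

    /** Lines reading d that are reachable from defLine without passing another definition of d. */
    static Set<Integer> usesReachedBy(int defLine) {
        Set<Integer> reached = new TreeSet<>();
        Set<Integer> visited = new HashSet<>();
        Deque<Integer> work = new ArrayDeque<>();
        for (int s : SUCC.get(defLine)) work.push(s);
        while (!work.isEmpty()) {
            int n = work.pop();
            if (!visited.add(n)) continue;          // already explored
            if (USES.contains(n)) reached.add(n);   // a read on a line happens before any write on it
            if (DEFS.contains(n)) continue;         // a redefinition ends the def-clear path
            for (int s : SUCC.get(n)) work.push(s);
        }
        return reached;
    }

    public static void main(String[] args) {
        for (int def : DEFS) {
            System.out.println("definition at line " + def + " reaches uses at lines " + usesReachedBy(def));
        }
    }
}
```

Each pair (definition line, reached use line) corresponds to one def-use chain entry for d. A production compiler would typically compute this with an iterative reaching-definitions analysis over basic blocks instead of one search per definition.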
For your understanding, you can imagine 2 different variables "d":
With this algorithm, two things are accomplished: