Aliasing (computing)

From Wikipedia, the free encyclopedia

In computing, aliasing describes a situation in which a data location in memory can be accessed through different symbolic names in the program. Thus, modifying the data through one name implicitly modifies the values associated to all aliased names, which may not be expected by the programmer. As a result, aliasing makes it particularly difficult to understand, analyze and optimize programs. Aliasing analyses intend to compute useful information for understanding aliasing in programs.

Contents

[edit] Examples

[edit] Array bounds checking

For example, the C programming language does not perform array bounds checking. One can then exploit the implementation of the programming language by the compiler, plus the computer architecture's assembly language conventions, to achieve aliasing effects.

If an array is created on the stack, with a variable laid out in memory directly beside that array, one could index outside that array and then directly change that variable by changing the relevant array element. For example, if we have an int array of size ten (for this example's sake, calling it vector), next to another int variable (call it i), vector[10] would be aliased to i if they are adjacent in memory.

This is possible in some implementations of C because an array is in reality a pointer to some location in memory, and array elements are merely offsets off that memory location. Since C has no bounds checking, indexing and addressing outside of the array is possible. Note that the aforementioned aliasing behaviour is implementation specific. Some implementations may leave space between arrays and variables on the stack, for instance, to minimize possible aliasing effects. C programming language specifications do not specify how data is to be laid out in memory (ISO/IEC 9899:1999, section 6.2.6.1).

It is not erroneous for a compiler to omit aliasing effects for accesses that fall outside the bounds of an array, as having a pointer even pointing out of range invokes undefined behavior.

[edit] Aliased pointers

Another variety of aliasing can occur in any language that can refer to one location in memory with more than one name (for example, with pointers). See the C example of the xor swap algorithm that is a function; it assumes the two pointers passed to it are distinct, but if they are in fact equal (or aliases of each other), the function fails. This is a common problem with functions that accept pointer arguments, and their tolerance (or the lack thereof) for aliasing must be carefully documented, particularly for functions that perform complex manipulations on memory areas passed to them.

[edit] Specified aliasing

Controlled aliasing behaviour may be desirable in some cases (that is, aliasing behaviour that is specified, unlike that relevant to memory layout in C). It is common practice in Fortran. The Perl programming language specifies, in some constructs, aliasing behaviour, such as in foreach loops. This allows certain data structures to be modified directly with less code. For example,

my @array = (1, 2, 3);

foreach my $element (@array) {
   # Increment $element, thus automatically
   # modifying @array, since $element is aliased
   # to each of @array's elements in turn.
   $element++;
}

print "@array \n";

will print out "2 3 4" as a result. If one wanted to bypass aliasing effects, one could copy the contents of the index variable into another and change the copy.

[edit] Conflicts with optimization

Many times optimizers have to make conservative assumptions about variables in the presence of pointers. For example, a constant propagation process which knows that the value of variable x is 5 would not be able to keep using this information after an assignment to another variable (for example, *y = 10) because it could be that *y is an alias of x. This could be the case after an assignment like y = &x.

As an effect of the assignment to *y, the value of x would be changed as well, so propagating the information that x is 5 to the statements following *y = 10 would be potentially wrong (if *y is indeed an alias of x). However, if we have information about pointers, the constant propagation process could make a query like: can x be an alias of *y? Then, if the answer is no, x = 5 can be propagated safely.

Another optimization that is impacted by aliasing is code reordering; if the compiler decides that x is not an alias of *y, then code that uses or changes the value of x can be moved before the assignment *y = 10, if this would improve scheduling or enable more loop optimizations to be carried out.

In order to enable such optimizations to be carried out in a predictable manner, the ISO standard for the C programming language (including its newer C99 edition) specifies that it is illegal (with some exceptions) for pointers of different types to reference the same memory location. This rule, known as "strict aliasing", allows impressive increases in performance[citation needed], but has been known to break some otherwise valid code. Several software projects intentionally violate this portion of the C99 standard. For example, Python does so in order to implement reference counting[1]. and the Linux kernel does so because it causes problems with optimization of inlined code[2]. In such cases, when compiled with gcc, the option -fno-strict-aliasing is invoked in order to prevent unwanted or invalid optimizations that could produce incorrect code.

[edit] External links

[edit] See also

  • See aliasing for uses of the word when applied to signal processing, including computer graphics.

[edit] Notes

Languages