Undefined behavior

From Wikipedia, the free encyclopedia

In computer science, undefined behavior is a feature of some programming languages — most famously C. In these languages, to simplify the specification and allow some flexibility in implementation, the specification leaves the results of certain operations specifically undefined.

For example, in C the use of any variable before it has been initialized yields undefined behavior, as do division by zero and indexing an array outside of its defined bounds (see Buffer overflow). This specifically frees the compiler to do whatever is easiest or most efficient, should such a program be entered. In general, any behavior whatsoever is valid after a situation resulting in undefined behavior has been encountered. In particular, it is never required that the compiler diagnose undefined behavior — therefore, programs invoking undefined behavior may appear to compile and even run without errors at first, only to fail on another system, or even on another date. When an instance of undefined behavior occurs, so far as the language specification is concerned anything could happen, ranging from a "reasonable" action all the way to a system crash.

On the other hand, in some languages (including C), even the compiler is not bound to behave in a sensible manner once undefined behavior has been invoked. One famous piece of hacker lore is the behavior of version 1.34 of the GCC C compiler when given a program containing the #pragma directive, which has an implementation-defined behavior according to the C standard. (It should be noted here that "implementation-defined" is more restrictive than "undefined", requiring the implementation to document what it does.) In practice, many C implementations recognize, for example, #pragma once as a rough equivalent of include guards — but GCC, upon finding a #pragma directive, would instead attempt to start Emacs running a simulation of the Towers of Hanoi. Failing that, it would try to launch other commonly distributed Unix games such as NetHack and Rogue.

Under some circumstances there can be specific restrictions on undefined behavior. For example, the instruction set architecture of a CPU might leave the behavior of some forms of an instruction undefined, but if the CPU supports memory protection then the architecture specification will probably include a blanket rule stating that no user-accessible instruction may cause a hole in the operating system's security; so an implementation of the architecture would be permitted to corrupt all user registers in response to such an instruction but would not be allowed to, for example, switch into supervisor mode.

[edit] See also

[edit] External links

[edit] References