Type punning

From Wikipedia, the free encyclopedia

In computer science, type punning is a common term for any programming technique that subverts or circumvents the type system of a programming language in order to achieve an effect that would be difficult or impossible to achieve within the bounds of the formal language.

In C and C++, constructs such as type conversion, union, and reinterpret_cast permit many kinds of type punning, although some kinds are not actually supported by the standard language. For example, reading from a union member other than the one most recently written is undefined behavior in C++, and in C it is explicitly described as reinterpreting the stored bytes only since Technical Corrigendum 3 to C99; in practice, most compilers permit this form of type punning. (See the floating-point example below.)

Sockets example

One classic example of type punning is found in the Berkeley sockets interface. The function to bind an opened but uninitialized socket to an IP address is declared as follows:

int bind(int sockfd, const struct sockaddr *my_addr, socklen_t addrlen);

The bind function is usually called as follows:

struct sockaddr_in sa = {0};
int sockfd = ...;
sa.sin_family = AF_INET;
sa.sin_port = htons(port);
bind(sockfd, (struct sockaddr *)&sa, sizeof sa);

The Berkeley sockets library fundamentally relies on the fact that in C, a pointer to struct sockaddr_in can be converted with a cast to a pointer to struct sockaddr; and, in addition, that the two structure types lay out their leading address-family field identically. Therefore, a reference to the structure field my_addr->sa_family (where my_addr is of type struct sockaddr*) will actually refer to the field sa.sin_family (where sa is of type struct sockaddr_in). In other words, the sockets library uses type punning to implement a rudimentary form of inheritance.
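
The shared-prefix idea can be sketched in isolation. The types generic_addr and inet_like_addr and the function family_of are hypothetical names invented for this illustration, not part of the sockets API. Unlike the real struct sockaddr_in, which merely repeats the family field at the same offset, this sketch embeds the generic structure as its first member, a variant for which C guarantees that the pointer conversion is valid.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; they are not part of the sockets API. */
struct generic_addr {            /* plays the role of struct sockaddr */
    unsigned short family;
};

struct inet_like_addr {          /* plays the role of struct sockaddr_in */
    struct generic_addr base;    /* shared prefix, placed first */
    unsigned short port;
};

/* The generic view is only valid because the prefix sits at offset 0. */
static_assert(offsetof(struct inet_like_addr, base) == 0,
              "base must be the first member");

/* Accepts any address type through the generic pointer and reads
   only the shared field, much as bind inspects the address family. */
int family_of(const struct generic_addr *addr) {
    return addr->family;
}
```

A caller holding a struct inet_like_addr named a would pass either (const struct generic_addr *)&a, mirroring the cast in the bind call, or &a.base, which needs no cast at all.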

Floating-point example

Not all examples of type punning involve structures, as the previous example did. Suppose we want to determine whether a floating-point number is negative. We could write:

#include <stdbool.h>

bool is_negative(float x) {
    return x < 0.0f;  /* false for -0.0 and for NaN */
}

However, supposing that floating-point comparisons are expensive, that float is represented in the IEEE 754 single-precision format, and that unsigned int is 32 bits wide, we could engage in type punning to extract the sign bit of the floating-point number using only integer operations:

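bool is_negative(float x) {
    /* View the bytes of x as an unsigned int (assumed 32 bits wide). */
    unsigned int *ui = (unsigned int *)&x;
    return (*ui >> 31) != 0;  /* bit 31 holds the IEEE sign bit */
}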

Although most programming style guides frown on any kind of type punning, this kind is more dangerous than most. Whereas the sockets example relied only on guarantees made by the C programming language about structure layout and pointer convertibility, this example relies on assumptions about a system's particular hardware. It also reads a float object through a pointer to unsigned int, which breaks C's aliasing rules, so a compiler that optimizes on those rules (as GCC does under -fstrict-aliasing) may not generate the intended code. Some situations, such as time-critical code that the compiler otherwise fails to optimize, may require dangerous code. In these cases, documenting all such assumptions in comments helps to keep the code maintainable.
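
Besides comments, some of these assumptions can be stated so that the compiler checks them. The following is a minimal sketch, assuming a C11 compiler; the particular set of checks is an illustrative choice, and it narrows the possible float formats without proving that the sign bit is bit 31 of the integer view.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>

/* Assumptions made by the bit-level is_negative, checked at compile time. */
static_assert(CHAR_BIT == 8, "bytes must be 8 bits wide");
static_assert(sizeof(float) == 4, "float must occupy 32 bits");
static_assert(sizeof(unsigned int) == sizeof(float),
              "unsigned int must be as wide as float");
static_assert(FLT_RADIX == 2 && FLT_MANT_DIG == 24,
              "float must have a 24-bit binary significand, as in IEEE 754");
```

If any assumption fails on a given platform, compilation stops with the corresponding message instead of producing code that silently misbehaves.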

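A third implementation uses C's union data type to retrieve the integer representation of x. It relies on the hardware representation of float just as the pointer version above does, and its standing depends on the language: current C standards describe the read of my_union.ui as a reinterpretation of the bytes stored through my_union.d, whereas C++ leaves it undefined. Under the earlier, ambiguous wording of the C standard, the code was also relying on the compiler not to optimize away the write to my_union.d.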

bool is_negative(float x) {
    union {
        unsigned int ui;
        float d;
    } my_union;
    my_union.d = x;
    return (my_union.ui >> 31);
}
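
The same bit test can be written without a pointer cast or a union by copying the bytes of x into an integer with memcpy, which is permitted by the aliasing rules of both C and C++. The following is a minimal sketch under the same representation assumptions as above; the name is_negative_memcpy is hypothetical, chosen only to avoid clashing with the earlier definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 single-precision value. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical name; behaves like the bit-level is_negative above. */
bool is_negative_memcpy(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);  /* copy the object representation of x */
    return (bits >> 31) != 0;        /* bit 31 is the IEEE 754 sign bit */
}
```

Like the other bit-level versions, this reports negative zero as negative, whereas the comparison x < 0.0 does not. Where only the sign is needed, the standard signbit macro from <math.h> answers the same question without any assumption about the representation.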


For another example of type punning, see Stride of an array.

External links

  • Section of the GCC manual on -fstrict-aliasing, which defeats some type punning
  • Defect Report 257 to the C99 standard, incidentally defining "type punning" in terms of union, and discussing the issues surrounding the implementation-defined behavior of the last example above