Type polymorphism

This article is about using the same code for different datatypes. For other uses of the term "polymorphism", see polymorphism.

In computer science, polymorphism means allowing a single definition to be used with different types of data (specifically, different classes of objects). For instance, a polymorphic function definition can replace several type-specific ones, and a single polymorphic operator can act in expressions of various types. Many programming languages and paradigms implement some forms of polymorphism; for a popular example, see polymorphism in object-oriented programming.

The concept of polymorphism applies to data types in addition to functions. A function that can evaluate to or be applied to values of different types is known as a polymorphic function. A data type that contains elements of different types is known as a polymorphic data type.

There are two fundamentally different kinds of polymorphism, as first informally described by Christopher Strachey in 1967. If the range of actual types that can be used is finite and the combinations must be specified individually prior to use, it is called ad-hoc polymorphism. If all code is written without mention of any specific type and thus can be used transparently with any number of new types, it is called parametric polymorphism. In their formal treatment of the topic in 1985, Luca Cardelli and Peter Wegner restricted the term parametric polymorphism to instances with type parameters, while recognizing other kinds of universal polymorphism as well.

Programming using parametric polymorphism is called generic programming, particularly in the object-oriented community. Advocates of object-oriented programming often cite polymorphism as one of the major benefits of that paradigm over others. Advocates of functional programming reject this claim on the grounds that the notion of parametric polymorphism is so deeply ingrained in many statically typed functional programming languages that most programmers simply take it for granted. However, the rise in popularity of object-oriented programming languages did contribute greatly to awareness and use of polymorphism in the mainstream programming community.

Parametric polymorphism

Using parametric polymorphism, a function or a data type can be written generically so that it can deal equally well with any objects without depending on their type. For example, a function append that joins two lists can be constructed so that it does not care about the type of elements: it can append lists of integers, lists of real numbers, lists of strings, and so on. Let the type variable a denote the type of elements in the lists. Then append can be typed [a] × [a] → [a], where [a] denotes a list of elements of type a. We say that the type of append is parameterized by a for all values of a. (Note that since there is only one type variable, the function cannot be applied to just any pair of lists: the pair, as well as the result list, must consist of the same type of elements.) For each place where append is applied, a value is decided for a.
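
For illustration, such a function can be written in Haskell as follows (the standard library already provides this operation as ++; the definition below is a sketch of it, using the article's name append):

  -- append never inspects the list elements themselves, only the list
  -- structure, so one definition serves every element type a.
  append :: [a] -> [a] -> [a]
  append []       ys = ys
  append (x : xs) ys = x : append xs ys

The same definition is used at every instantiation: append [1, 2] [3] fixes a to a numeric type, append "foo" "bar" fixes it to Char, and so on.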

Parametric polymorphism was first introduced to programming languages in ML in 1976. Today it exists in Standard ML, OCaml, Haskell, Visual Prolog and others. Some argue that C++ templates should be considered an example of parametric polymorphism, though instead of producing truly generic code, implementations generate specialized code for each type with which a function is used.

Parametric polymorphism is a way to make a language more expressive while still maintaining full static type-safety. It is thus irrelevant in dynamically typed languages, since by definition they lack static type-safety. However, any dynamically typed function f that takes n arguments can be given a static type using parametric polymorphism: f : p1 × ... × pn → r, where p1, ..., pn and r are type parameters. Of course, this type carries no information and is thus essentially useless; instead, the types of arguments and return values are observed at run time to match the operations performed on them.
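
The emptiness of such a type is easy to see in Haskell: a function with a fully parametric type such as a → b knows nothing about its argument, so no useful definition of it exists (the name f below is hypothetical):

  -- No total definition is possible: the result type b is completely
  -- unknown, so the body can do nothing but diverge or raise an error.
  f :: a -> b
  f _ = error "a value of an arbitrary type b cannot be constructed"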

Cardelli and Wegner recognized in 1985 the advantages of allowing bounds on the type parameters. Many operations require some knowledge of the data types, but can otherwise work parametrically. For example, to check whether an item is included in a list, we need to compare the items for equality. In Standard ML, type parameters of the form ''a are restricted so that the equality operation is available; thus the function would have the type ''a × ''a list → bool, and ''a can only be a type with defined equality. In Haskell, bounding is achieved by requiring types to belong to a type class; thus the same function has the type Eq a ⇒ a → [a] → Bool in Haskell. In most object-oriented programming languages that support parametric polymorphism, parameters can be constrained to be subtypes of a given type (see Subtyping polymorphism below and the article on generic programming).
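
In Haskell, a minimal sketch of such a bounded function looks as follows (the name contains is illustrative; the standard library function elem has essentially this type):

  -- The constraint Eq a bounds the type parameter: a may be any type,
  -- but only one for which the equality test (==) is defined.
  contains :: Eq a => a -> [a] -> Bool
  contains _ []       = False
  contains x (y : ys) = x == y || contains x ys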

Predicative and impredicative polymorphism

Type systems with parametric polymorphism can be classified into predicative and impredicative systems. The key difference is in how parametric values may be instantiated. For example, consider the append function described above, which has type [a] × [a] → [a]; in order to apply this function to a pair of lists, a type must be substituted for the variable a in the type of the function such that the type of the arguments matches up with the resulting function type. In an impredicative system, the type being substituted may be any type whatsoever, including a type that is itself polymorphic; thus append can be applied to pairs of lists with elements of any type, even to lists of polymorphic functions such as append itself. In a predicative system, type variables may not be instantiated with polymorphic types. This restriction makes the distinction between polymorphic and non-polymorphic types very important; thus in predicative systems polymorphic types are sometimes referred to as type schemas to distinguish them from ordinary (monomorphic) types, which are sometimes called monotypes.

Polymorphism in the language ML and its close relatives is predicative. This is because predicativity, together with other restrictions, makes the type system simple enough that type inference is possible. In languages where explicit type annotations are necessary when applying a polymorphic function, the predicativity restriction is less important; these languages are therefore generally impredicative. Haskell manages to combine type inference with a degree of impredicativity, at the cost of a few complications.
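
As a concrete illustration, the following Haskell sketch (assuming a recent GHC with the ImpredicativeTypes extension enabled) builds a list whose element type is itself a polymorphic type, exactly the kind of instantiation a predicative system forbids:

  {-# LANGUAGE ImpredicativeTypes #-}

  -- The list type's element variable is instantiated with the polymorphic
  -- type (forall a. a -> a), so each element remains usable at any type.
  ids :: [forall a. a -> a]
  ids = [id, id]

  main :: IO ()
  main = case ids of
    f : _ -> print (f (0 :: Int), f True)   -- f is still polymorphic
    []    -> return ()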

In type theory, the most frequently studied impredicative typed λ-calculi are based on those of the lambda cube, especially System F. Predicative type theories include Martin-Löf Type Theory and NuPRL.

Subtyping polymorphism

Some languages employ the idea of subtypes to restrict the range of types that can be used in a particular case of polymorphism. In these languages, subtyping polymorphism (sometimes referred to as dynamic polymorphism) allows a function to be written to take an object of a certain type T, but also work correctly if passed an object that belongs to a type S that is a subtype of T (according to the Liskov substitution principle). This type relation is sometimes written S <: T. Conversely, T is said to be a supertype of S—written T :> S.

For example, if Number, Rational, and Integer are types such that Number :> Rational and Number :> Integer, a function written to take a Number will work equally well when passed an Integer or a Rational as when passed a Number. The actual type of the object can be hidden from clients in a black box, and accessed via object identity. In fact, if the Number type is abstract, it may not even be possible to obtain an object whose most-derived type is Number (see abstract data type, abstract class). This particular kind of type hierarchy is known—especially in the context of the Scheme programming language—as a numerical tower, and usually contains many more types.

Object-oriented programming languages offer subtyping polymorphism using subclassing (also known as inheritance). In typical implementations, each class contains what is called a virtual table—a table of functions that implement the polymorphic part of the class interface—and each object contains a pointer to the "vtable" of its class, which is then consulted whenever a polymorphic method is called. This mechanism is an example of

  • late binding, because virtual function calls are not bound until the time of invocation, and
  • single dispatch (i.e., single-argument polymorphism), because virtual function calls are bound simply by looking through the vtable provided by the first argument (the this object), so the runtime types of the other arguments are completely irrelevant.

The same goes for most other popular object systems. Some, however, such as CLOS, provide multiple dispatch, under which method calls are polymorphic in all arguments.
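
Haskell itself has no subtyping, but the vtable mechanism described above can be sketched there as a record of operations that every value carries with it; the names below (Number, mkInteger, mkRational) are hypothetical and echo the earlier Number example:

  -- Each "object" is a record whose fields play the role of vtable
  -- entries; the entries are consulted at call time, giving late binding
  -- on a single receiver.
  data Number = Number
    { toDouble :: Double    -- a parameterless "method"
    , describe :: String
    }

  mkInteger :: Integer -> Number
  mkInteger n = Number { toDouble = fromInteger n,  describe = "Integer" }

  mkRational :: Rational -> Number
  mkRational r = Number { toDouble = fromRational r, describe = "Rational" }

  -- Works uniformly for any Number, regardless of which constructor
  -- function built it.
  double :: Number -> Double
  double x = toDouble x * 2

  main :: IO ()
  main = mapM_ report [mkInteger 3, mkRational (1 / 2)]
    where report x = putStrLn (describe x ++ ": " ++ show (double x))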

Ad-hoc polymorphism

Ad-hoc polymorphism usually refers to overloading, but sometimes automatic type conversion, known as coercion, is also considered to be a kind of ad-hoc polymorphism (see the example section below). Common to these two types is the fact that the programmer has to specify exactly what types are to be usable with the polymorphic function.

The name refers to the manner in which this kind of polymorphism is typically introduced: "Oh, hey, let's make the + operator work on strings, too!" Some argue that ad-hoc polymorphism is not polymorphism in a meaningful computer science sense at all—that it is just syntactic sugar for calling append_integer, append_string, etc., manually. One way to see it is that

  • to the user, there appears to be only one function, but one that takes different types of input and is thus type polymorphic; on the other hand,
  • to the author, there are several functions that need to be written—one for each type of input—so there's essentially no polymorphism.

In other words, ad-hoc polymorphism is a dispatch mechanism: control passing through one named function is dispatched to various other functions, without the call site having to specify the exact function being called.

Overloading

Overloading allows multiple functions taking different types to be defined with the same name; the compiler or interpreter automatically calls the right one. This way, functions appending lists of integers, lists of strings, lists of real numbers, and so on could be written, and all be called append—and the right append function would be called based on the type of lists being appended. This differs from parametric polymorphism, in which the function would need to be written generically, to work with any kind of list. Using overloading, it is possible to have a function perform two completely different things based on the type of input passed to it; this is not possible with parametric polymorphism. Another way to look at overloading is that a routine is uniquely identified not by its name, but by the combination of its name and the number, order and types of its parameters.

This type of polymorphism is common in object-oriented programming languages, many of which allow operators to be overloaded in a manner similar to functions (see operator overloading). It is also used extensively in the purely functional programming language Haskell in the form of type classes. Many languages lacking ad-hoc polymorphism suffer from long-winded names such as print_int, print_string, etc. (see C, Objective Caml).
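
For example, Haskell's type classes let the long-winded print_int/print_string family share a single name; here is a minimal sketch (the class Printable is illustrative, not part of any standard library):

  {-# LANGUAGE FlexibleInstances #-}

  -- One name, printIt, with a separate implementation per type; the
  -- compiler selects the implementation from the argument's type.
  class Printable a where
    printIt :: a -> IO ()

  instance Printable Int where
    printIt n = putStrLn ("Int: " ++ show n)

  instance Printable String where
    printIt s = putStrLn ("String: " ++ s)

  main :: IO ()
  main = do
    printIt (42 :: Int)   -- dispatches to the Int instance
    printIt "hello"       -- dispatches to the String instance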

An advantage that is sometimes gained from overloading is the appearance of specialization, e.g., a function with the same name can be implemented in multiple different ways, each optimized for the particular data types that it operates on. This can provide a convenient interface for code that needs to be specialized to multiple situations for performance reasons.

Since overloading is done at compile time, it is not a substitute for late binding as found in subtyping polymorphism.

Coercion

Main article: type conversion

Most languages provide mechanisms (collectively called type conversion) that programmers can use to convert, or translate, values of one type into semantically analogous values of another type. For instance, it is usually possible to convert a value of a floating-point numeric type into a value of integer type with a rounding operation, or to convert an integer to a string using a function that constructs the integer's decimal representation. In some languages, certain of these conversions are performed automatically when a value of one type appears in a context that demands a value of a different type. This feature is called coercion. It is also called implicit type conversion, to reflect the understanding that a programmer who writes such an expression is "implicitly" asking for the value they provide to be converted to the required type before use.

The C programming language provides a typical example. C implicitly performs type conversions that do not "lose information": for instance, when an expression of type int appears in a context that requires a value of type double, the compiler inserts a conversion operation to turn the integer into a double-precision floating-point value. The same applies to "widening" coercions such as from int to long. Conversions that may not preserve the meaning of the original value, such as from double to int, are riskier; C still performs many of them implicitly, so careful programmers make them explicit with a cast. Some languages, notably C++ and C#, allow programmers to specify implicit conversion operations of their own in addition to those built into the language; in these cases it is the programmer's responsibility to ensure that such conversions are well-behaved in any appropriate sense.

In statically typed languages, the need to coerce a value is evident at compile time, and the exact nature of the coercion is statically determined. In dynamically typed languages, the need to convert a value to a different type is generally not discovered until the moment when the conversion must be performed, and only then can it be known from what type it must be converted and what must be done to convert it.

Strictly speaking, coercion is not polymorphism, since operations do not need to act on values of more than one type if their operands are implicitly converted beforehand. However, it is tempting for programmers to perceive coercion as a weakened form of overloading, and indeed, in some cases the distinction between the two is difficult to draw and not very useful, especially in languages that support both. For instance, does C have one addition operator per numeric type and perform implicit conversion, or does it have a separate addition operator for every combination of operand types? (It is unarguable that C supports implicit conversion, however: a programmer is allowed to write a function expecting a parameter of type double and call that function with an integer argument.) In general, though, coercion is a strictly weaker construct than overloading: coercion only affects the way a function or operator can be applied, while overloading allows the meaning of a name or symbol to vary depending on the context.

Coercion is also related to subtyping, and indeed it is possible to consider σ to be a subtype of τ if there exists an implicit conversion from σ to τ, although this form of subtyping is qualitatively different from the kind of subtyping in which every member of σ is a member of τ. See Subtype for details.

Example

This example aims to illustrate three different kinds of polymorphism described in this article. Though overloading an originally arithmetic operator to do a wide variety of things in this way may not be the most clear-cut example, it allows some subtle points to be made. In practice, the different types of polymorphism are not generally mixed up as much as they are here.

Imagine, if you will, an operator + that may be used in the following ways:

  1. 1 + 2 = 3
  2. 3.14 + 0.0015 = 3.1415
  3. 1 + 3.7 = 4.7
  4. [1, 2, 3] + [4, 5, 6] = [1, 2, 3, 4, 5, 6]
  5. [true, false] + [false, true] = [true, false, false, true]
  6. "foo" + "bar" = "foobar"

Overloading

To handle these six function calls, four different pieces of code are needed—or three, if strings are considered to be lists of characters:

  • In the first case, integer addition must be invoked.
  • In the second and third cases, floating-point addition must be invoked.
  • In the fourth and fifth cases, list concatenation must be invoked.
  • In the last case, string concatenation must be invoked, unless this too is handled as list concatenation (as in, e.g., Haskell).

Thus, the name + actually refers to three or four completely different functions. This is an example of overloading.
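
This overloading can be sketched in Haskell with a type class; the class Plus and the operator |+| below are hypothetical stand-ins for the article's +:

  -- One overloaded name, several independent implementations; the type
  -- of the operands selects which implementation runs.
  class Plus a where
    (|+|) :: a -> a -> a

  instance Plus Int where
    (|+|) = (+)      -- case 1: integer addition

  instance Plus Double where
    (|+|) = (+)      -- case 2: floating-point addition

  instance Plus [a] where
    (|+|) = (++)     -- cases 4 and 5: list concatenation; case 6 as well,
                     -- since a Haskell string is a list of characters

  main :: IO ()
  main = do
    print ((1 :: Int) |+| 2)                   -- 3
    print ((3.14 :: Double) |+| 0.0015)        -- approximately 3.1415
    print ((1 |+| 3.7) :: Double)              -- 4.7; case 3 needs no
                                               -- coercion in Haskell, since
                                               -- the literal 1 is itself
                                               -- overloaded
    print ([1, 2, 3] |+| [4, 5, 6] :: [Int])   -- [1,2,3,4,5,6]
    print ([True, False] |+| [False, True])    -- [True,False,False,True]
    putStrLn ("foo" |+| "bar")                 -- foobar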

Coercion

As we've seen, there's one function for adding two integers and one for adding two floating-point numbers in this hypothetical programming environment, but note that there is no function for adding an integer to a floating-point number (as in case 3 above). The reason why we can still do this is that when the compiler/interpreter finds a function call f(a1, a2, ...) that no existing function named f can handle, it starts to look for ways to convert the arguments into different types in order to make the call conform to the signature of one of the functions named f. This is called coercion. Both coercion and overloading are kinds of ad-hoc polymorphism.

In our case, since a small integer such as 1 can be converted into a floating-point number without loss of precision, 1 is converted into 1.0 and floating-point addition is invoked. There was really only one reasonable outcome in this case, because a floating-point number cannot generally be converted into an integer without loss, so integer addition could not have been used; but significantly more complex, subtle, and ambiguous situations can occur in, e.g., C++.

Parametric polymorphism

Finally, the reason we can concatenate lists of integers, lists of booleans, and lists of characters alike is that the function for list concatenation was written without any regard to the type of elements stored in the lists. This is an example of parametric polymorphism. If you wanted to, you could make up a thousand different new element types, and the generic list concatenation function would happily accept lists of them all, without requiring any augmentation.

It can be argued, however, that this polymorphism is not really a property of the function per se: if the function is polymorphic, it is because the list data type is polymorphic. This is true—to an extent, at least—but it is important to note that the function could just as well have been defined to take as its second argument a single element to append to the list, instead of another list to concatenate to the first. If this were the case, the function would indisputably be parametrically polymorphic, because it could then know nothing about its second argument except that its type must match the type of the elements of the list.
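
Such a variant is easily sketched in Haskell (the name snoc is conventional but not part of the Prelude):

  -- snoc can do nothing with x except place it in the result: nothing
  -- whatsoever is known about the element type a.
  snoc :: [a] -> a -> [a]
  snoc xs x = xs ++ [x]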

Other meanings

In computer science, polymorphism can also refer to polymorphic code, that is, code that mutates itself while keeping the original algorithm intact. See that article for details.

References

  • Luca Cardelli, Peter Wegner. "On Understanding Types, Data Abstraction, and Polymorphism". Computing Surveys, December 1985.
  • Philip Wadler, Stephen Blott. "How to make ad-hoc polymorphism less ad hoc". Proceedings of the 16th ACM Symposium on Principles of Programming Languages, January 1989.
  • Christopher Strachey. "Fundamental Concepts in Programming Languages". Higher-Order and Symbolic Computation, April 2000.
  • Paul Hudak, John Peterson, Joseph Fasel. "A Gentle Introduction to Haskell Version 98".