Strong typing
In computer science and computer programming, a type system is said to feature strong typing when it specifies one or more restrictions on how operations involving values of different data types can be intermixed. The opposite of strong typing is weak typing.
Interpretation
Most generally, "strong typing" implies that the programming language places severe restrictions on the intermixing that is permitted to occur, preventing the compiling or running of source code which uses data in what is considered to be an invalid way. For instance, an addition operation may not be used with an integer and a string value; a procedure which operates upon linked lists may not be used upon numbers. However, the nature and strength of these restrictions are highly variable.
Example
Weak typing (pseudocode):

    a = 2
    b = "2"
    concatenate(a, b) # Returns "22"
    add(a, b) # Returns 4

Strong typing (pseudocode):

    a = 2
    b = "2"
    concatenate(a, b) # Type Error
    add(a, b) # Type Error
    concatenate(str(a), b) # Returns "22"
    add(a, int(b)) # Returns 4

Languages with weak typing: Perl, PHP, Rexx, JavaScript, BASIC

Languages with strong typing: Java, C, C++, Python, C#, Vala
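The strong-typing pseudocode can be tried in a real language. The following is a minimal sketch in Python, one of the languages listed above as strongly typed. The helper name `attempt` is hypothetical and exists only to capture the error instead of stopping the script.

```python
a = 2
b = "2"


def attempt(operation):
    """Run a zero-argument callable; return its result or the TypeError it raises."""
    try:
        return operation()
    except TypeError as error:
        return error


# Mixing int and str is rejected instead of being silently converted.
mixed_add = attempt(lambda: a + b)
mixed_concat = attempt(lambda: b + a)

# Explicit conversions state which operation is intended.
explicit_concat = str(a) + b   # string concatenation
explicit_add = a + int(b)      # integer addition

print(type(mixed_add).__name__, type(mixed_concat).__name__)
print(repr(explicit_concat), explicit_add)
```

Both mixed operations yield a `TypeError`, while the explicitly converted forms produce "22" and 4, matching the strong-typing pseudocode.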
Meanings in computer literature
Some of the factors which writers have qualified as "strong typing" include:
- Absence of unchecked run-time type errors. This definition comes from Luca Cardelli's article Typeful Programming. [1] In other writing, the absence of unchecked run-time errors is referred to as safety or type safety; Tony Hoare's early papers call this property security.
- Strong guarantees about the run-time behavior of a program before program execution, whether provided by static analysis, the execution semantics of the language or another mechanism.
- Type safety; that is, at compile or run time, the rejection of operations or function calls which attempt to disregard data types. In a more rigorous setting, type safety is proved about a formal language by proving progress and preservation.
- The guarantee that a well-defined error or exception (as opposed to undefined behavior) occurs as soon as a type-matching failure happens at run time. A stronger special case is the guarantee that type-matching failures never happen at run time, which trivially rules out undefined behavior following such a failure.
- The mandatory requirement, by a language definition, of compile-time checks for type constraint violations. That is, the compiler ensures that operations only occur on operand types that are valid for the operation. However, that is also the definition of static typing, leading some experts to state: "Static typing is often confused with StrongTyping". [2]
- Fixed and invariable typing of data objects. The type of a given data object does not vary over that object's lifetime. For example, class instances may not have their class altered.
- The absence of ways to evade the type system. Such evasions are possible in languages that allow programmer access to the underlying representation of values, i.e., their bit-pattern.
- Omission of implicit type conversion, that is, conversions that are inserted by the compiler on the programmer's behalf. For these authors, a programming language is strongly typed if type conversions are allowed only when an explicit notation, often called a cast, is used to request the conversion of one type to another.
- Disallowing any kind of type conversion. Values of one type cannot be converted to another type, explicitly or implicitly.
- A complex, fine-grained type system with compound types.
- Brian Kernighan: "[..]each object in a program has a well-defined type which implicitly defines the legal values of and operations on the object. The language guarantees that it will prohibit illegal values and operations, by some mixture of compile- and run-time checking."[3]
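Two of the definitions above can be probed from inside a single language. The sketch below uses Python and is illustrative only: the class names `Celsius` and `Fahrenheit` and the function `float_bits` are hypothetical. The first part shows that an ordinary Python object can have its class reassigned, so Python does not meet the "fixed and invariable typing" criterion. The second part exposes the bit pattern of a floating-point value through the standard `struct` module. This is an explicit, checked conversion, used here only to show what access to the underlying representation means; it is not a true evasion of the type system.

```python
import struct


class Celsius:
    def __init__(self, degrees):
        self.degrees = degrees


class Fahrenheit:
    def __init__(self, degrees):
        self.degrees = degrees


# Fixed typing: CPython allows reassigning the class of an instance of an
# ordinary user-defined class, so the object's type changes during its lifetime.
reading = Celsius(20)
type_before = type(reading).__name__
reading.__class__ = Fahrenheit
type_after = type(reading).__name__


def float_bits(value):
    """Return the IEEE 754 double-precision bit pattern of value as an int."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    return bits


print(type_before, type_after)
print(hex(float_bits(1.0)))
```

A language can therefore satisfy one definition of strong typing (no implicit conversions) while failing another (invariable object types), which is why the definitions in this list cannot be used interchangeably.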
Variation across programming languages
Note that some of these definitions are contradictory, others are merely orthogonal, and still others are special cases (with additional constraints) of other, more "liberal" (less strong) definitions. Because of the wide divergence among these definitions, it is possible to defend claims about most programming languages that they are either strongly or weakly typed. For instance:
- Java, C#, Pascal, Ada and C require all variables to have a defined type and support explicit casts of arithmetic values to other arithmetic types. Java, C#, Ada and Pascal are often said to be more strongly typed than C, probably because C supports more kinds of implicit conversions and also allows pointer values to be explicitly cast, while Java and Pascal do not. Java itself may be considered more strongly typed than Pascal, as ways of evading the static type system in Java are controlled by the Java Virtual Machine's dynamic type system. C# is similar to Java in that respect, though it allows dynamic type checking to be disabled by explicitly placing code segments in an "unsafe context". Pascal's type system has been described as "too strong", because the size of an array or string is part of its type, making some programming tasks very difficult.[4][5]
- The object-oriented programming languages Smalltalk, Ruby, Python, and Self are all "strongly typed" in the sense that typing errors are prevented at runtime and they do little implicit type conversion, but these languages make no use of static type checking: the compiler does not check or enforce type constraint rules. The term duck typing is now used to describe the dynamic typing paradigm used by the languages in this group.
- The languages of the Lisp family are all "strongly typed" in the sense that typing errors are prevented at runtime. Some Lisp dialects, such as Common Lisp, support various forms of type declarations[6], and some compilers (CMUCL[7] and related) use these declarations together with type inference to enable various optimizations and limited forms of compile-time type checks.
- Standard ML, OCaml (Objective Caml) and Haskell have purely static type systems, in which the compiler automatically infers a precise type for all values. These languages (along with most functional languages) are considered to have stronger type systems than Java, as they permit no implicit type conversions. While OCaml's libraries allow one form of evasion (Obj.magic), this feature remains unused in most applications.
- Visual Basic is a hybrid language. In addition to including statically typed variables, it includes a "Variant" data type that can store data of any type. Its implicit conversions are fairly liberal: for example, one can sum string variants and assign the result to an integer variable.
- Assembly language and Forth have been said to be untyped. There is no type checking; it is up to the programmer to ensure that data given to functions is of the appropriate type. Any type conversion required is explicit.
For this reason, writers who wish to write unambiguously about type systems often eschew the term "strong typing" in favor of specific expressions such as "type safety".
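The difference between strong typing and static typing discussed above can be made concrete with a small sketch in Python, one of the dynamically checked languages mentioned. The names `label`, `announce`, `Duck` and `Robot` are hypothetical. The faulty branch in `label` is accepted when the function is defined and is reported only when it is actually executed. The function `announce` accepts any object that has a `speak` method, which is the behaviour usually called duck typing.

```python
def label(count, verbose):
    if verbose:
        # str + int is a type error, but nothing checks this line
        # until it is executed.
        return "count: " + count
    return "ok"


class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


def announce(thing):
    # No declared type: any object with a speak() method is accepted.
    return thing.speak()


quiet_result = label(3, verbose=False)   # faulty branch never runs

try:
    label(3, verbose=True)
    loud_error = None
except TypeError as error:
    loud_error = error

sounds = [announce(Duck()), announce(Robot())]

try:
    announce(42)                         # int has no speak() method
    missing_method_error = None
except AttributeError as error:
    missing_method_error = error
```

In a statically checked language the faulty branch and the call `announce(42)` would typically be rejected before the program runs; here each is reported as a well-defined run-time error at the point of use.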
References
- [1] Luca Cardelli, Typeful Programming, ftp://gatekeeper.research.compaq.com/pub/DEC/SRC/research-reports/SRC-045.pdf, page 3
- [2] Cunningham & Cunningham Wiki
- [3] Brian Kernighan, Why Pascal is not my favourite language
- [4] InfoWorld, April 25, 1983
- [5] Brian Kernighan, Why Pascal is not my favourite language
- [6] Common Lisp HyperSpec, Types and Classes
- [7] CMUCL User's Manual: The Compiler, Types in Python