Primitive type
In computer science, primitive types, as distinct from composite types, are data types provided by a programming language as basic building blocks. Primitive types are also known as built-in types or basic types.
Depending on the language and its implementation, primitive types may or may not have a one-to-one correspondence with objects in the computer's memory. However, one usually expects operations on primitive types to be the fastest language constructs there are. Integer addition, for example, can be performed as a single machine instruction, and some processors offer specific instructions to process sequences of characters with a single instruction. In particular, the C standard mentions that "a 'plain' int object has the natural size suggested by the architecture of the execution environment". This means that int is likely to be 32 bits long on a 32-bit architecture. Values of the primitive types do not share state.
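As a rough illustration of the "natural size" remark, the following Python sketch uses the standard ctypes module to ask how wide the C types int and long are on the machine running the script. The names INT_BITS and LONG_BITS are chosen here for illustration, and the result is platform-dependent, so no expected output is shown.

```python
import ctypes

# Width in bits of the C `int` and `long` types on this platform.
# Assumes 8-bit bytes, which holds on all mainstream platforms.
INT_BITS = ctypes.sizeof(ctypes.c_int) * 8
LONG_BITS = ctypes.sizeof(ctypes.c_long) * 8

print(f"int is {INT_BITS} bits, long is {LONG_BITS} bits")
```

The C standard only guarantees minimum widths, so code that needs an exact size should not rely on the value printed here.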
Most languages do not allow the behaviour or capabilities of primitive types to be modified by programs. Exceptions include Smalltalk, which permits primitive datatypes to be extended within a program, adding to the operations that can be performed on them or even redefining the built-in operations.
Overview
The set of primitive types available depends on the programming language. For example, in C, strings are a composite data type, whereas in modern dialects of Basic and in JavaScript, they are a primitive data type.
Typical primitive types may include:
- Character (character, char);
- Integer (integer, int, short, long, byte) with a variety of precisions;
- Floating-point number (float, double, real, double precision);
- Fixed-point number (fixed) with a variety of precisions and a programmer-selected scale.
- Boolean, having the values true and false.
- Reference (also called a pointer or handle), a small value referring to another object's address in memory, possibly a much larger one.
More sophisticated types which can be primitive include:
- Tuples in ML, Python
- Linked lists in Lisp
- Complex numbers in Fortran, C (C99), Lisp, Python
- Rational numbers in Lisp
- Hash tables in various guises, in Lisp, Python, Lua
- First-class functions, closures, continuations in functional programming languages such as Lisp and ML
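To make the list above concrete for one of the languages it names, here is a minimal Python sketch showing that tuples, complex numbers, hash tables (dict) and function values are available without any import. The variable names are illustrative only.

```python
# Built-in Python values that need no import.
point = (3, 4)        # tuple
z = 1 + 2j            # complex number
ages = {"ada": 36}    # dict, Python's built-in hash table


def square(x):
    return x * x


# Functions are first-class: `square` is passed to map() like any other value.
squares = list(map(square, point))

print(type(point).__name__, type(z).__name__, type(ages).__name__)
print(z * z, squares)
```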
Specific primitive types
Integer numbers
An integer number can hold a whole number, but no fraction. Integers may be either signed (allowing negative values) or unsigned (nonnegative values only). Typical sizes of integers are:
Size | Names | Signed Range | Unsigned Range |
---|---|---|---|
8 bits | Byte | -128 to +127 | 0 to 255 |
16 bits | Word, short int | -32,768 to +32,767 | 0 to 65,535 |
32 bits | Double Word, long int | -2,147,483,648 to +2,147,483,647 | 0 to 4,294,967,295 |
64 bits | long long (C), long (Java) | -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807 | 0 to 18,446,744,073,709,551,615 |
unlimited | Bignum | | |
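The fixed-width rows of the table can be reproduced with a few lines of Python. This is a sketch that assumes two's-complement encoding for the signed ranges; the function name integer_range is made up for the example.

```python
def integer_range(bits, signed=True):
    """Return (minimum, maximum) for an integer type of the given width.

    Assumes two's-complement encoding for signed types.
    """
    if bits <= 0:
        raise ValueError("bits must be positive")
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


for bits in (8, 16, 32, 64):
    print(bits, integer_range(bits), integer_range(bits, signed=False))
```

Python's own int type corresponds to the "unlimited" row, which is why the 64-bit limits can be computed here without overflow.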
Literals for integers consist of a sequence of digits. Most programming languages disallow the use of commas for digit grouping, although Fortran (in fixed-form source, both in FORTRAN 77 and in Fortran 90 and later, but not in free-form source) allows embedded spaces, and Perl, Ruby, and D allow embedded underscores. Negation is indicated by a minus sign (-) before the value. Examples of integer literals are:
- 42
- 10000
- -233000
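Python (version 3.6 and later) is another language that accepts underscores for digit grouping while rejecting commas. The short sketch below writes the literals above both ways; the variable names are illustrative.

```python
# Underscores between digits are ignored by the parser.
grouped = 10_000
plain = 10000
negative = -233_000

# Commas are not accepted as digit separators when parsing an integer.
try:
    int("10,000")
    commas_rejected = False
except ValueError:
    commas_rejected = True

print(grouped == plain, negative, commas_rejected)
```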
Booleans
A boolean type, typically denoted "bool" or "boolean", has exactly two values, "true" (1) and "false" (0). It therefore carries a single bit of information, although implementations often store it in a full byte. In some languages (e.g., C++), bools may be implicitly converted to integers (for example, "true + true" is a valid expression equal to 2), but other languages (e.g., Java and Pascal) disallow this.
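Python falls on the same side as C++ here: its bool type is a subclass of int, so boolean values take part in arithmetic as 1 and 0. A minimal sketch, with illustrative variable names:

```python
# bool is a subclass of int, so True and False behave as 1 and 0.
total = True + True
is_integer = isinstance(True, int)

# A common use of the conversion: counting how many items satisfy a test.
hits = sum(n > 10 for n in [4, 15, 23, 8])

print(total, is_integer, hits)
```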
Floating-point numbers
A floating-point number represents a real number that may have a fractional part. These numbers are stored internally in a format analogous to scientific notation (a significand and an exponent), typically in binary but sometimes in decimal. Because floating-point numbers have only a limited number of digits, most values can be represented only approximately.
Many languages have both a single precision (often called "float") and a double precision type.
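The approximation and the difference between the two precisions can be observed directly. The sketch below assumes that Python's float is an IEEE 754 double, which is the case on mainstream platforms, and uses the standard struct module to round a value to single precision; the helper name to_single is made up for the example.

```python
import struct


def to_single(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 0.1, 0.2 and 0.3 have no exact binary representation, so the sum differs
# slightly from the literal 0.3.
sum_is_exact = (0.1 + 0.2 == 0.3)

# Single precision keeps fewer digits, so 0.1 changes when rounded to it,
# while 0.5 (a power of two) is exact in both formats.
tenth_survives = (to_single(0.1) == 0.1)
half_survives = (to_single(0.5) == 0.5)

print(sum_is_exact, tenth_survives, half_survives)
```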
Literals for floating-point numbers include a decimal point, and typically use "e" to denote scientific notation. Examples of floating-point literals are:
- 20.0005
- 99.9
- -5000.12
- 6.02e23
Some languages (e.g., FORTRAN) also have a complex number type comprising two floating-point numbers: a real part and an imaginary part.
Fixed-point numbers
A fixed-point number represents a real number that may have a fractional part. These numbers are stored internally in a scaled-integer form, typically in binary but sometimes in decimal. Because fixed-point numbers have only a limited number of digits, most values can be represented only approximately. Because fixed-point numbers have a limited range of values, the programmer must be careful to avoid overflow in intermediate calculations as well as in the final results.
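The scaled-integer idea can be sketched in a few lines of Python. The scale of 100 (two decimal places) and all function names are assumptions made for this example, and the helpers only handle non-negative inputs to keep the sketch short.

```python
SCALE = 100  # hypothetical scale: values are stored as integer hundredths


def fixed_from_parts(units, hundredths):
    """Build a fixed-point value from non-negative whole and fractional parts."""
    return units * SCALE + hundredths


def fixed_mul(a, b):
    # The raw product carries the scale twice, so divide it out once.
    return (a * b) // SCALE


def fixed_div(a, b):
    # Pre-scale the dividend so that the quotient keeps the scale.
    return (a * SCALE) // b


def fixed_to_str(a):
    units, frac = divmod(a, SCALE)
    return f"{units}.{frac:02d}"


price = fixed_from_parts(19, 99)      # 19.99 stored as 1999
quantity = fixed_from_parts(3, 0)     # 3.00 stored as 300
total = fixed_mul(price, quantity)
third = fixed_div(fixed_from_parts(1, 0), quantity)

print(fixed_to_str(total), fixed_to_str(third))
```

The intermediate product inside fixed_mul (1999 * 300) is a hundred times larger than the result, which is where a fixed-width implementation would risk overflow; Python's unbounded integers hide that problem here. The value of third shows the approximation: one third is stored as 0.33.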
Characters and strings
A character type (typically called "char") may contain a single letter, digit, punctuation mark, or control character. Some languages have two character types, a single-byte type for ASCII characters and a multi-byte type for Unicode characters.
Characters may be combined into strings. The string data can include numbers and other numerical symbols but will be treated as text.
In most languages, a string is equivalent to an array of characters, but Java treats them as distinct types. Other languages (such as Python, and many dialects of BASIC) have no separate character type, but only strings with a length of one.
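The last case can be checked in Python, where indexing a string yields another string of length one rather than a value of a separate character type. The variable names below are illustrative.

```python
word = "Hello"
first = word[0]

# The "character" is itself a str of length 1.
kind = type(first).__name__
length = len(first)

# ord() and chr() convert between such a string and its numeric code point.
code = ord(first)
back = chr(code)

print(kind, length, code, back)
```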
Literals for characters and strings are usually surrounded by quotation marks: often, single quotes (') are used for characters and double quotes (") are used for strings.
Examples of character literals in C syntax are:
- 'A'
- '4'
- '$'
- '\t' (tab character)
Examples of string literals in C syntax are:
- "A"
- "Hello World"
- "I am 6000 years old"
Numeric data type ranges
Each numeric data type has a maximum and minimum value known as the range. Attempting to store a number outside the range may lead to compiler/runtime errors, or to incorrect calculations (due to truncation) depending on the language being used.
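One common form of the "incorrect calculation" outcome is wraparound, where only the low-order bits of an out-of-range result are kept. Python integers do not overflow, so the sketch below simulates a fixed-width signed type by masking; it assumes two's-complement encoding, and wrap_signed is a name invented for the example.

```python
def wrap_signed(value, bits):
    """Keep the low `bits` bits of value and reinterpret them as two's complement."""
    mask = (1 << bits) - 1
    low = value & mask
    if low >= 1 << (bits - 1):
        low -= 1 << bits
    return low


print(wrap_signed(127 + 1, 8))   # -128: one past the 8-bit maximum
print(wrap_signed(200, 8))       # -56
print(wrap_signed(-129, 8))      # 127: one below the 8-bit minimum
```

Whether a given language wraps like this, raises an error, or leaves the behaviour undefined depends on the language and type involved.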
The range of a variable is based on the number of bytes used to save the value, and an integer data type is usually[1] able to store 2^n values (where n is the number of bits). For other data types (e.g. floating-point values) the range is more complicated and will vary depending on the method used to store it. There are also some types that do not use entire bytes, e.g. a boolean that requires a single bit, and represents a binary value (although in practice a byte is often used, with the remaining 7 bits being unused). Some programming languages (such as Ada and Pascal) also allow the opposite direction, that is, the programmer defines the range and precision needed to solve a given problem and the compiler chooses the most appropriate integer or floating-point type automatically.
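The "opposite direction" can be illustrated with a small Python sketch that picks the narrowest width covering a requested range. It is not a description of how any particular Ada or Pascal compiler works; the function name, the candidate widths and the two's-complement assumption are all choices made for this example.

```python
def smallest_width(low, high, widths=(8, 16, 32, 64)):
    """Return (bits, signed) for the narrowest candidate width covering low..high."""
    if low > high:
        raise ValueError("low must not exceed high")
    signed = low < 0
    for bits in widths:
        if signed:
            lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
        else:
            lo, hi = 0, (1 << bits) - 1
        if lo <= low and high <= hi:
            return bits, signed
    raise OverflowError("no candidate width covers the requested range")


print(smallest_width(0, 255))        # (8, False)
print(smallest_width(-1, 255))       # (16, True)
print(smallest_width(0, 100_000))    # (32, False)
```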
References
1. There are situations where one or more bits are reserved for other functions, e.g. parity checking.