Integer literal
In computer science, an integer literal is an integer whose value is directly represented in source code. For example, in the assignment statement x = 1
, the string 1
is an integer literal indicating the value 1, while in the statement x = 0x10
the string 0x10
is an integer literal indicating the value 16, which is represented by 10
in hexadecimal (indicated by the 0x
prefix).
By contrast, in x = cos(0)
, the expression cos(0)
evaluates to 1 (as the cosine of 0), but the value 1 is not literally included in the source code. More simply, in x = 2 + 2,
the expression 2 + 2
evaluates to 4, but the value 4 is not literally included. Further, in x = "1"
the "1"
is a string literal, not an integer literal, because it is in quotes. The value of the string is 1
, which happens to be an integer string, but this is semantic analysis of the string literal – at the syntactic level "1"
is simply a string, no different from "foo"
.
Parsing
Recognizing a string (sequence of characters in the source code) as an integer literal is part of the lexical analysis (lexing) phase, while evaluating the literal to its value is part of the semantic analysis phase. Within the lexer and phrase grammar, the token class is often denoted integer
, with the lowercase indicating a lexical-level token class, as opposed to phrase-level production rule (such as ListOfIntegers
). Once a string has been lexed (tokenized) as an integer literal, its value cannot be determined syntactically (it is just an integer), and evaluation of its value becomes a semantic question.
Integer literals are generally lexed with regular expressions, as in Python.[1]
Evaluation
As with other literals, integer literals are generally evaluated at compile time, as part of the semantic analysis phase. In some cases this semantic analysis is done in the lexer, immediately on recognition of an integer literal, while in other cases this is deferred until the parsing stage, or until after the parse tree has been completely constructed. For example, on recognizing the string 0x10
the lexer could immediately evaluate this to 16 and store that (a token of type integer
and value 16), or defer evaluation and instead record a token of type integer
and value 0x10
.
Once literals have been evaluated, further semantic analysis in the form of constant folding is possible, meaning that literal expressions involving literal values can be evaluated at the compile phase. For example, in the statement x = 2 + 2
after the literals have been evaluated and the expression 2 + 2
has been parsed, it can then be evaluated to 4, though the value 4 does not itself appear as a literal.
Affixes
Integer literals frequently have prefixes indicating base, and less frequently suffixes indicating type.[1] For example, 0x10ULL
indicates the value 16 (because hexadecimal) as an unsigned long long integer in C++.
Common prefixes include:
-
0x
or0X
for hexadecimal (base 16); -
0o
or0O
for octal (base 8); -
0b
or0B
for binary (base 2).
Common suffixes include:
-
l
orL
for long integer; -
ll
orLL
for long long integer; -
u
orU
for unsigned integer.
These affixes are somewhat similar to sigils, though sigils attach to identifiers (names), not literals.