Short-circuit evaluation
Short-circuit evaluation, minimal evaluation, or McCarthy evaluation is the semantics of some Boolean operators in some programming languages in which the second argument is executed or evaluated only if the first argument does not suffice to determine the value of the expression: when the first argument of the AND function evaluates to false, the overall value must be false; and when the first argument of the OR function evaluates to true, the overall value must be true. In some programming languages (Lisp), the usual Boolean operators are short-circuit. In others (Java, Ada), both short-circuit and standard Boolean operators are available. For some Boolean operations, like XOR, it is not possible to short-circuit, because both operands are always required to determine the result.
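The difference can be observed by counting operand evaluations. The following is a minimal C sketch, not taken from any particular source: `operand`, `calls`, and the `count_*` helpers are hypothetical names, and logical XOR is written as `!=` on `bool` values.

```c
#include <assert.h>
#include <stdbool.h>

static int calls = 0;   /* number of operand evaluations so far */

/* Hypothetical operand: counts its own evaluation, then yields v. */
static bool operand(bool v)
{
    calls++;
    return v;
}

/* Each helper returns how many operands were evaluated. */
int count_and(void) { calls = 0; (void)(operand(false) && operand(true));  return calls; }
int count_or(void)  { calls = 0; (void)(operand(true)  || operand(false)); return calls; }
int count_xor(void) { calls = 0; (void)(operand(false) != operand(true));  return calls; }

void check_counts(void)
{
    assert(count_and() == 1);   /* false && ... : second operand skipped */
    assert(count_or()  == 1);   /* true  || ... : second operand skipped */
    assert(count_xor() == 2);   /* XOR always needs both operands        */
}
```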
The short-circuit expression x Sand y (using Sand to denote the short-circuit variety) is equivalent to the conditional expression if x then y else false; the expression x Sor y is equivalent to if x then true else y.
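As an illustration, these equivalences can be written with C's conditional operator and compared against `&&` and `||`. This is a sketch under stated assumptions: `SAND`, `SOR`, `y_operand`, and `y_evals` are hypothetical names, and macros are used because a function call would evaluate both arguments before its body runs.

```c
#include <assert.h>
#include <stdbool.h>

#define SAND(x, y) ((x) ? (y) : false)   /* if x then y else false */
#define SOR(x, y)  ((x) ? true : (y))    /* if x then true else y  */

static int y_evals = 0;   /* how often the second operand was evaluated */

static bool y_operand(bool v)
{
    y_evals++;
    return v;
}

/* Checks that the conditional forms give the same value as && and ||,
   and evaluate the second operand equally often, for all four inputs. */
void check_conditional_forms(void)
{
    for (int x = 0; x <= 1; x++) {
        for (int y = 0; y <= 1; y++) {
            y_evals = 0;
            bool via_if = SAND(x, y_operand(y));
            int if_evals = y_evals;

            y_evals = 0;
            bool via_op = x && y_operand(y);
            assert(via_if == via_op && if_evals == y_evals);

            y_evals = 0;
            via_if = SOR(x, y_operand(y));
            if_evals = y_evals;

            y_evals = 0;
            via_op = x || y_operand(y);
            assert(via_if == via_op && if_evals == y_evals);
        }
    }
}
```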
Short-circuit operators are, in effect, control structures rather than simple arithmetic operators, as they are not strict. In imperative language terms (notably C and C++), where side effects are important, short-circuit operators introduce a sequence point: they completely evaluate the first argument, including any side effects, before (optionally) processing the second argument. ALGOL 68 used "proceduring" to achieve user-defined short-circuit operators and procedures.
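The sequencing guarantee can be sketched in C as follows. The function names are hypothetical. In both functions a side effect on the left of `&&` is complete before the right-hand side reads the modified variable; the same expressions written with the bitwise `&` would have undefined behavior in C, because the modification and the read would be unsequenced.

```c
#include <assert.h>
#include <stddef.h>

/* The increment of i is finished before the right operand reads i. */
int increment_then_read(void)
{
    int i = 0;
    return (i++ == 0) && (i == 1);
}

/* Counts characters up to the first newline or terminating NUL.
   The assignment to c on the left is complete before c is read
   again on the right of &&. */
size_t line_length(const char *s)
{
    size_t n = 0;
    char c;
    while ((c = s[n]) != '\0' && c != '\n')
        n++;
    return n;
}

void check_sequencing(void)
{
    assert(increment_then_read() == 1);
    assert(line_length("ab\ncd") == 2);
    assert(line_length("") == 0);
}
```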
In loosely typed languages that have more than the two truth-values True and False, short-circuit operators may return the last evaluated subexpression, so that x Sor y and x Sand y are actually equivalent to if x then x else y and if x then y else x respectively (without actually evaluating x twice). This is called "Last value" in the table below.
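C itself does not behave this way (its `||` and `&&` yield the `int` values 0 or 1), but the last-value rule can be emulated to show which operand is returned. In this sketch `or_last`, `and_last`, and `c_or` are hypothetical helpers and NULL stands in for a false value. Because they are ordinary functions, both arguments are evaluated before the call, so only the choice of result is illustrated, not the skipping of the second operand.

```c
#include <assert.h>
#include <stddef.h>

/* x Sor y with last-value semantics: if x then x else y. */
const char *or_last(const char *x, const char *y)
{
    return x ? x : y;
}

/* x Sand y with last-value semantics: if x then y else x. */
const char *and_last(const char *x, const char *y)
{
    return x ? y : x;
}

/* For contrast: the built-in operator only reports 0 or 1. */
int c_or(const char *x, const char *y)
{
    return x || y;
}

void check_last_value(void)
{
    const char *name = "alice";
    const char *fallback = "guest";

    assert(or_last(name, fallback) == name);       /* first operand wins   */
    assert(or_last(NULL, fallback) == fallback);   /* falls through to y   */
    assert(and_last(name, fallback) == fallback);  /* last operand wins    */
    assert(and_last(NULL, fallback) == NULL);      /* stops at the false x */
    assert(c_or(name, NULL) == 1);                 /* plain C: just 1      */
}
```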
In languages that use lazy evaluation by default (like Haskell), all functions are effectively "short-circuit", and special short-circuit operators are unnecessary.
The use of short-circuit operators has been criticized as problematic:
The conditional connectives — "cand" and "cor" for short — are... less innocent than they might seem at first sight. For instance, cor does not distribute over cand: compare
- (A cand B) cor C with (A cor C) cand (B cor C);
in the case ¬A ∧ C , the second expression requires B to be defined, the first one does not. Because the conditional connectives thus complicate the formal reasoning about programs, they are better avoided.[1]
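The case described in the quotation can be reproduced in C, using `&&` for cand and `||` for cor. This is only an illustrative sketch: `B` is a hypothetical stand-in for an operand that might be undefined, and here it merely counts how often it is evaluated.

```c
#include <assert.h>
#include <stdbool.h>

static int b_evals = 0;   /* how often B was evaluated */

/* Stand-in for an operand that may be undefined in a real program. */
static bool B(void)
{
    b_evals++;
    return true;
}

/* (A cand B) cor C: returns the number of evaluations of B. */
int b_evals_first_form(bool A, bool C)
{
    b_evals = 0;
    (void)((A && B()) || C);
    return b_evals;
}

/* (A cor C) cand (B cor C): returns the number of evaluations of B. */
int b_evals_second_form(bool A, bool C)
{
    b_evals = 0;
    (void)((A || C) && (B() || C));
    return b_evals;
}

void check_distribution(void)
{
    /* Case: A is false and C is true. */
    assert(b_evals_first_form(false, true) == 0);   /* B never needed   */
    assert(b_evals_second_form(false, true) == 1);  /* B must be defined */
}
```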
Support in common programming languages
Language | Eager operators | Short-circuit operators | Result type |
---|---|---|---|
ABAP | none | and, or | Boolean¹ |
Ada | and, or | and then, or else | Boolean |
ALGOL 68 | and, &, ∧ ; or, ∨ | andf, orf (both user defined) | Boolean |
C, Objective-C | none | &&, \|\|, ? [2] | int (&&, \|\|), opnd-dependent (?) |
C++² | &, \| | &&, \|\|, ? [3] | Boolean (&&, \|\|), opnd-dependent (?) |
D⁸ | &, \| | &&, \|\|, ? | Boolean (&&, \|\|), opnd-dependent (?) |
Go, OCaml, Haskell | none | &&, \|\| | Boolean |
C# | &, \| | &&, \|\|, ?, ?? | Boolean (&&, \|\|), opnd-dependent (?, ??) |
Java, Julia, MATLAB, R, Swift | &, \| | &&, \|\| | Boolean |
ColdFusion | none | AND, OR, &&, \|\| | Boolean |
Eiffel | and, or | and then, or else | Boolean |
Erlang | and, or | andalso, orelse | Boolean |
Fortran³ | .and., .or. | .and., .or. | Boolean |
JavaScript | &, \| | &&, \|\| | Last value |
Lisp, Lua, Scheme | none | and, or | Last value |
Lasso | none | and, or, &&, \|\| | Last value |
M / MUMPS | &, ! | none | Numeric |
Modula-2 | none | AND, OR | Boolean |
Oberon | none | &, OR | Boolean |
Pascal | and, or⁴ | and_then, or_else⁵ | Boolean |
Perl, Ruby | &, \| | &&, and, \|\|, or | Last value |
PHP | &, \| | &&, and, \|\|, or | Boolean |
Python | &, \| | and, or | Last value |
Smalltalk | &, \| | and:, or:⁶ | Boolean |
Standard ML | Unknown | andalso, orelse | Boolean |
Visual Basic .NET | And, Or | AndAlso, OrElse | Boolean |
VB Script, VB Classic, VBA | And, Or | Select Case⁷ | Numeric |
1 ABAP does not actually have a distinct boolean type.
2 When overloaded, the operators && and || are eager and can return any type.
3 Fortran operators are neither short-circuit nor eager: the language specification allows the compiler to select the method for optimization.
4 ISO Pascal allows but does not require short-circuiting.
5 ISO-10206 Extended Pascal supports and_then and or_else.[4]
6 Smalltalk uses short-circuit semantics as long as the argument to and: is a block (e.g. false and: [Transcript show: 'Wont see me']).
7 BASIC languages that supported CASE statements did so by using the conditional evaluation system, rather than as jump tables limited to fixed labels.
8 This only applies to runtime-evaluated expressions, static if and static assert. Expressions in static initializers or manifest constants use eager evaluation.
Common usage
Avoiding undesired side effects of the second argument
A typical example, in a C-based language:
int denom = 0;
if (denom != 0 && num / denom)
{
    ... // num / denom is evaluated only when denom != 0, so it can never divide by zero
}
Consider the following example:
int a = 0;
if (a != 0 && myfunc(b))
{
    do_something();
}
In this example, short-circuit evaluation guarantees that myfunc(b) is never called. This is because a != 0 evaluates to false. This feature permits two useful programming constructs. Firstly, if the first sub-expression checks whether an expensive computation is needed and the check evaluates to false, one can eliminate expensive computation in the second argument. Secondly, it permits a construct where the first expression guarantees a condition without which the second expression may cause a run-time error. Both are illustrated in the following C snippet where minimal evaluation prevents both null pointer dereference and excess memory fetches:
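#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

bool is_first_char_valid_alpha_unsafe(const char *p)
{
    // Undefined behavior (typically a segfault) if p == NULL.
    // The cast keeps isalpha() well defined for negative char values.
    return isalpha((unsigned char)p[0]);
}

bool is_first_char_valid_alpha(const char *p)
{
    // a) isalpha() is not executed when p == NULL, b) no null dereference
    return p != NULL && isalpha((unsigned char)p[0]);
}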
Possible problems
Untested second condition leads to unperformed side effect
Despite these benefits, minimal evaluation may cause problems for programmers who do not realize (or forget) it is happening. For example, in the code
if (expressionA && myfunc(b)) {
    do_something();
}
if myfunc(b) is supposed to perform some required operation regardless of whether do_something() is executed, such as allocating system resources, and expressionA evaluates as false, then myfunc(b) will not execute, which could cause problems. Some programming languages, such as Java, have two operators, one that employs minimal evaluation and one that does not, to avoid this problem.
Problems with unperformed side effects can be avoided with a programming style that keeps side effects out of Boolean expressions, since using values with side effects in conditions tends to make the code opaque and error-prone.[5]
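One way to apply this style is to perform the side-effecting call in its own statement and test only its stored result. The sketch below assumes hypothetical stand-ins for myfunc and do_something that simply count their invocations; the counters and the two wrapper functions are illustrative names, not part of any library.

```c
#include <assert.h>
#include <stdbool.h>

int resources_allocated = 0;   /* incremented by every myfunc() call */
int actions_done = 0;          /* incremented by do_something()      */

/* Hypothetical stand-ins for the functions named in the text. */
static bool myfunc(int b) { resources_allocated++; return b > 0; }
static void do_something(void) { actions_done++; }

/* Problematic form: myfunc(b) is skipped whenever expressionA is false. */
void guarded_call(bool expressionA, int b)
{
    if (expressionA && myfunc(b))
        do_something();
}

/* Side effect hoisted out of the condition: myfunc(b) always runs. */
void hoisted_call(bool expressionA, int b)
{
    bool ok = myfunc(b);
    if (expressionA && ok)
        do_something();
}

void check_hoisting(void)
{
    resources_allocated = actions_done = 0;

    guarded_call(false, 1);
    assert(resources_allocated == 0);   /* required work was skipped */

    hoisted_call(false, 1);
    assert(resources_allocated == 1);   /* required work was done    */
    assert(actions_done == 0);          /* condition still false     */
}
```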
Since minimal evaluation is part of an operator's semantic definition and not an (optional) optimization, many coding styles rely on it as a succinct (if idiomatic) conditional construct, such as these Perl idioms:
some_condition or die; # Abort execution if some_condition is false
some_condition and die; # Abort execution if some_condition is true
Code efficiency
If both expressions used as conditions are simple Boolean variables, it can actually be faster to evaluate both operands of the Boolean operation at once, as this always requires a single calculation cycle, as opposed to the one or two cycles used in short-circuit evaluation (depending on the value of the first operand). The difference in computing efficiency between these two cases depends heavily on the compiler and the optimization scheme used; with proper optimization they will execute at the same speed, as they will be compiled to identical machine code.[6]
Short-circuiting can lead to branch mispredictions on modern processors and dramatically reduce performance (a notable example is highly optimized ray/axis-aligned-box intersection code in ray tracing). Some compilers can detect such cases and emit faster code, but programming language semantics may constrain such optimizations.
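When both operands are cheap comparisons without side effects, a programmer can request eager evaluation explicitly. The following C sketch uses the bitwise `&` on the 0/1 results of two comparisons. The function names are hypothetical, and whether the eager form is actually faster depends on the compiler, the target, and the data, so it should be measured rather than assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Short-circuit form: the second comparison is conditional on the first. */
size_t count_in_range_sc(const int *v, size_t n, int lo, int hi)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (v[i] >= lo && v[i] <= hi)
            count++;
    return count;
}

/* Eager form: each comparison yields 0 or 1 and has no side effects,
   so combining them with & gives the same result as && while always
   evaluating both. */
size_t count_in_range_eager(const int *v, size_t n, int lo, int hi)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)((v[i] >= lo) & (v[i] <= hi));
    return count;
}

void check_range_counts(void)
{
    const int data[] = { 1, 5, 9, 12, 7 };
    const size_t n = sizeof data / sizeof data[0];

    /* 5, 9 and 7 lie in the closed range [5, 9]. */
    assert(count_in_range_sc(data, n, 5, 9) == 3);
    assert(count_in_range_eager(data, n, 5, 9) == 3);
}
```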
References
1. Edsger W. Dijkstra, "On a somewhat disappointing correspondence", EWD1009-0, 25 May 1987.
2. ISO/IEC 9899 standard, section 6.5.13.
3. ISO/IEC IS 14882 draft.
4. "and_then - The GNU Pascal Manual". Gnu-pascal.de. Retrieved 2013-08-24.
5. "Referential Transparency, Definiteness and Unfoldability" (PDF). Itu.dk. Retrieved 2013-08-24.
6. "Software optimization resources. C++ and assembly. Windows, Linux, BSD, Mac OS X". Agner.org. Retrieved 2013-08-24.