Evaluation strategy

From Wikipedia, the free encyclopedia

A programming language uses an evaluation strategy to determine when to evaluate the argument(s) of a function call (for function, also read: operation, method, or relation) and what kind of value to pass to the function. For example, call-by-value/pass-by-reference specifies that a function application evaluates the argument before it proceeds to the evaluation of the function's body and that it passes two capabilities to the function, namely, the ability to look up the current value of the argument and the ability to modify it via an assignment statement.[1] The notion of reduction strategy in lambda calculus is similar but distinct.

In practical terms, many modern programming languages have converged on a call-by-value, pass-by-reference strategy for function calls (C#, Java). Some older languages, especially unsafe languages such as C++, combine several notions of parameter passing. Historically, call-by-value and call-by-name date back to ALGOL 60, a language designed in the late 1950s. Only purely functional languages such as Clean and Haskell use call-by-need.

Strict evaluation

In strict evaluation, the arguments to a function are always evaluated completely before the function is applied.

Under Church encoding, eager evaluation of operators maps to strict evaluation of functions; for this reason, strict evaluation is sometimes called "eager". Most existing programming languages use strict evaluation for functions.
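
As a minimal sketch of strict evaluation, the following Python fragment records when each argument expression is evaluated. The helper names noisy and first are hypothetical and exist only for this illustration; the point is that both arguments are evaluated before the body runs, even though the second is never used.

```python
log = []

def noisy(label, value):
    """Record that an argument expression was evaluated, then return its value."""
    log.append(label)
    return value

def first(a, b):
    """Return the first argument; the second is never used in the body."""
    log.append("body")
    return a

result = first(noisy("a", 1), noisy("b", 2))
print(result)  # 1
print(log)     # ['a', 'b', 'body']
```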

Applicative order

Applicative order (or leftmost innermost[2][3]) evaluation refers to an evaluation strategy in which the arguments of a function are evaluated from left to right in a post-order traversal of reducible expressions (redexes). Unlike call-by-value, applicative order evaluation reduces terms within a function body as much as possible before the function is applied.

Call by value

Call-by-value evaluation is the most common evaluation strategy, used in languages as different as C and Scheme. In call-by-value, the argument expression is evaluated, and the resulting value is bound to the corresponding variable in the function (frequently by copying the value into a new memory region). If the function or procedure is able to assign values to its parameters, only its local copy is assigned — that is, anything passed into a function call is unchanged in the caller's scope when the function returns.

Call-by-value is not a single evaluation strategy, but rather the family of evaluation strategies in which a function's argument is evaluated before being passed to the function. While many programming languages (such as Common Lisp, Eiffel and Java) that use call-by-value evaluate function arguments left-to-right, some evaluate functions and their arguments right-to-left, and others (such as Scheme, OCaml and C) leave the order unspecified.

Implicit limitations

In some cases, the term "call-by-value" is problematic, as the value which is passed is not the value of the variable as understood by the ordinary meaning of value, but an implementation-specific reference to the value. The effect is that what syntactically looks like call-by-value may end up rather behaving like call-by-reference or call-by-sharing, often depending on very subtle aspects of the language semantics.

The reason for passing a reference is often that the language technically does not provide a value representation of complicated data, but instead represents them as a data structure while preserving some semblance of value appearance in the source code. Exactly where the boundary is drawn between proper values and data structures masquerading as such is often hard to predict. In C, a vector (of which strings are special cases) is a data structure and thus treated as a reference to a memory area, but a struct is a value even if it has fields that are vectors. In Maple, a vector is a special case of a table and therefore a data structure, but a list (which gets rendered and can be indexed in exactly the same way) is a value. In Tcl, values are "dual-ported" such that the value representation is used at the script level, and the language itself manages the corresponding data structure, if one is required. Modifications made via the data structure are reflected back to the value representation, and vice-versa.

The description "call-by-value where the value is a reference" is common (but should not be understood as being call-by-reference); another term is call-by-sharing. Thus the behaviour of call-by-value Java or Visual Basic and call-by-value C or Pascal are significantly different: in C or Pascal, calling a function with a large structure as an argument will cause the entire structure to be copied (except if it's actually a reference to a structure), potentially causing serious performance degradation, and mutations to the structure are invisible to the caller. However, in Java or Visual Basic only the reference to the structure is copied, which is fast, and mutations to the structure are visible to the caller.
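
The contrast between copying a structure and copying only a reference to it can be sketched in Python, where passing an object always copies just the reference. The explicit copy.copy call below is an assumption used to imitate the structure-copying behaviour described for C or Pascal; blank_first is a hypothetical helper.

```python
import copy

def blank_first(record):
    """Mutate the object that the parameter refers to."""
    record[0] = None

shared = [1, 2, 3]
blank_first(shared)             # only the reference is copied: the caller sees the change

copied = [1, 2, 3]
blank_first(copy.copy(copied))  # an explicit copy stands in for C-style structure copying

print(shared)  # [None, 2, 3]
print(copied)  # [1, 2, 3]
```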

Call by reference

In call-by-reference evaluation (also referred to as pass-by-reference), a function receives an implicit reference to a variable used as argument, rather than a copy of its value. This typically means that the function can modify (i.e. assign to) the variable used as argument, and the change will be seen by its caller. Call-by-reference can therefore be used to provide an additional channel of communication between the called function and the calling function. The same effect can be emulated in languages like C by passing a pointer (not to be confused with call-by-reference), or in languages like Java by passing a holding object whose contents can be set by the callee. A call-by-reference language makes it more difficult for a programmer to track the effects of a function call, and may introduce subtle bugs.

Many languages support call-by-reference in some form or another, but comparatively few, such as Perl, use it as a default. A few languages, such as C++, PHP, Visual Basic .NET, C# and REALbasic, default to call-by-value, but offer special syntax for call-by-reference parameters. C++ additionally offers call-by-reference-to-const. In purely functional languages there is typically no semantic difference between the two strategies (since their data structures are immutable, there is no possibility for a function to modify any of its arguments), so they are typically described as call-by-value even though implementations frequently use call-by-reference internally for the efficiency benefits.

Even among languages that don't exactly support call-by-reference, many, including C and ML, support explicit references (objects that refer to other objects), such as pointers (objects representing the memory addresses of other objects), and these can be used to effect or simulate call-by-reference (but with the complication that a function's caller must explicitly generate the reference to supply as an argument).

Example that demonstrates call-by-reference in E:

 def modify(var p, &q) {
     p := 27 # passed by value - only the local parameter is modified
     q := 27 # passed by reference - variable used in call is modified
 }
 
 ? var a := 1
 # value: 1
 ? var b := 2
 # value: 2
 ? modify(a,&b)
 ? a
 # value: 1
 ? b
 # value: 27

Example that simulates call-by-reference in C:

void Modify(int p, int * q, int * o)
{
    p = 27; // passed by value - only the local parameter is modified
    *q = 27; // passed by value or reference, check call site to determine which
    *o = 27; // passed by value or reference, check call site to determine which
}
int main()
{
    int a = 1;
    int b = 1;
    int x = 1;
    int * c = &x;
    Modify(a, &b, c);   // a is passed by value, b is passed by reference by creating a pointer,
                        // c is a pointer passed by value
    // b and x are changed
    return(0);
}
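
The holding-object emulation mentioned above for languages like Java can be sketched in Python as well. The Ref class is a hypothetical one-slot holder introduced only for this example and is not a standard library type.

```python
class Ref:
    """Hypothetical one-slot holder standing in for a by-reference variable."""
    def __init__(self, value):
        self.value = value

def modify(p, q):
    p = 27        # rebinds the local name only
    q.value = 27  # mutates the holder that the caller also refers to

a = 1
b = Ref(2)
modify(a, b)
print(a, b.value)  # 1 27
```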

Call by sharing

Call by sharing, also known as "call by object" or "call by object-sharing", is an evaluation strategy first named by Barbara Liskov et al. for the language CLU in 1974.[4] It is used by languages such as Python,[5] Iota, Java (for object references),[6] Ruby, Scheme, OCaml, AppleScript, and many other languages. However, the term "call by sharing" is not in common use; the terminology is inconsistent across different sources. For example, the Java community describes Java as pass-by-value, whereas the Ruby community describes Ruby as pass-by-reference[citation needed], even though the two languages exhibit the same semantics. Call by sharing implies that values in the language are based on objects rather than primitive types.

The semantics of call by sharing differ from call by reference in that assignments to function arguments within the function aren't visible to the caller (unlike by reference semantics) [citation needed], so e.g. if a variable was passed, it is not possible to simulate an assignment on that variable in the caller's scope. However since the function has access to the same object as the caller (no copy is made), mutations to those objects, if the objects are mutable, within the function are visible to the caller, which may appear to differ from call by value semantics. For immutable objects, there is no real difference between call by sharing and call by value, except for the object identity. The use of call by sharing with mutable objects is an alternative to input/output parameters:[7] the parameter is not assigned to (the argument is not overwritten and object identity is not changed), but the object (argument) is mutated.

For example in Python, lists are mutable, so:

def f(l):
    l.append(1)
m = []
f(m)
print(m)

...yields [1] because the argument l is mutated.

An important subtlety is the distinction between mutation and assignment. In Python the code:

def f(l):
    l += [1]
m = []
f(m)
print(m)

...yields [1] because the statement l += [1] acts as l.extend([1]), but the similar code:

def f(l):
    l = l + [1]
m = []
f(m)
print(m)

...yields [] because the statement l = l + [1] builds a new list and rebinds the local name l to it, rather than mutating the argument.[note 1]

Although this term has widespread usage in the Python community, identical semantics in other languages such as Java and Visual Basic are often described as call by value, where the value is implied to be a reference to the object.
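
Two further properties described above, that no copy is made and that immutable objects behave as under call by value, can be checked with a short sketch. The helper names same_object and bump are hypothetical.

```python
def same_object(param, caller_id):
    """Report whether the parameter is the very object the caller holds."""
    return id(param) == caller_id

def bump(n):
    n += 1  # integers are immutable, so this rebinds the local name only
    return n

m = []
is_shared = same_object(m, id(m))  # True: no copy was made

count = 10
bumped = bump(count)
print(is_shared, count, bumped)  # True 10 11
```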

Call by copy-restore

Call-by-copy-restore, copy-in copy-out, call-by-value-result or call-by-value-return (as termed in the Fortran community) is a special case of call-by-reference where the provided reference is unique to the caller. This variant has gained attention in multiprocessing contexts and in remote procedure calls: if a parameter to a function call is a reference that might be accessible by another thread of execution, its contents may be copied to a new reference that is not; when the function call returns, the updated contents of this new reference are copied back to the original reference ("restored").

The semantics of call-by-copy-restore also differ from those of call-by-reference where two or more function arguments alias one another; that is, point to the same variable in the caller's environment. Under call-by-reference, writing to one will affect the other; call-by-copy-restore avoids this by giving the function distinct copies, but the result in the caller's environment then depends on the order in which the aliased arguments are copied back. It is undefined unless the language specifies that order, for example that copies are made left to right both on entry and on return.

When the reference is passed to the callee uninitialized, this evaluation strategy may be called call-by-result.
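
The aliasing behaviour described above can be simulated in Python. This is only a sketch under stated assumptions: one-element lists stand in for caller variables, and the functions update, call_by_reference and call_by_copy_restore, as well as the left_to_right flag, are hypothetical names invented for the illustration.

```python
def update(x, y):
    """Callee: one-element lists stand in for by-reference variables."""
    x[0] += 1
    y[0] += 10

def call_by_reference(func, var_a, var_b):
    func(var_a, var_b)

def call_by_copy_restore(func, var_a, var_b, left_to_right=True):
    tmp_a, tmp_b = [var_a[0]], [var_b[0]]  # copy in
    func(tmp_a, tmp_b)
    pairs = [(var_a, tmp_a), (var_b, tmp_b)]
    if not left_to_right:
        pairs.reverse()
    for target, temp in pairs:             # copy out ("restore")
        target[0] = temp[0]

v = [0]
call_by_reference(update, v, v)            # both writes hit the same variable
w = [0]
call_by_copy_restore(update, w, w)         # last copy back (y) wins
u = [0]
call_by_copy_restore(update, u, u, left_to_right=False)  # last copy back (x) wins

print(v[0], w[0], u[0])  # 11 10 1
```

The three different results for the same aliased call show why the copy-back order has to be specified.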

Partial evaluation

In partial evaluation, evaluation may continue into the body of a function that has not been applied. Any sub-expressions that do not contain unbound variables are evaluated, and function applications whose argument values are known may be reduced. In the presence of side-effects, complete partial evaluation may produce unintended results; for this reason, systems that support partial evaluation tend to do so only for "pure" expressions (expressions without side-effects) within functions.
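
A toy partial evaluator for pure arithmetic expressions gives a rough idea of the technique. The expression encoding (integers, variable names as strings, and (operator, left, right) tuples) and the function partial_eval are assumptions made for this sketch, not a description of any particular system.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def partial_eval(expr, env):
    """Simplify expr using the bindings in env; unknown variables stay symbolic."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env.get(expr, expr)
    op, left, right = expr
    left = partial_eval(left, env)
    right = partial_eval(right, env)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)  # fully known: reduce now
    return (op, left, right)         # otherwise keep the residual expression

# Body of a hypothetical function f(x, y) = 2*3 + x*y
body = ("+", ("*", 2, 3), ("*", "x", "y"))

print(partial_eval(body, {}))                # ('+', 6, ('*', 'x', 'y'))
print(partial_eval(body, {"x": 4}))          # ('+', 6, ('*', 4, 'y'))
print(partial_eval(body, {"x": 4, "y": 5}))  # 26
```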

Non-strict evaluation

In non-strict evaluation, arguments to a function are not evaluated unless they are actually used in the evaluation of the function body.

Under Church encoding, lazy evaluation of operators maps to non-strict evaluation of functions; for this reason, non-strict evaluation is often referred to as "lazy". Boolean expressions in many languages use a form of non-strict evaluation called short-circuit evaluation, where evaluation returns as soon as it can be determined that an unambiguous Boolean will result — for example, in a disjunctive expression where true is encountered, or in a conjunctive expression where false is encountered, and so forth. Conditional expressions also usually use lazy evaluation, where evaluation returns as soon as an unambiguous branch will result.
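
Short-circuit evaluation and conditional expressions can be observed directly in Python. The check helper below is hypothetical and only records which operands were actually evaluated.

```python
calls = []

def check(name, result):
    """Record that an operand was evaluated, then return its truth value."""
    calls.append(name)
    return result

either = check("a", True) or check("b", True)
calls_for_or = list(calls)   # ['a']: the second operand was skipped

calls.clear()
both = check("c", False) and check("d", True)
calls_for_and = list(calls)  # ['c']: the second operand was skipped

x = 0
guarded = x != 0 and 10 / x > 1       # False; the division is never attempted
label = "zero" if x == 0 else 10 / x  # only the selected branch is evaluated
print(calls_for_or, calls_for_and, guarded, label)
```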

Normal order

Normal-order (or leftmost outermost) evaluation is the evaluation strategy where the outermost redex is always reduced, applying functions before evaluating function arguments.

In contrast, a call-by-name strategy does not evaluate inside the body of an unapplied function.

Call by name

In call-by-name evaluation, the arguments to a function are not evaluated before the function is called — rather, they are substituted directly into the function body (using capture-avoiding substitution) and then left to be evaluated whenever they appear in the function. If an argument is not used in the function body, the argument is never evaluated; if it is used several times, it is re-evaluated each time it appears. (See Jensen's Device.)

Call-by-name evaluation is occasionally preferable to call-by-value evaluation. If a function's argument is not used in the function, call-by-name will save time by not evaluating the argument, whereas call-by-value will evaluate it regardless. If the argument is a non-terminating computation, the advantage is enormous. However, when the function argument is used, call-by-name is often slower, requiring a mechanism such as a thunk.

An early use was ALGOL 60. .NET languages can simulate call-by-name using delegates or Expression<T> parameters. The latter results in an abstract syntax tree being given to the function. Eiffel provides agents, which represent an operation to be evaluated when needed. Seed7 provides call-by-name with function parameters.
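
In a strict language, call-by-name can be approximated by passing a thunk (a zero-argument function) and calling it at each use. The following Python sketch assumes that encoding; argument, double and ignore are hypothetical names.

```python
evaluations = []

def argument():
    """The argument expression, wrapped as a thunk; each call re-evaluates it."""
    evaluations.append("evaluated")
    return 21

def double(x):
    return x() + x()  # the argument is used twice, so it is evaluated twice

def ignore(x):
    return 0          # the argument is never used, so it is never evaluated

unused = ignore(argument)
after_ignore = len(evaluations)     # 0
doubled = double(argument)
after_double = len(evaluations)     # 2
diverging = ignore(lambda: 1 // 0)  # 0: the failing expression is never run

print(after_ignore, doubled, after_double, diverging)  # 0 42 2 0
```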

Call by need

Call-by-need is a memoized version of call-by-name where, if the function argument is evaluated, that value is stored for subsequent uses. In a "pure" (effect-free) setting, this produces the same results as call-by-name; when the function argument is used two or more times, call-by-need is almost always faster.

Because evaluation of expressions may happen arbitrarily far into a computation, languages using call-by-need generally do not support computational effects (such as mutation) except through mechanisms such as monads and uniqueness types. This eliminates any unexpected behavior from variables whose values change prior to their delayed evaluation.

Lazy evaluation is the most commonly used implementation strategy for call-by-need semantics, but variations exist — for instance optimistic evaluation.

Haskell is the best-known language that uses call-by-need evaluation. R also uses a form of call-by-need. .NET languages can simulate call-by-need using the type Lazy<T>.
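
Call-by-need can be sketched in a strict language as a thunk that caches its result after the first use. The Delayed class below is a hypothetical helper written for this example; it is not the .NET Lazy<T> type or any standard Python class.

```python
class Delayed:
    """Hypothetical memoizing thunk: computes its value at most once."""
    _UNSET = object()

    def __init__(self, compute):
        self._compute = compute
        self._value = Delayed._UNSET

    def force(self):
        if self._value is Delayed._UNSET:
            self._value = self._compute()
            self._compute = None  # the computation is no longer needed
        return self._value

evaluations = []

def expensive():
    evaluations.append("evaluated")
    return 21

def double(x):
    return x.force() + x.force()  # used twice, evaluated once

doubled = double(Delayed(expensive))
never_forced = Delayed(lambda: 1 // 0)  # unused, so the error never occurs

print(doubled, len(evaluations))  # 42 1
```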

Call by macro expansion

Call-by-macro-expansion is similar to call-by-name, but uses textual substitution rather than capture-avoiding substitution. If used carelessly, macro substitution may result in variable capture and lead to undesired behavior. Hygienic macros avoid this problem by checking for and replacing shadowed variables that are not parameters.
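
Variable capture under textual substitution can be imitated in Python by pasting argument text into a source template and executing the result. This is a contrived sketch: MACRO_BODY, expand and run are hypothetical, and exec is used only to mimic a macro expander.

```python
# A "macro" whose body introduces its own temporary variable named tmp.
MACRO_BODY = "tmp = 10\nresult = tmp + ({arg})"

def expand(arg_text):
    """Textually substitute the argument expression into the macro body."""
    return MACRO_BODY.format(arg=arg_text)

def run(source, variables):
    namespace = dict(variables)  # the caller's variables at the expansion site
    exec(source, namespace)
    return namespace["result"]

caller_vars = {"n": 1, "tmp": 1}
fine = run(expand("n"), caller_vars)        # 10 + 1
captured = run(expand("tmp"), caller_vars)  # the caller's tmp is shadowed: 10 + 10

print(fine, captured)  # 11 20
```

The second call was presumably meant to add the caller's tmp (1), but the macro's own tmp captured the name.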

Nondeterministic strategies

Full β-reduction

Under full β-reduction, any function application may be reduced (substituting the function's argument into the function using capture-avoiding substitution) at any time. This may be done even within the body of an unapplied function.

Call by future

Call-by-future (or parallel call-by-name) is a concurrent evaluation strategy: the value of a future expression is computed concurrently with the flow of the rest of the program. When the value of the future is needed, the main program blocks until the future finishes computing, if it has not already completed by then.

This strategy is non-deterministic, as the evaluation can occur at any time between when the future is created (when the expression is given) and when the value of the future is used. It is similar to call-by-need in that the value is only computed once, and computation may be deferred until the value is needed, but it may be started before. Further, if the value of a future is not needed, such as if it is a local variable in a function that returns, the computation may be terminated part-way through.

If implemented with processes or threads, creating a future will spawn a new process or thread, accessing the value will synchronize this with the main thread, and terminating the computation of the future corresponds to killing the thread computing its value.
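
A thread-based future can be sketched with Python's standard concurrent.futures module. The function slow_square is a hypothetical stand-in for an expensive argument expression. Note that this module cannot kill a computation that has already started running, so the early-termination behaviour described above is not modelled here.

```python
from concurrent.futures import ThreadPoolExecutor

def slow_square(n):
    """Stand-in for an expensive argument expression."""
    return n * n

with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(slow_square, 7)  # starts computing concurrently
    other = 1 + 2                         # the main flow continues meanwhile
    total = other + future.result()       # blocks until the value is ready

print(total)  # 52
```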

Optimistic evaluation

Optimistic evaluation is another variant of call-by-need in which the function's argument is partially evaluated for some amount of time (which may be adjusted at runtime), after which evaluation is aborted and the function is applied using call-by-need. This approach avoids some of the runtime expense of call-by-need, while still retaining the desired termination characteristics.

Notes

  1. As a matter of Python syntax, note that l += x is not equivalent to l = l + x – semantically it is a mutator, not an assignment. Further, l += x is not literally equivalent to l.extend(x) either, due to scoping issues: l += x requires that l be in the local scope, while l.extend(x) looks to enclosing scopes too.

References

  1. Essentials of Programming Languages by Daniel P. Friedman and Mitchell Wand, MIT Press 1989–2006
  2. "Lambda Calculus". Cs.uiowa.edu. Retrieved 2013-08-18. 
  3. "applicative order reduction definition of applicative order reduction in the Free Online Encyclopedia". Encyclopedia2.thefreedictionary.com. Retrieved 2013-08-18. 
  4. Liskov, Barbara; Atkinson, Russ; Bloom, Toby; Moss, Eliot; Schaffert, Craig; Scheifler, Craig; Snyder, Alan (October 1979). "CLU Reference Manual" (PDF). Laboratory for Computer Science. Massachusetts Institute of Technology. Retrieved 2011-05-19. 
  5. Lundh, Fredrik. "Call By Object". effbot.org. Retrieved 2011-05-19. 
  6. "Iota Language Definition". CS 412/413 Introduction to Compilers. Cornell University. 2001. Retrieved 2011-05-19. 
  7. CA1021: Avoid out parameters

This article is issued from Wikipedia. The text is available under the Creative Commons Attribution/Share Alike; additional terms may apply for the media files.