Scope (programming)

From Wikipedia, the free encyclopedia

In computer programming, a scope is an enclosing context in which values and expressions are associated. Programming languages offer various types of scope. The type of scope determines what kinds of entities it can contain and how it affects them, that is, its semantics.

A namespace is a scope that uses the enclosing nature of the scope to group logically related identifiers under a single identifier. Thus, scopes can affect the name resolution for their contents.

Variables are associated with scopes. Different scoping types affect how local variables are bound, and the consequences differ depending on whether the language has static (lexical) or dynamic scoping.

Programmers often indent scopes in their source code text to improve readability.

History

Static scoping (also known as lexical scoping) was a feature of Algol 60 and has been picked up in most languages since then. The original Lisp used dynamic scoping; Lisp 1.5 could carry a lexical environment along with a functional argument through the FUNARG device developed by Steve Russell, working under John McCarthy. Descendants of dynamically scoped languages often adopt static scoping: Emacs Lisp traditionally uses dynamic scoping, Common Lisp has both dynamic and static scoping, and Scheme uses static scoping exclusively. In other cases, languages that already had dynamic scoping have added static scoping afterwards, such as Perl. C and Pascal have always had static scoping, since both were influenced by the ideas that went into Algol.

Example

The following example shows various scopes declared in the language C#:

namespace N
{                        // namespace scope, merely groups identifiers
   class C
   {                     // class scope, defines/declares member variables and functions
      void f (bool b)
      {                  // function scope, contains executable statements
         if (b)
         {               // unnamed scope for conditionally executed statements
           ...
         }
      }
   }
}

Static versus dynamic scoping

One of the basic reasons for scoping is to keep variables in different parts of the program distinct from one another. Since there are only a small number of short variable names, and programmers share habits about the naming of variables (e.g., i for an array index), in any program of moderate size the same variable name will be used in multiple different scopes. The question of how to match various variable occurrences to the appropriate binding sites is generally answered in one of two ways: static scoping and dynamic scoping.

Static scoping (also known as lexical scoping)

With static scope, a variable always refers to its nearest enclosing binding. This is a property of the program text and unrelated to the runtime call stack. Because matching a variable to its binding only requires analysis of the program text, this type of scoping is sometimes also called lexical scoping. Static scope is standard in modern functional languages such as ML and Haskell because it allows the programmer to reason as if variable bindings are carried out by substitution. Static scoping also makes it much easier to make modular code and reason about it, since its binding structure can be understood in isolation. In contrast, dynamic scope forces the programmer to anticipate all possible dynamic contexts in which the module's code may be invoked.

For example, consider the following program fragment (in Pascal):

program A;
var I: Integer;
    K: Char;
    R: Real;

    procedure B;
    var K: Real;     { hides the outer K inside B and C }
        L: Integer;

        procedure C;
        var M: Real;

        begin
        { (1) }
        end;

    begin
    { (2) }
    end;

begin
{ (3) }
end.

In the above code, the variable I is accessible as an Integer at points (1), (2) and (3) because its scope is global and it is not hidden by another variable of the same name. The variable K is accessible as a Real at points (1) and (2) and as a Char at (3). Because of the scope of K, the variable called K in C (at point (1)) and in B (at point (2)) is not the same variable as the K of the main program at point (3). Variable L is accessible only in procedure C at point (1) and procedure B at point (2); it is not accessible from the main program. Variable M is accessible only in procedure C at point (1); it is not accessible from procedure B or from the main program. Likewise, procedure C can only be called from within procedure B; it cannot be called from the main program. There could also be another procedure named C declared later in the program. Which of the two a call to C refers to would then depend on where in the program text the call appears, just as the variable that K refers to depends on where it is used in the example above.

Correct implementation of static scope in languages with first-class nested functions can be subtle, as it requires each function value to carry with it a record of the values of the variables that it depends on (the pair of the function and this environment is called a closure). When first-class nested functions are not used or not available (such as in C), this overhead is not incurred. Variable lookup is always very efficient with static scope, as the location of each value is known at compile time.
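As an illustration of a function value carrying its environment, the following is a minimal sketch in Python, a statically scoped language with first-class nested functions. The names make_counter and increment are hypothetical and chosen only for this example.

```python
def make_counter(start):
    # 'count' is a local variable of make_counter. The nested function
    # below still refers to it after make_counter has returned.
    count = start

    def increment():
        nonlocal count  # rebind the enclosing variable, not a new local
        count += 1
        return count

    return increment


counter_a = make_counter(0)
counter_b = make_counter(10)

first = counter_a()    # uses counter_a's own 'count'
second = counter_a()
other = counter_b()    # counter_b has a separate environment

# The function value carries its environment along as closure cells.
captured = [cell.cell_contents for cell in counter_a.__closure__]
print(first, second, other, captured)
```

Each call to make_counter creates a fresh environment, so the two counters do not interfere with each other. This per-function record is the overhead that a language without first-class nested functions avoids.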

Dynamic scoping

With dynamic scope, each identifier has a global stack of bindings. Introducing a local variable with name x pushes a binding onto the global x stack (which may have been empty); the binding is popped off when the control flow leaves the scope. Evaluating x in any context always yields the top binding. In other words, an identifier refers to the binding introduced by the most recently entered environment that is still active. This cannot be resolved at compile time because the binding stack only exists at runtime, which is why this type of scoping is called dynamic scoping.
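The binding stack described above can be sketched in Python as follows. This is an emulation under stated assumptions, not how any particular language implements it: the helpers bind and lookup and the table _bindings are hypothetical names introduced for the example.

```python
from collections import defaultdict
from contextlib import contextmanager

# One stack of bindings per identifier name (hypothetical helper table).
_bindings = defaultdict(list)


@contextmanager
def bind(name, value):
    _bindings[name].append(value)    # entering the scope pushes a binding
    try:
        yield
    finally:
        _bindings[name].pop()        # leaving the scope pops it again


def lookup(name):
    stack = _bindings[name]
    if not stack:
        raise NameError(f"unbound variable: {name}")
    return stack[-1]                 # evaluation always yields the top binding


def show_x():
    # show_x has no binding of its own; it sees whatever its caller bound.
    return lookup("x")


results = []
with bind("x", 0):
    results.append(show_x())
    with bind("x", 1):
        results.append(show_x())
    results.append(show_x())
print(results)
```

The same function show_x yields different values depending on which bindings are active when it is called, and the earlier binding becomes visible again once the inner scope is left.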

Generally, certain blocks are defined to create bindings whose lifetime is the execution time of the block; this adds some features of static scoping to the dynamic scoping process. However, since a section of code can be called from many different locations and situations, it can be difficult to determine at the outset what bindings will apply when a variable is used (or whether one exists at all). This can be beneficial: application of the principle of least knowledge suggests that code avoid depending on the reasons for (or circumstances of) a variable's value, and simply use the value according to the variable's definition. This narrow interpretation of shared data can provide a very flexible system for adapting the behavior of a function to the current state (or policy) of the system. However, this benefit relies on careful documentation of all variables used this way and on careful avoidance of assumptions about a variable's behavior, and it does not provide any mechanism to detect interference between different parts of a program. As such, dynamic scoping can be dangerous and few modern languages use it. Some languages, like Perl and Common Lisp, allow the programmer to choose static or dynamic scoping when (re)defining a variable. Logo and Emacs Lisp are some of the few languages that use dynamic scoping.

Dynamic scoping is relatively easy to implement. To find an identifier's value, the program traverses the runtime stack, checking each activation record (each function's stack frame) for a value for the identifier. This is known as deep binding. An alternative strategy that is usually more efficient is to maintain a stack of bindings for each identifier; the stack is modified whenever the variable is bound or unbound, and a variable's value is simply that of the top binding on the stack. This is called shallow binding. Both strategies assume a last-in-first-out (LIFO) ordering of the bindings for any one variable; in practice all bindings are so ordered.
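The deep binding strategy can be sketched in Python by modelling the runtime stack explicitly. This is an illustration only; call_stack, call and lookup_deep are hypothetical names, and each activation record is modelled as a plain dictionary.

```python
# The runtime stack: one dictionary of local bindings per activation record.
call_stack = []


def lookup_deep(name):
    # Walk the activation records from the most recent to the oldest.
    for frame in reversed(call_stack):
        if name in frame:
            return frame[name]
    raise NameError(f"unbound variable: {name}")


def call(function, local_bindings):
    call_stack.append(dict(local_bindings))   # push an activation record
    try:
        return function()
    finally:
        call_stack.pop()                       # pop it on return


def f():
    return lookup_deep("x")


def g():
    return call(f, {})    # f's own record has no binding for x


call_stack.append({"x": 0})            # record holding the global binding
from_global = call(f, {})              # finds x in the global record
from_g = call(g, {"x": 1})             # finds x in g's record first
print(from_global, from_g)
```

Lookup cost here grows with the depth of the stack. Shallow binding trades that for extra work at bind and unbind time, replacing the traversal with a single read of the top of a per-identifier stack.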

Example

This example compares the consequences of using static scope and dynamic scope. Observe the following code, in a C-like language:

int x = 0;
int f () { return x; }
int g () { int x = 1; return f(); }

With static scoping, calling g returns 0: it is determined at compile time that the expression x in any invocation of f yields the global x binding, which is unaffected by the introduction of a local variable of the same name in g.

With dynamic scoping, the binding stack for the x identifier will contain two items when f is invoked from g: the global binding to 0, and the binding to 1 introduced in g (which is still present on the stack since the control flow hasn't left g yet). Since evaluating the identifier expression by definition always yields the top binding, the result is 1.
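Both outcomes can be reproduced in Python as a rough sketch. Python itself is statically scoped, so the first half is a direct translation of the code above. The second half only approximates dynamic scoping, using the standard contextvars module (which is designed for context-local state); the names dyn_x, f_dynamic and g_dynamic are hypothetical.

```python
import contextvars

# Static (lexical) scoping: the free variable x in f is resolved against
# the binding visible where f is defined, whatever the caller has bound.
x = 0


def f():
    return x


def g():
    x = 1  # a separate local variable; f never sees it
    return f()


static_result = g()

# Approximating dynamic scoping with a context variable.
dyn_x = contextvars.ContextVar("dyn_x", default=0)


def f_dynamic():
    return dyn_x.get()


def g_dynamic():
    token = dyn_x.set(1)       # bind for the duration of this call
    try:
        return f_dynamic()
    finally:
        dyn_x.reset(token)     # restore the previous binding on exit


dynamic_result = g_dynamic()
after_call = f_dynamic()
print(static_result, dynamic_result, after_call)
```

The statically scoped g yields 0 and the dynamically scoped variant yields 1, matching the two analyses above. Once g_dynamic has returned, the earlier binding is visible again.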

In the language Perl, variables can be defined with either static or dynamic scoping. Perl's keyword "my" defines a statically scoped local variable, while the keyword "local" defines a dynamically scoped local variable.[1] This allows the two scoping models to be compared with a practical example of each.

$x = 0;
sub f { return $x; }
sub g { my $x = 1; return f(); }
print g()."\n";

The example above uses "my" for static scoping of g's local variable $x. As above, calling g returns 0 because f cannot see g's variable $x, so it looks for the global $x.

$x = 0;
sub f { return $x; }
sub g { local $x = 1; return f(); }
print g()."\n";

In this alternative, "local" is used to make g's $x dynamically scoped. Now, calling g yields 1 because f sees g's local variable by looking up the execution stack.

In other words, the dynamically scoped variable $x is resolved in the environment of execution, rather than the environment of definition.

References

  1. Perl FAQ 4.3, "What's the difference between dynamic and static (lexical) scoping?"