Nested function

In computer programming, a nested function (or nested procedure or subroutine) is a function that is defined within another function, the enclosing function. Due to scope rules, the nested function can access local variables of the enclosing function, and in turn the nested function is itself invisible outside of the enclosing function. Nesting is theoretically possible to any level of depth, although only a few levels are normally used in practical programs.

Nested functions are used in many approaches to structured programming, including early ones such as ALGOL, Simula and Pascal, and also in many modern dynamic languages and functional languages. However, they are typically not supported in the C family of languages. This is largely due to the difficulty of implementing closures, which nested functions give rise to in many circumstances.

Effects

Nested functions have a number of effects, the most basic of which follow from scope; this assumes function scope or block scope. First, and less importantly, the scope of the nested function is the inside of the enclosing function, so it is invisible outside of it. More importantly, the body of the nested function is in the scope of the local variables of the enclosing function (if they have function scope), or of the local variables in the top block of the function (if they have block scope), and likewise of other local functions, constants, types, classes, etc.^[a] The nested function can therefore read and write these entities without having them passed explicitly: it can access the enclosing environment, which greatly simplifies passing data into and out of the nested function.
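As an illustration of reading and writing the enclosing environment, the following is a minimal sketch in Python. The names summarize and record are hypothetical and chosen only for this example. Python requires the nonlocal statement before a nested function may rebind a variable of the enclosing function; other languages differ on this point.

```python
def summarize(values):
    """Return (count, total) of the positive entries in values."""
    count = 0
    total = 0

    def record(v):
        # nonlocal lets the nested function rebind the enclosing locals
        # instead of creating new variables local to record().
        nonlocal count, total
        count += 1
        total += v

    for v in values:
        if v > 0:
            record(v)  # no state is passed explicitly except the new value
    return count, total
```

Here record is invisible outside summarize, and count and total are shared without being passed as arguments or stored globally.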

Further, nested functions may allow closures to be created. If it is possible for the nested function (or a reference to it) to escape the enclosing function – for example if functions are first-class objects and a nested function is passed (downwards) to another function or returned (upwards) from the enclosing function – then a closure is created, and calls to this function can access the environment of the original function. Notably, the frame of the immediately enclosing function must stay alive until the last referencing closure dies. This significantly complicates implementation and code analysis, since non-local automatic variables referenced in closures cannot be stack allocated, and it is a key reason nested functions are not implemented in some languages; this is known as the funarg problem. The problem becomes significantly more complicated if functions are nested to several levels and share different parts of their environment.
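The upward case can be sketched in Python as follows. The function make_counter and its inner function increment are hypothetical names used only for this illustration; the point is that the variable count must outlive the call that created it.

```python
def make_counter(start=0):
    """Return a function that counts upward from start."""
    count = start  # local to make_counter, but captured by increment

    def increment(step=1):
        nonlocal count
        count += step
        return count

    # Returning the nested function lets it escape upward; the variable
    # count has to stay alive after make_counter itself has returned.
    return increment
```

Each call to make_counter produces an independent closure with its own count, which is why such variables cannot simply live in a stack frame that is discarded on return.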

Examples

An example using Pascal syntax (ALGOL, Modula-2, Oberon, Ada, etc. are similar):

function E(x: real): real;
    function F(y: real): real;
    begin
        F := x + y
    end;
begin
    E := F(3) + F(4)
end;

The function F is nested within E. Note that E's parameter x is also visible in F (as F is a part of E), while x is invisible outside E and y is invisible outside F.

Similarly, in Standard ML:

fun e (x : real) =
  let
    fun f y = x+y
  in
    f 3.0 + f 4.0
  end;

One way to write the same example in Haskell syntax:

e :: Float -> Float
e x = f 3 + f 4 where f y = x + y

The same example in GNU C syntax (C extended with nested functions):

float E(float x)
{
    float F(float y)
    {
        return x + y;
    }
    return F(3) + F(4);
}

Quicksort

A more realistic example is this implementation of quicksort, again using GNU C nested functions:^[1]

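void sort(int *a, int size) {
    /* Sorts a[first..last] in place; a comes from the enclosing sort(). */
    void quickSort(int first, int last) {
        /* Exchanges a[p] and a[q]. */
        void swap(int p, int q) {
            int tmp = a[p];
            a[p] = a[q];
            a[q] = tmp;
        }

        /* Partitions a[first..last] around a[first]; returns the pivot's final index. */
        int partition() {
            int pivot = a[first], index = first;
            swap(index, last);  /* move the pivot out of the way */
            for (int i = first; i < last; i++) if (a[i] < pivot) swap(index++, i);
            swap(index, last);  /* put the pivot in its final position */
            return index;
        }

        if (first < last) {
            int pivotIndex = partition();
            quickSort(first, pivotIndex - 1);
            quickSort(pivotIndex + 1, last);
        }
    }
    quickSort(0, size - 1);
}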

Purpose

Lexically nested function definitions are a form of information hiding and are useful for dividing procedural tasks into subtasks which are only meaningful locally. This avoids cluttering other parts of the program with functions and variables that are unrelated to those parts.

They are typically used as helper functions or as recursive functions inside another function (as in the quicksort example above). This has the structural benefit of organizing the code, avoids polluting the scope, and also allows functions to share state easily.^[2] As a nested function can access local variables of the enclosing function, state can be shared without passing parameters to the nested function or using a global variable, which simplifies the code.

In languages with nested functions, functions may normally also contain local constants and types (in addition to local variables, parameters, and functions), encapsulated and hidden in the same nested manner, at any level of depth. This may further enhance the possibilities for structuring code.

Other uses

Nested functions can also be used for unstructured control flow, by using the return statement of the nested function to leave a region of code early. This gives finer-grained control than is possible with other built-in features of the language – for example, it can allow early termination of a for loop if break is not available, or early termination of a nested for loop if a multi-level break or exceptions are not available.
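The nested-loop case can be sketched in Python, which has break but no multi-level break. The names describe_first_negative and locate are hypothetical; the return inside locate leaves both loops at once, and the enclosing function then carries on with the result.

```python
def describe_first_negative(grid):
    """Describe the first negative entry of a list of rows, scanning row by row."""

    def locate():
        for r, row in enumerate(grid):
            for c, value in enumerate(row):
                if value < 0:
                    return (r, c)  # exits the inner and outer loop together
        return None

    position = locate()
    if position is None:
        return "no negative entries"
    r, c = position
    return f"first negative entry {grid[r][c]} at row {r}, column {c}"
```

Because locate reads grid from the enclosing function, nothing needs to be passed in to obtain this control-flow effect.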

Alternatives

The main alternative to nested functions in languages that lack support for them is to place all relevant functions and variables in a separate module (file) and expose only the top-level wrapper function publicly. In C this will generally be done by using static functions for encapsulation and static variables for communication.^[3] This achieves encapsulation and sharing of state, though not the logical organization given by lexical nesting of functions, and comes at the cost of having a separate file. It also cannot be extended beyond a single level of nesting.

Another alternative is to share state between the functions through function parameters, most often passing references as arguments to avoid the cost of copying. In C this is generally implemented by a pointer to a structure containing the context.^[3] This significantly increases the complexity of the function calls.^[2]
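The context-passing alternative can be sketched by rewriting the quicksort example above without nesting. The sketch below is in Python rather than C, and the SortContext record, its swaps field and the underscore-prefixed helper names are assumptions made for this illustration; the record plays the role of the pointer-to-structure described above.

```python
from dataclasses import dataclass


@dataclass
class SortContext:
    """Hypothetical context record holding the state the helpers share."""
    data: list
    swaps: int = 0  # extra shared state, written through the context


def _swap(ctx, p, q):
    ctx.data[p], ctx.data[q] = ctx.data[q], ctx.data[p]
    ctx.swaps += 1


def _partition(ctx, first, last):
    pivot = ctx.data[first]
    index = first
    _swap(ctx, index, last)  # move the pivot out of the way
    for i in range(first, last):
        if ctx.data[i] < pivot:
            _swap(ctx, index, i)
            index += 1
    _swap(ctx, index, last)  # put the pivot in its final position
    return index


def _quick_sort(ctx, first, last):
    if first < last:
        pivot_index = _partition(ctx, first, last)
        _quick_sort(ctx, first, pivot_index - 1)
        _quick_sort(ctx, pivot_index + 1, last)


def sort(data):
    """Sort data in place and return the number of swaps performed."""
    ctx = SortContext(data)
    _quick_sort(ctx, 0, len(data) - 1)
    return ctx.swaps
```

Every helper now takes the context as an extra argument, and first and last must be passed to the partition step explicitly, which is the added call complexity referred to above.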

In PHP and other languages, the anonymous function is the only alternative: the nested function is declared not as a usual function but by reference, as a local variable. To use local variables of the enclosing function in the anonymous function, a closure is used.

Languages

Well-known languages supporting lexically nested functions include:

ALGOL-based languages such as ALGOL 68, Simula, Pascal, Modula-2, Modula-3, Oberon, Seed7 and Ada.
Modern versions of Lisp (with lexical scope) such as Scheme and Common Lisp.
ECMAScript (JavaScript and ActionScript).
Scala (full support)
Various degrees of support in scripting languages such as Ruby, Python, PHP and Perl.
Standard C and C++ do not support nested functions, but:
- GCC supports nested functions in C, as a language extension.^[4]
- The D language, which is C-related, has nested functions.
Fortran, starting with Fortran 90, supports one level of nested (CONTAINed) subroutines and functions.
MATLAB (full support)
Wolfram Language

Functional languages

In most functional programming languages, such as Scheme, nested functions are a common way of implementing algorithms with loops in them. A simple (tail) recursive inner function is created, which behaves as the algorithm's main loop, while the outer function performs startup actions that only need to be done once. In more complex cases, a number of mutually recursive functions may be created as inner functions.
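The pattern can be sketched as follows. The sketch uses Python only to keep the notation familiar; the names sum_of_squares and loop are hypothetical. Unlike Scheme, Python does not eliminate tail calls, so this form is limited by the interpreter's recursion depth and is shown for its structure rather than as a recommended idiom.

```python
def sum_of_squares(numbers):
    """Sum the squares of numbers using a tail-recursive inner loop."""
    items = list(numbers)  # startup work, done once by the outer function

    def loop(i, acc):
        # The inner function plays the role of the main loop: i is the loop
        # index and acc the running total carried from one step to the next.
        if i == len(items):
            return acc
        return loop(i + 1, acc + items[i] ** 2)  # call in tail position

    return loop(0, 0)
```

For example, sum_of_squares([1, 2, 3]) evaluates to 14.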

Some languages without direct support

Certain languages do not have straightforward syntactic and semantic support to implement nested functions. Nevertheless, for some of them the idea of nested functions can be simulated with some degree of difficulty through the use of other language constructs. The following languages can approximate nested functions through the respective strategies:

C++ allows definition of classes within classes, providing the ability to use class methods in a way similar to nested functions at one level of nesting (see Function object in C++). C++11 additionally supports nested lambda functions.

Eiffel explicitly disallows nesting of routines. This is to keep the language simple, and also allows the convention of using a special variable, Result, to denote the result of a (value-returning) function.

Visual Basic and C#, by using anonymous methods or lambda expressions.

Java, by using lambda expressions (since Java 8; see Anonymous functions in Java) or through a workaround that consists of an anonymous class containing a single method. A named class declared local to a method may also be used.

Implementation

Notes

a. In some circles, such as Python, variables of this kind are sometimes known as non-local variables, although they are in fact local to the enclosing function, just not to the nested function.

References

1. "Re: Nesting functions - Why?", baavgai, 14 January 2012.
2. Bright 2004.
3. "Question 20.24: Why doesn't C have nested functions?", comp.lang.c FAQ.
4. "Nested Functions - Using the GNU Compiler Collection (GCC)". GNU Project. Retrieved 2007-01-06.
5. Answer by Dave Vandervies, Aug 28 '09 at 17:45, to "Why are nested functions not supported by the C standard?"

Bright, Walter (1 May 2004). "Nested Functions". Dr. Dobb's.

External links

comp.lang.c FAQ: Nested Functions
"6.4 Nested procedure and functions". FreePascal documentation.