In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, control stack, run-time stack, or machine stack, and is often shortened to just "the stack". Although maintenance of the call stack is important for the proper functioning of most software, the details are normally hidden and automatic in high-level programming languages.
A call stack is used for several related purposes, but the main reason for having one is to keep track of the point to which each active subroutine should return control when it finishes executing. An active subroutine is one that has been called but is yet to complete execution after which control should be handed back to the point of call. Such activations of subroutines may be nested to any level (recursive as a special case), hence the stack structure. If, for example, a subroutine DrawSquare
calls a subroutine DrawLine
from four different places, DrawLine
must know where to return when its execution completes. To accomplish this, the address following the call instruction, the return address, is pushed onto the call stack with each call.
Contents |
Since the call stack is organized as a stack, the caller pushes the return address onto the stack, and the called subroutine, when it finishes, pops the return address off the call stack and transfers control to that address. If a called subroutine calls on to yet another subroutine, it will push another return address onto the call stack, and so on, with the information stacking up and unstacking as the program dictates. If the pushing consumes all of the space allocated for the call stack, an error called a stack overflow occurs, generally causing the program to crash. Adding a subroutine's entry to the call stack is sometimes called winding; conversely, removing entries is unwinding.
There is usually exactly one call stack associated with a running program (or more accurately, with each task or thread of a process), although additional stacks may be created for signal handling or cooperative multitasking (as with setcontext). Since there is only one in this important context, it can be referred to as the stack (implicitly, "of the task").
In high-level programming languages, the specifics of the call stack are usually hidden from the programmer. They are given access only to a set of functions, and not the memory on the stack itself. Most assembly languages, on the other hand, require programmers to be involved with manipulating the stack. The actual details of the stack in a programming language depend upon the compiler, operating system, and the available instruction set.
As noted above, the primary purpose of a call stack is:
A call stack may serve additional functions, depending on the language, operating system, and machine environment. Among them can be:
The typical call stack is used for the return address, locals, and parameters (known as a call frame). In some environments there may be more or fewer functions assigned to the call stack. In the Forth programming language, for example, only the return address and local variables are stored on the call stack (which in that environment is named the return stack); parameters are stored on a separate data stack. Most Forths also have a third stack for floating point parameters.
A call stack is composed of stack frames (also called activation records or activation frames). These are machine dependent data structures containing subroutine state information. Each stack frame corresponds to a call to a subroutine which has not yet terminated with a return. For example, if a subroutine named DrawLine
is currently running, having been called by a subroutine DrawSquare
, the top part of the call stack might be laid out like this (where the stack is growing towards the top):
The stack frame at the top of the stack is for the currently executing routine. The stack frame usually includes at least the following items:
DrawLine
stack frame, an address into DrawSquare
's code)The data stored in the stack frame may sometimes be accessed directly via the stack pointer register (which indicates the current top of the stack). However, as the stack pointer is variable during the activation of the routine, memory locations within the stack frame are more typically accessed via a separate register which makes relative addressing simpler and also enables dynamic allocation mechanisms (see below). This register is often termed the frame pointer and is set up at procedure entry to point to a fixed location in the frame structure (such as the return address).
As different routines have different parameters and local data, stack frames have various sizes. Although they may often be fixed across all activations of a particular routine, many modern languages also support dynamic allocations on the stack, which means that the local data area will vary from activation to activation with a size that must be unspecified when the program is compiled. In this case, access via a frame pointer, rather than via the stack pointer, is usually necessary since the offsets from the stack top to values such as the return address would not be known at compile time.
In most systems a stack frame has a field to contain the previous value of the frame pointer register, the value it had while the caller was executing. For example, in the diagram above, the stack frame of DrawLine
would have a memory location holding the frame pointer value that DrawSquare
uses. The value is saved upon entry to the subroutine and restored for the return. Having such a field in a known location in the stack frame enables code to access each frame successively underneath the currently executing routine's frame, and also allows the routine to easily restore the frame pointer to the caller's frame, just before it returns.
Programming languages that support nested subroutines also have a field in the call frame that points to the stack frame of the latest activation of the procedure that most closely encapsulates the callee, i.e. the immediate scope of the callee. This is called an access link or static link (as it keeps track of static nesting during dynamic and recursive calls) and provides the routine (as well as any other routines it may invoke) access to the local data of its encapsulating routines at every nesting level. Some architectures, compilers, or optimization cases store one link for each enclosing level (not just the immediately enclosing), so that deeply nested routines that access shallow data do not have to traverse several links; this strategy is often called a display[1]. Access link(s) can be optimized away in cases where an inner function does not access any (non constant) local data in the encapsulation—pure functions, i.e routines communicating via argument(s) and return value(s) only, would be an example of this. Some historical computers, such as the Burroughs large systems, had special "display registers" to support nested functions while compilers for most modern machines (such as the ubiquitous x86) simply reserve a few words on the stack for the pointers, as needed.
For some purposes, the stack frame of a subroutine and that of its caller can be considered to overlap, the overlap consisting of the area where the parameters are passed from the caller to the callee. In some environments, the caller pushes each argument onto the stack, thus extending its stack frame, then invokes the callee. In other environments, the caller has a preallocated area at the top of its stack frame to hold the arguments it supplies to other subroutines it calls. This area is sometimes termed the outgoing arguments area or callout area. Under this approach, the size of the area is calculated by the compiler to be the largest needed by any called subroutine.
Usually the call stack manipulation needed at the site of a call to a subroutine is minimal (which is good since there can be many call sites for each subroutine to be called). The values for the actual arguments are evaluated at the call site, since they are specific to the particular call, and either pushed onto the stack or placed into registers, as determined by the calling convention being used. The actual call instruction, such as "Branch and Link," is then typically executed to transfer control to the code of the target subroutine.
In the called subroutine, the first code executed is usually termed the subroutine prologue, since it does the necessary housekeeping before the code for the statements of the routine is begun.
The prologue will commonly save the return address left in a register by the call instruction by pushing the value onto the call stack. Similarly, the current stack pointer and/or frame pointer values may be pushed. Alternatively, some instruction set architectures automatically provide comparable functionality as part of the action of the call instruction itself, and in such an environment the prologue need not do this.
If frame pointers are being used, the prologue will typically set the new value of the frame pointer register from the stack pointer. Space on the stack for local variables can then be allocated by incrementally changing the stack pointer.
The Forth programming language allows explicit winding of the call stack (called there the "return stack").
When a subroutine is ready to return, it executes an epilogue that undoes the steps of the prologue. This will typically restore saved register values (such as the frame pointer value) from the stack frame, pop the entire stack frame off the stack by changing the stack pointer value, and finally branch to the instruction at the return address. Under many calling conventions the items popped off the stack by the epilogue include the original argument values, in which case there usually are no further stack manipulations that need to be done by the caller. With some calling conventions, however, it is the caller's responsibility to remove the arguments from the stack after the return.
Returning from the called function will pop the top frame off of the stack, perhaps leaving a return value.
Some languages (such as Pascal) allow a global goto statement to transfer control out of a nested function and into a previously invoked outer function. This operation requires the stack to be unwound, removing as many stack frames as necessary to restore the proper context to transfer control to the target statement within the enclosing outer function. Such transfers of control are generally used only for error handling.
Other languages (such as Object Pascal) provide exception handling, which also requires unwinding of the stack. The stack frame of a function contains one or more entries specifying exception handlers. When an exception is thrown, the stack is unwound until an exception handler is found that is prepared to handle (catch) the exception. Common Lisp allows control of what happens when the stack is unwound by using the unwind-protect
special operator.
When applying a continuation, the stack is (logically) unwound and then rewound with the stack of the continuation. This is not the only way to implement continuations; for example, using multiple, explicit stacks, application of a continuation can simply activate its stack and wind a value to be passed. The Scheme programming language allows arbitrary thunks to be executed in specified points on "unwinding" or "rewinding" of the control stack when a continuation is invoked.
A reported technique [2] uses call stacks in a very different way than others discussed on this page. It uses call stacks for test suite reduction. Briefly, test suite reduction seeks to reduce the number of test cases in a test suite while retaining a high percentage of the original suite’s fault detection effectiveness. Two test cases are considered to be equivalent if they generate the same set of call stacks during execution. See [3] for more details.
Taking random-time samples of the call stack can be very useful in optimizing performance of programs. The reason is if a subroutine call instruction appears on the call stack for a certain fraction of execution time, its possible removal would save that much time. See Performance analysis and Deep sampling.
In a language with free pointers and/or non-checked array writes (such as C), the mixing of control flow data affecting the execution of code (return addresses, saved frame pointers) and simple program data (parameters, return values) in a call stack is a security risk, possibly exploitable through buffer overflows.