Control flow
From Wikipedia, the free encyclopedia
In computer science, control flow (or flow of control) is the order in which the individual statements, instructions or function calls of an imperative or functional program are executed or evaluated. Within an imperative programming language, a control flow statement is an instruction whose execution can cause the subsequent flow of control to differ from the natural sequential order in which the instructions are listed. For non-strict functional languages, functions and language constructs exist to achieve the same ends, but they are not necessarily called control flow statements.
The kinds of control flow statements available differ by language, but can be roughly categorized by their effect:
- continuation at a different statement (jump),
- executing a set of statements only if some condition is met (choice),
- executing a set of statements zero or more times, until some condition is met (loop),
- executing a set of distant statements, after which the flow of control may possibly return (subroutines, coroutines, and continuations),
- stopping the program, preventing any further execution (halt).
Interrupts and signals are low-level mechanisms that can alter the flow of control in a way similar to a subroutine, but are usually in response to some external stimulus or event rather than a control flow statement in the language. Self-modifying code can also be used to affect control flow through its side effects, but usually does not involve an explicit control flow statement (an exception being the ALTER verb in COBOL[citation needed]).
At the level of machine or assembly language, control flow instructions usually work by altering the program counter. For some CPUs the only control flow instructions available are conditional or unconditional branches (sometimes called jumps). Some processor designs also complicate the control flow by deviating from a strict sequential ordering of instructions, with features such as speculative execution, out-of-order execution, and branch delay slots. Compilers for higher-level programming languages must therefore translate all the control flow statements of the language into equivalent code using only these more limited instructions, in a manner that preserves the observable behavior of a natural sequential flow, assuming that only one thread is executing. Discussion of control flow is almost always restricted to a single thread of execution, as it depends upon a definite sequence in which instructions are executed one at a time.
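As a rough illustration of the paragraph above, the following Python sketch simulates a hypothetical machine whose only control flow instructions are an unconditional and a conditional branch. The instruction names (SET, ADD, JUMP, JUMP_IF_LESS, HALT) and the run function are invented for this example and do not correspond to any real instruction set.

```python
def run(program):
    """Execute a list of instruction tuples and return the final registers."""
    registers = {}
    pc = 0  # program counter: index of the next instruction to execute
    while True:
        op, *args = program[pc]
        pc += 1  # default: fall through to the next instruction
        if op == "SET":
            registers[args[0]] = args[1]
        elif op == "ADD":
            registers[args[0]] += registers[args[1]]
        elif op == "JUMP":
            pc = args[0]  # unconditional branch: overwrite the program counter
        elif op == "JUMP_IF_LESS":
            # conditional branch: taken only when register a < register b
            if registers[args[0]] < registers[args[1]]:
                pc = args[2]
        elif op == "HALT":
            return registers
        else:
            raise ValueError(f"unknown instruction: {op}")


# Hand translation of: total = 0; for i = 1 to n: total = total + i
SUM_TO_N = [
    ("SET", "n", 5),                 # 0
    ("SET", "i", 1),                 # 1
    ("SET", "one", 1),               # 2
    ("SET", "total", 0),             # 3
    ("JUMP_IF_LESS", "n", "i", 8),   # 4: leave the loop when n < i
    ("ADD", "total", "i"),           # 5
    ("ADD", "i", "one"),             # 6
    ("JUMP", 4),                     # 7: back to the loop test
    ("HALT",),                       # 8
]

result = run(SUM_TO_N)
```

The counting loop survives only as a test at index 4 and a backward jump at index 7, which is the kind of translation a compiler has to perform.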
Primitives
Labels
A label is an explicit name or number assigned to a fixed position within the source code, and which may be referenced by control flow statements appearing elsewhere in the source code. Other than marking a position within the source code a label has no effect.
Line numbers are a kind of label used in some languages (e.g. Fortran and BASIC): whole numbers placed at the beginning of each line of text within the source code. Languages which use line numbers often impose the constraint that the line numbers must increase in value in each subsequent line, but may not require that they be consecutive. For example, in BASIC:
10 LET X = 3
20 PRINT X
In other languages such as C and Ada a label is an identifier, usually appearing at the beginning of a line and immediately followed by a colon. For example, in C:
Success: printf ("The operation was successful.\n");
The Algol 60 language allowed both whole numbers and identifiers as labels (both attached by colons to the following statement), but few if any other variants of Algol allowed whole numbers.
Goto
The goto statement (a combination of the English words go and to, and pronounced accordingly) is the most basic form of unconditional transfer of control.
Although the keyword may either be in upper or lower case depending on the language, it is usually written as:
goto label
The effect of a goto statement is that the next statement executed is always the one appearing immediately after (or at) the indicated label.
Goto statements have been considered harmful by many computer scientists, notably Dijkstra.
Subroutines
The terminology for subroutines varies; they may alternatively be known as routines, procedures, functions (especially if they return results) or methods (especially if they belong to classes or type classes).
In the 1950s, computer memories were very small by current standards, so subroutines were used primarily[citation needed] to reduce program size; a piece of code was written once and then used many times from various other places in the program.
Nowadays, subroutines are more frequently used to help make a program more structured, e.g. by isolating some particular algorithm or hiding some particular data access method. If many programmers are working on a single program, subroutines are one kind of modularity that can help split up the work.
Minimal structured control flow
(See also Structured program theorem.) In May 1966, Böhm and Jacopini published an article in Communications of the ACM which showed that any program with gotos could be transformed into a goto-free form involving only choice (IF THEN ELSE) and loops (WHILE condition DO xxx), possibly with duplicated code and/or the addition of Boolean variables (true/false flags). Later authors have shown that choice can be replaced by loops (and yet more Boolean variables).
The fact that such minimalism is possible does not necessarily mean that it is desirable; after all, computers theoretically only need one machine instruction (subtract one number from another and branch if the result is negative), but practical computers have dozens or even hundreds of machine instructions.
What Böhm and Jacopini's article showed was that all programs could be goto-free. Other research showed that control structures with one entry and one exit were much easier to understand than any other form, primarily because they could be used anywhere as a statement without disrupting the control flow. In other words, they were composable. (Later developments, such as non-strict programming languages - and more recently, composable software transactions - have continued this line of thought, making components of programs even more freely composable.)
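A minimal sketch of the kind of rewriting described above, written in Python (which has no goto): a search that would naturally jump out of its loop as soon as the item is found is expressed using only a while loop, an if/else choice and one added Boolean flag. The function name find_index is illustrative and is not taken from the cited article.

```python
def find_index(items, target):
    """Return the index of target in items, or -1 if it is absent.

    The loop body contains no break, return or jump; the early exit
    is encoded in the Boolean variable found instead.
    """
    found = False   # the extra flag that replaces the jump
    i = 0
    while i < len(items) and not found:
        if items[i] == target:
            found = True    # takes the place of "goto done"
        else:
            i += 1
    return i if found else -1
```

The loop has a single entry and a single exit, so it can be placed anywhere a statement is allowed, at the cost of one more variable and a slightly longer loop condition.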
Control structures in practice
Most programming languages with control structures have an initial keyword which indicates the type of control structure involved. Languages then divide as to whether or not control structures have a final keyword.
- No final keyword: Algol 60, C, C++, Haskell, Java, Pascal, Perl, PHP, PL/I, Python, PowerShell. Such languages need some way of grouping statements together:
  - Algol 60 and Pascal: begin ... end
  - C, C++, Java, Perl, PHP, and PowerShell: curly brackets { ... }
  - PL/I: DO ... END
  - Python: uses indentation level (see Off-side rule)
  - Haskell: either indentation level or curly brackets can be used, and they can be freely mixed
- Final keyword: Ada, Algol 68, Modula-2, Fortran 77, Visual Basic. The forms of the final keyword vary:
  - Ada: final keyword is end + space + initial keyword, e.g. if ... end if, loop ... end loop
  - Algol 68: initial keyword spelled backwards, e.g. if ... fi, case ... esac
  - Fortran 77: final keyword is end + initial keyword, e.g. IF ... ENDIF, DO ... ENDDO
  - Modula-2: same final keyword end for everything
  - Visual Basic: every control structure has its own keyword: If ... End If; For ... Next; Do ... Loop
Choice
Loops
A loop is a sequence of statements which is specified once but which may be carried out several times in succession. The code "inside" the loop (the body of the loop, shown below as xxx) is obeyed a specified number of times, or once for each of a collection of items, or until some condition is met.
In some languages, such as Scheme, loops are often expressed using tail recursion rather than explicit looping constructs.
Count-controlled loops
Most programming languages have constructions for repeating a loop a certain number of times. Note that if N is less than 1 in these examples then the language may specify that the body is skipped completely, or that the body is executed just once with N = 1. In most cases counting can go downwards instead of upwards and step sizes other than 1 can be used.
FOR I = 1 TO N
   xxx
NEXT I

for I := 1 to N do begin
   xxx
end;

DO I = 1,N
   xxx
END DO

for ( I=1; I<=N; ++I ) {
   xxx
}
- See also: Loop counter
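For comparison, a counting loop in Python is usually written by iterating over range. The sketch below (function names are illustrative) shows that with this construct the body is skipped completely when N is less than 1, and that downward counting with a step size other than 1 is also possible.

```python
def count_up(n):
    """Collect the values of i for a loop equivalent to FOR I = 1 TO N."""
    visited = []
    for i in range(1, n + 1):   # the upper bound of range is exclusive
        visited.append(i)
    return visited


def count_down(n, step):
    """Count from n down towards 1 using the given positive step size."""
    return [i for i in range(n, 0, -step)]
```

Here count_up(0) returns an empty list, so Python belongs to the group of languages in which the body is not executed at all when N is less than 1.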
In many programming languages, only integers can be reliably used in a count-controlled loop. Floating-point numbers are represented imprecisely due to hardware constraints, so a loop such as
for X := 0.1 step 0.1 to 1.0 do
might be repeated 9 or 10 times, depending on rounding errors and/or the hardware and/or the compiler version. Furthermore, if the increment of X occurs by repeated addition, accumulated rounding errors may mean that the value of X in each iteration can differ quite significantly from the expected sequence 0.1, 0.2, 0.3, ..., 1.0.
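The effect can be observed with a short Python sketch, assuming the usual IEEE 754 double-precision floats. The first function accumulates the step by repeated addition; the second derives each value from an integer counter, which is a common way to avoid the accumulated error. Both function names are illustrative.

```python
def float_steps(start, step, limit):
    """Repeatedly add step to start while the value does not exceed limit."""
    values = []
    x = start
    while x <= limit:
        values.append(x)
        x += step          # rounding error accumulates with every addition
    return values


def integer_steps(count, divisor):
    """Derive each value from an integer counter instead of accumulating."""
    return [i / divisor for i in range(1, count + 1)]


accumulated = float_steps(0.1, 0.1, 1.0)
derived = integer_steps(10, 10)
```

With double-precision arithmetic the last accumulated value is slightly less than 1.0, whereas the value derived from the integer counter is exactly 1.0.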
Condition-controlled loops
Again, most programming languages have constructions for repeating a loop until some condition changes. Note that some variations place the test at the start of the loop, while others have the test at the end of the loop. In the former case the body may be skipped completely, while in the latter case the body is always obeyed at least once.
DO WHILE (test)
   xxx
LOOP

repeat
   xxx
until test;

while (test) {
   xxx
}

do
   xxx
while (test);
See also Do while loop.
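The difference between the two placements of the test can be sketched in Python. Python has no loop with the test at the end, so the second function emulates one with an unconditional loop and a break; this emulation and the function names are choices made for the example.

```python
def run_test_first(limit):
    """Test at the start: the body may be skipped completely."""
    executions = 0
    while executions < limit:
        executions += 1
    return executions


def run_test_last(limit):
    """Test at the end: the body always runs at least once."""
    executions = 0
    while True:
        executions += 1
        if executions >= limit:    # plays the role of "until test"
            break
    return executions
```

Called with a limit of 0, the first function never executes its body, while the second executes it exactly once.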
Collection-controlled loops
A few programming languages (e.g. Ada, Smalltalk, Perl, Java, C#, Visual Basic) have special constructs which allow implicitly looping through all elements of an array, or all members of a set or collection.
someCollection do: [:eachElement |xxx].

foreach someArray { xxx }

Collection<String> coll; for (String s : coll) {}

foreach (string s in myStringCollection) { xxx }

$someCollection | ForEach-Object { $_ }
General iteration
General iteration constructs such as C's for statement and Common Lisp's do form can be used to express any of the above sorts of loops, as well as others, such as looping over a number of collections in parallel. Where a more specific looping construct can be used, it is usually preferred over the general iteration construct, since it often makes the purpose of the expression clearer.
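The trade-off can be sketched in Python by looping over two collections in parallel in two ways: a general form with an explicit index, test and increment (in the spirit of a C-style for), and a more specific form using the built-in zip. The function names are illustrative.

```python
def dot_general(xs, ys):
    """General form: explicit initialisation, test and increment."""
    total = 0
    i = 0
    while i < len(xs) and i < len(ys):
        total += xs[i] * ys[i]
        i += 1
    return total


def dot_specific(xs, ys):
    """More specific form: zip walks both collections in parallel."""
    return sum(x * y for x, y in zip(xs, ys))
```

Both functions compute the same result; the second states the intent (pairing up elements) directly rather than through index bookkeeping.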
Infinite loops
Sometimes it is desirable for a program to loop forever, or until an exceptional condition such as an error arises. For instance, an event-driven program may be intended to loop forever handling events as they occur, only stopping when the process is killed by the operator.
More often, an infinite loop is due to a programming error in a condition-controlled loop, wherein the loop condition is never changed within the loop.
Continuation with next iteration
Sometimes within the body of a loop there is a desire to skip the remainder of the loop body and continue with the next iteration of the loop. Some languages provide a statement such as continue or skip which will do this. The effect is to prematurely terminate the innermost loop body and then resume as normal with the next iteration. If the iteration is the last one in the loop, the effect is to terminate the entire loop early.
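In Python the statement is called continue. A small sketch (the function name is illustrative):

```python
def sum_of_odds(numbers):
    """Add up the odd numbers, skipping the even ones with continue."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            continue        # skip the rest of the body for this iteration
        total += n          # reached only for odd n
    return total
```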
Early exit from loops
When using a count-controlled loop to search through a table, it might be desirable to stop searching as soon as the required item is found. Some programming languages provide a statement such as break or exit, whose effect is to terminate the current loop immediately and transfer control to the statement immediately following that loop. Matters become more complicated when searching a multi-dimensional table using nested loops (see Proposed control structures below).
The following example is written in Ada, which supports both early exit from loops and loops with the test in the middle. The two features are very similar: early exit needs to be combined with an if statement, while a condition in the middle is a self-contained construct.
with Ada.Text_IO;
with Ada.Integer_Text_IO;

procedure Print_Squares is
   X : Integer;
begin
   Read_Data : loop
      Ada.Integer_Text_IO.Get (X);
      exit Read_Data when X = 0;
      Ada.Integer_Text_IO.Put (X * X);  --  Put for Integer values comes from Ada.Integer_Text_IO
      Ada.Text_IO.New_Line;
   end loop Read_Data;
end Print_Squares;
Python supports conditional execution of code depending on whether a loop was exited early (with a break statement) or not, by using an else clause with the loop. For example,
for n in set_of_numbers:
    if isprime(n):
        print("Set contains a prime number")
        break
else:
    print("Set did not contain any prime numbers")
Note that the else clause in the above example is attached to the for statement, and not the inner if statement. Both Python's for and while loops support such an else clause, which is executed only if early exit of the loop did not occur.
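A self-contained variant of the example above is sketched below. The helper is_prime (simple trial division) and the function describe are assumptions added so that the fragment can be run on its own; the message is returned rather than printed.

```python
def is_prime(n):
    """Trial division; adequate for small illustrative inputs."""
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True


def describe(numbers):
    """Report whether any element is prime, using the loop's else clause."""
    for n in numbers:
        if is_prime(n):
            message = "Set contains a prime number"
            break
    else:
        # reached only when the loop finished without executing break
        message = "Set did not contain any prime numbers"
    return message
```

For an empty collection the loop body never runs, no break occurs, and the else clause is therefore executed.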
Loop system cross reference table
Programming language | conditional: begin | conditional: middle | conditional: end | loop: count | loop: collection | loop: general | loop: infinite [1] | early exit | continuation |
---|---|---|---|---|---|---|---|---|---|
Ada | Yes | Yes | Yes | Yes | arrays | No | Yes | deep nested | No |
C | Yes | No | Yes | No [2] | No | Yes | No | deep nested [3] | Yes |
C++ | Yes | No | Yes | No [2] | No | Yes | No | deep nested [3] | Yes |
C# | Yes | No | Yes | No [2] | Yes | Yes | No | deep nested [3] | Yes |
FORTRAN 77 | Yes | No | No | Yes | No | No | No | one level | Yes |
Fortran 90 | Yes | No | No | Yes | No | No | Yes | deep nested | Yes |
Java | Yes | No | Yes | No [2] | Yes | Yes | No | deep nested | Yes |
PHP | Yes | No | Yes | No [2] | Yes [4] | Yes | No | deep nested | Yes |
Python | Yes | No | No | No [5] | Yes | No | No | deep nested [6] | Yes |
Visual Basic .NET | Yes | No | Yes | Yes | Yes | No | Yes | one level | Yes |
Windows PowerShell | Yes | No | Yes | No [2] | Yes | Yes | No | ? | Yes |
- [1] while (true) does not count as an infinite loop for this purpose, because it is not a dedicated language structure.
- [2] C's for (init; condition; loop) loop is a general loop construct, not specifically a counting one, although it is often used for that.
- [3] Deep breaks may be accomplished in C, C++ and C# through the use of labels and gotos.
- [4] Iteration over objects was added in PHP 5.
- [5] A counting loop can be simulated by iterating over an incrementing list or generator, like range or xrange.
- [6] Deep breaks may be accomplished in Python through the use of exception handling.
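The last note can be illustrated with a short Python sketch in which an exception is raised purely to leave two nested loops at once. The class name LeaveLoops and the function locate are hypothetical names chosen for this example.

```python
class LeaveLoops(Exception):
    """Raised only to leave both loops at once; not an error condition."""


def locate(table, target):
    """Return (row, column) of target in a list of lists, or None."""
    position = None
    try:
        for r, row in enumerate(table):
            for c, value in enumerate(row):
                if value == target:
                    position = (r, c)
                    raise LeaveLoops   # leaves the inner and the outer loop
    except LeaveLoops:
        pass
    return position
```

A plain break at the same point would leave only the inner loop, and the outer loop would go on to scan the remaining rows.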
Structured non-local control flow
Many programming languages, particularly those which favor more dynamic styles of programming, offer constructs for non-local control flow. These cause the flow of execution to jump out of a given context and resume at some predeclared point. Exceptions, conditions, and continuations are three common sorts of non-local control constructs.
Conditions
PL/I has some 22 standard conditions (e.g. ZERODIVIDE, SUBSCRIPTRANGE, ENDFILE) which can be RAISEd and which can be intercepted by:
ON condition action;
Programmers can also define and use their own named conditions.
Like the unstructured if, only one statement can be specified, so in many cases a GOTO is needed to decide where flow of control should resume.
Unfortunately, some implementations had a substantial overhead in both space and time (especially SUBSCRIPTRANGE), so many programmers tried to avoid using conditions.
Common syntax example:
ON condition GOTO label
Exceptions
Modern languages have a structured construct for exception handling which does not rely on the use of GOTO:
try {
    xxx1                                  // Somewhere in here
    xxx2                                  //     use: throw someValue;
    xxx3
} catch (someClass & someId) {            // catch value of someClass
    actionForSomeClass
} catch (someType & anotherId) {          // catch value of someType
    actionForSomeType
} catch (...) {                           // catch anything not already caught
    actionForAnythingElse
}
Any number and variety of catch clauses can be used above. In D, Java, C#, and Python a finally clause can be added to the try construct. No matter how control leaves the try, the code inside the finally clause is guaranteed to execute. This is useful when writing code that must relinquish an expensive resource (such as an opened file or a database connection) when finished processing:
FileStream stm = null;                    // C# example
try {
    stm = new FileStream("logfile.txt", FileMode.Create);
    return ProcessStuff(stm);             // may throw an exception
} finally {
    if (stm != null)
        stm.Close();
}
Since this pattern is fairly common, C# has a special syntax:
using (FileStream stm = new FileStream("logfile.txt", FileMode.Create)) {
    return ProcessStuff(stm);             // may throw an exception
}
Upon leaving the using-block, the compiler guarantees that the stm object is released.
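The same two styles can be sketched in Python, where the with statement plays a role comparable to C#'s using. To keep the sketch self-contained it operates on in-memory io.StringIO streams rather than a real log file, and the two function names are illustrative.

```python
import io


def process_with_finally(stream):
    """Close the stream no matter how the try block is left."""
    try:
        return stream.read().upper()     # may raise an exception
    finally:
        stream.close()                   # runs on return and on exception


def process_with_context(stream):
    """The with statement closes the stream when the block is left."""
    with stream:
        return stream.read().upper()


first = io.StringIO("log entry")
second = io.StringIO("log entry")
results = (process_with_finally(first), process_with_context(second))
```

In both cases the stream is closed even though the function leaves the block through a return statement.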
All these languages define standard exceptions and the circumstances under which they are thrown. Users can throw exceptions of their own (in fact C++ allows users to throw and catch values of almost any type).
If there is no catch matching a particular throw, then control percolates back through subroutine calls and/or nested blocks until a matching catch is found or until the end of the main program is reached, at which point the program is forcibly stopped with a suitable error message.
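A small Python sketch of this percolation, using a user-defined exception; the names TableError, innermost, middle and outermost are hypothetical:

```python
class TableError(Exception):
    """A user-defined exception type."""


def innermost():
    raise TableError("lookup failed")


def middle():
    innermost()                 # no handler here, so the exception passes through
    return "not reached"


def outermost():
    try:
        return middle()
    except TableError as err:   # first matching handler on the way out
        return f"handled: {err}"
```

Calling outermost() returns the handler's message; calling middle() directly from the main program, with no handler anywhere, would instead stop the program with a traceback.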
The AppleScript scripting language passes several pieces of information to the error handler of a "try" block:
try
    set myNumber to myNumber / 0
on error e number n from f to t partial result pr
    if ( e = "Can't divide by zero" ) then display dialog "You idiot!"
end try
Continuations
Non-local control flow cross reference
Programming language | conditions | exceptions |
---|---|---|
Ada | No | Yes |
C | No | No |
C++ | No | Yes |
C# | No | Yes |
D | No | Yes |
Haskell | No | Yes |
Java | No | Yes |
Objective-C | No | Yes |
PHP | No | Yes |
PL/I | Yes | No |
Python | No | Yes |
Ruby | No | Yes |
Visual Basic .NET | Yes | Yes |
Windows PowerShell | No | Yes |
Proposed control structures
In a spoof Datamation article (December 1973), R. Lawrence Clark suggested that the GOTO statement could be replaced by the COMEFROM statement, and provided some entertaining examples. COMEFROM was actually implemented in the INTERCAL programming language, a language designed to make programs as obscure as possible.
In his 1974 article "Structured Programming with go to Statements", Donald Knuth identified two situations which were not covered by the control structures listed above, and gave examples of control structures which could handle these situations. Despite their utility, these constructions have not yet found their way into mainstream programming languages.
Loop with test in the middle
This was proposed by Dahl in 1972.
loop                        loop
    xxx1                        read(char);
while test;                 while not atEndOfFile;
    xxx2                        write(char);
repeat;                     repeat;
If xxx1 is omitted we get a loop with the test at the top. If xxx2 is omitted we get a loop with the test at the bottom. If while is omitted we get an infinite loop. Hence this single construction can replace several constructions in most programming languages. A possible variant is to allow more than one while test; within the loop, but the use of exitwhen (see next section) appears to cover this case better.
As the example on the right shows (copying a file one character at a time), there are simple situations where this is exactly the right construction to use in order to avoid duplicated code and/or repeated tests.
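In languages without this construct, the same shape is commonly written as an unconditional loop with a conditional exit in the middle. The Python sketch below copies text one character at a time between two in-memory streams; the function name copy_chars and the use of io.StringIO in place of real files are choices made for the example.

```python
import io


def copy_chars(source, destination):
    """Copy source to destination one character at a time."""
    while True:
        char = source.read(1)       # xxx1: executed before the test
        if char == "":              # read returns "" at end of file
            break                   # the test in the middle of the loop
        destination.write(char)     # xxx2: executed after the test


src = io.StringIO("control flow")
dst = io.StringIO()
copy_chars(src, dst)
```

Neither the read nor the end-of-file test has to be written twice, which is the duplication the construct is meant to avoid.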
In Ada, the above loop construct (loop-while-repeat) can be represented using a standard infinite loop (loop - end loop) that has an exit when clause in the middle (not to be confused with the exitwhen statement in the following section).
with Ada.Text_IO;
with Ada.Integer_Text_IO;

procedure Print_Squares is
   X : Integer;
begin
   Read_Data : loop
      Ada.Integer_Text_IO.Get (X);
      exit Read_Data when X = 0;
      Ada.Integer_Text_IO.Put (X * X);  --  Put for Integer values comes from Ada.Integer_Text_IO
      Ada.Text_IO.New_Line;
   end loop Read_Data;
end Print_Squares;
Naming a loop (like Read_Data in this example) is optional, but makes it possible to leave the outer loop of several nested loops.
Multiple early exit/exit from nested loops
This was proposed by Zahn in 1974. A modified version is presented here.
exitwhen EventA or EventB or EventC;
    xxx
exits
    EventA: actionA
    EventB: actionB
    EventC: actionC
endexit;
exitwhen is used to specify the events which may occur within xxx; their occurrence is indicated by using the name of the event as a statement. When some event does occur, the relevant action is carried out, and then control passes just after endexit. This construction provides a very clear separation between determining that some situation applies, and the action to be taken for that situation.
exitwhen is conceptually similar to the try/catch construct in C++, but is likely to be much more efficient since there is no percolation across subroutine calls and no transfer of arbitrary values. Also, the compiler can check that all specified events do actually occur and have associated actions.
The following simple example involves searching a two-dimensional table for a particular item.
exitwhen found or missing;
    for I := 1 to N do
        for J := 1 to M do
            if table[I,J] = target then found;
    missing;
exits
    found:   print ("item is in table");
    missing: print ("item is not in table");
endexit;
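Since exitwhen is described above as conceptually similar to try/catch, the table search can be approximated in Python with one exception class per event. This is only an emulation under that assumption: unlike the proposed construct, nothing checks at compile time that every event has an associated action. The class and function names are hypothetical.

```python
class Found(Exception):
    """Event: the target is in the table."""


class Missing(Exception):
    """Event: the whole table was searched without success."""


def search(table, target):
    """Search a list of lists and return the message for the event that occurred."""
    try:
        for row in table:
            for value in row:
                if value == target:
                    raise Found      # signal the event by name
        raise Missing
    except Found:                    # the "exits" part: one action per event
        return "item is in table"
    except Missing:
        return "item is not in table"
```

As in the pseudocode, detecting the situation (inside the loops) is kept apart from the action taken for it (in the handlers).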
See also
- Branch (computer science)
- Control flow graph
- Coroutine
- Flowchart
- GOTO
- Main loop
- Recursion
- Spaghetti code
- Structured programming
- Subroutine
References
- Dahl & Dijkstra & Hoare, "Structured Programming" Academic Press, 1972.
- Knuth, Donald E. "Structured Programming with go to Statements" ACM Computing Surveys 6(4):261-301, December 1974.
- Böhm, Jacopini. "Flow Diagrams, Turing Machines and Languages with only Two Formation Rules" Comm. ACM 9(5):366-371, May 1966.
- Hoare, C. A. R. "Partition: Algorithm 63," "Quicksort: Algorithm 64," and "Find: Algorithm 65." Comm. ACM 4, 321-322, 1961.
- Zahn, C. T. "A control statement for natural top-down structured programming" presented at Symposium on Programming Languages, Paris, 1974.