For loop
From Wikipedia, the free encyclopedia
In computer science, a for loop is a programming language statement that allows code to be executed repeatedly. A for loop is classified as an iteration statement.
Unlike many other kinds of loops, such as the while loop, the for loop is often distinguished by an explicit loop counter or loop variable. This allows the body of the for loop (the code that is being repeatedly executed) to know about the sequencing of each iteration. For loops are also typically used when the number of iterations is known before entering the loop.
The name for loop comes from the English word for, which is used as the keyword in most programming languages to introduce a for loop. In FORTRAN and PL/I though, the keyword DO is used and it is called a do loop, but it is otherwise identical to the for loop described here.
Kinds of for loops
A for loop statement is available in most imperative programming languages. Even ignoring minor differences in syntax there are many differences in how these statements work and the level of expressiveness they support. Generally, for loops fall into one of the following categories:
Numeric ranges
This type of for loop is characterized by counting: it enumerates each of the values within an integer range or arithmetic progression. The range is often specified by a beginning and an ending number, and may sometimes include a step value (allowing one to count by twos or to count backwards, for instance). A representative example in BASIC is:
FOR I = 1 TO 10
  REM loop body
NEXT I
The loop variable I takes the values 1, 2, ..., 9, 10 in that order, one for each of the ten iterations of the loop's body. Each computer language, and even different dialects of the same language (e.g. BASIC), has its own keywords and means of specifying the start, stop, and step values.
A rarely available generalisation allows the start, stop, and step values to be specified as items in a list, not just a single arithmetic sequence, for example:
for i:=2,3,5,7,11 do etc
Where each item in the list might itself be an arithmetic sequence. An ordinary for loop is thus an example of a list with only one element.
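The list-of-sequences generalisation can be approximated in a language that lacks it by chaining several sequences together. The following Python sketch is illustrative only: the helper name multi_range is hypothetical and is not taken from any language described here.

```python
from itertools import chain


def multi_range(*parts):
    """Yield the values of several sequences one after another.

    Each part is any iterable: a plain list of single values or a
    range object standing for an arithmetic sequence.
    (multi_range is a hypothetical helper, introduced for illustration.)
    """
    return chain.from_iterable(parts)


# Single items 2, 3, 5, 7, 11 followed by the arithmetic sequence 20, 30, 40.
values = list(multi_range([2, 3, 5, 7, 11], range(20, 50, 10)))
print(values)  # [2, 3, 5, 7, 11, 20, 30, 40]
```

Calling multi_range with a single range reduces to the ordinary numeric for loop, matching the remark that an ordinary loop is a list with one element.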
Iterator-based for loops
This type of for loop is a generalization of the numeric range type of for loop, as it allows for the enumeration of sets of items other than number sequences. It is usually characterized by the use of an implicit or explicit iterator, in which the loop variable takes on each of the values in a sequence or other orderable data collection. A representative example in Python is:
for item in some_iterable_object:
    doSomething
    doSomethingElse
Here some_iterable_object is either a data collection that supports implicit iteration, or an iterator itself. Some languages have this in addition to another for-loop syntax; notably, PHP has this type of loop under the name foreach, as well as a three-expression for loop (see below) under the name for.
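To make the implicit iterator visible, the following Python sketch drives an iterator by hand with the built-in iter and next functions, and compares the result with an ordinary for loop. The function name visit_all is hypothetical, chosen only for this illustration.

```python
def visit_all(some_iterable_object):
    """Collect every item by driving the iterator explicitly.

    (visit_all is a hypothetical name used for this sketch.)
    """
    seen = []
    iterator = iter(some_iterable_object)  # obtain an iterator from the collection
    while True:
        try:
            item = next(iterator)          # fetch the next value
        except StopIteration:              # raised once the iterator is exhausted
            break
        seen.append(item)
    return seen


# The ordinary for loop performs the same steps implicitly.
implicit = []
for item in "abc":
    implicit.append(item)

explicit = visit_all("abc")
print(implicit == explicit)  # True
```

Passing an iterator rather than a collection also works, since calling iter on an iterator returns the iterator itself.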
Compound For loops
Introduced with ALGOL 68 and followed by PL/I, this allows the iteration of a loop to be compounded with a test, as in
for i:=1:N while A(i) > 0 do etc.
That is, a value is assigned to the loop variable i, and the loop body is executed only if the while expression is true. If the result is false, the for loop's execution stops short. Provided that the loop variable's value remains defined after the loop terminates, the above statement finds the first non-positive element in array A (if there is none, i ends with the value N+1), or, with suitable variations, the first non-blank character in a string, and so on.
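Python has no compound for-while statement, so the search described above can only be emulated. The sketch below uses a while loop and 1-based indices to mirror the pseudocode. The function name first_non_positive is hypothetical.

```python
def first_non_positive(a):
    """Return the 1-based index of the first element <= 0, or len(a) + 1.

    Emulates:  for i := 1:N while A(i) > 0 do (nothing)
    (first_non_positive is a hypothetical name used for this sketch.)
    """
    n = len(a)
    i = 1
    # The range test comes first, so a[i - 1] is never read out of bounds.
    while i <= n and a[i - 1] > 0:
        i += 1
    return i


print(first_non_positive([3, 1, 0, 5]))  # 3
print(first_non_positive([1, 2, 3]))     # 4, that is N+1: no such element
```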
Three-expression for loops
This type of for loop is found in nearly all languages which share a common heritage with the C programming language. It is characterized by a three-parameter loop control expression, consisting of an initializer, a loop test, and a counting expression. A representative example in C is:
for (counter = 0; counter < 10; counter++)
  //loop body
The three control expressions, separated by semicolons here, are from left to right the initializer expression, the loop test expression, and the counting expression. The initializer is evaluated exactly once right at the beginning. The loop test expression is evaluated at the beginning of each iteration through the loop, and determines when the loop should exit. Finally, the counting expression is evaluated at the end of each loop iteration, and is usually responsible for altering the loop variable.
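The order of evaluation described above can be traced by emulating a three-expression loop with a while loop. The following Python sketch, with the hypothetical function name run_three_expression_loop, records when each of the three expressions would run for a loop equivalent to for (counter = 0; counter < 3; counter++).

```python
def run_three_expression_loop():
    """Emulate `for (counter = 0; counter < 3; counter++) body` and log each step.

    (The function name and the trace labels are hypothetical.)
    """
    trace = []
    counter = 0                          # initializer: evaluated exactly once
    trace.append("init")
    while True:
        trace.append("test")             # loop test: before every iteration
        if not (counter < 3):
            break
        trace.append(f"body {counter}")  # the loop body
        counter += 1                     # counting expression: after each body
        trace.append("step")
    return counter, trace


counter, trace = run_three_expression_loop()
print(counter)              # 3
print(trace.count("test"))  # 4: one more test than there are iterations
```

The final, failing test is what ends the loop, which is why the test runs once more than the body.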
In most languages which provide this type of for loop, each of the three control loop expressions is optional. When omitted the loop test expression is taken to always be true, while the initializer and counting expressions are treated as no-ops when omitted. The semicolons in the syntax are sufficient to indicate the omission of one of the expressions. Some examples include:
int i = 0;
for ( ; i<10; ) {
  i++;
}
or this,
int i = 0;
for ( ; ; i++ ) {
  if ( i >= 10 ) break;
}
Notice that in normal usage, the name of the iteration variable is repeated in each of the three parts. Using a different name in any part is valid syntax, though the resulting behaviour might not be desired.
Additional semantics and constructs
Use as infinite loops
This C-style for loop is commonly the source of an infinite loop since the fundamental steps of iteration are completely in the control of the programmer. In fact when infinite loops are intended, this type of for loop is often used (with empty expressions), such as:
for (;;)
  //loop body
Early exit and continuation
Some languages may also provide other supporting statements, which when present can alter how the for loop iteration proceeds. Common among these are the break and continue statements found in C and its derivatives. The break statement causes the inner-most loop to be terminated immediately when executed. The continue statement will move at once to the next iteration without further progress through the loop body for the current iteration. Other languages may have similar statements or otherwise provide means to alter the for loop progress; for example in Fortran 95:
DO I = 1,N
  statements               !Executed for all values of I, up to a disaster if any.
  IF (no good) CYCLE       !Skip this value of I, continue with the next.
  statements               !Executed only where goodness prevails.
  IF (disaster) EXIT       !Abandon the loop.
  statements               !While good and no disaster.
END DO                     !Should align with the "DO".
Unfortunately, the clear lead given by BASIC has not been followed and so the END DO does not identify the loop variable it is thought to belong with, and worse, nor do the CYCLE and EXIT statements. This becomes much more important when many and nested loops are involved. The EXIT could even be taken to mean exit the procedure instead of exit the loop: phrases such as NEXT I and possibly, DONEXT I and DOQUIT I or similar explicit indications would help make blunders more apparent. A partial solution is offered through a syntax extension (in Fortran and PL/I for example) whereby a label is associated with the DO statement and the same label text is appended to the end marker, such as x:DO I = 1,N and END DO x in the example, whereupon CYCLE x and EXIT x could be used. However, working directly against the objective, the text of the label x is not allowed to be the name of the loop variable, so "I" would not be allowed (though "II" would be), and, if there is a second such loop later in the routine, different label texts would have to be chosen.
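The early-exit and continuation statements discussed in this section can be sketched in Python, whose break and continue correspond to Fortran's EXIT and CYCLE. Python has no labelled form of either statement, so both always apply to the innermost enclosing loop. The function name sum_until_disaster and the use of None as the "disaster" marker are assumptions made for this illustration.

```python
def sum_until_disaster(values):
    """Sum the non-negative values, skipping negatives and stopping at None.

    (Hypothetical example: None stands for "disaster", a negative
    number for "no good".)
    """
    total = 0
    for v in values:
        if v is None:   # disaster: abandon the loop (Fortran EXIT, C break)
            break
        if v < 0:       # no good: skip this value (Fortran CYCLE, C continue)
            continue
        total += v      # executed only where goodness prevails
    return total


print(sum_until_disaster([1, -2, 3, None, 100]))  # 4
```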
Loop variable scope and semantics
Different languages specify different rules for what value the loop variable will hold on termination of its loop, and indeed some hold that it "becomes undefined". This permits a compiler to generate code that leaves any value in the loop variable, or perhaps even leaves it unchanged because the loop value was held in a register and never stored to memory.
In some languages (not C or C++) the loop variable is immutable within the scope of the loop body, with any attempt to modify its value being regarded as a semantic error. Such modifications are sometimes a consequence of a programmer error, which can be very difficult to identify once made. However only overt changes are likely to be detected by the compiler. Situations where the address of the loop variable is passed as an argument to a subroutine make it very difficult to check, because the routine's behaviour is in general unknowable to the compiler. Some examples in the style of Fortran:
DO I = 1,N
  I = 7                          !Overt adjustment of the loop variable. Compiler complaint likely.
  Z = ADJUST(I)                  !Function ADJUST might alter I, to uncertain effect.
  normal statements              !Memory might fade that I is the loop variable.
  PRINT (A(I),B(I),I = 1,N,2)    !Implicit for-loop to print odd elements of arrays A and B, reusing I...
  PRINT I                        !What value will be presented?
END DO                           !How many times will the loop be executed?
Still another possibility is that the code generated may employ an auxiliary variable as the loop variable, possibly held in a machine register, whose value may or may not be copied to I on each iteration. In this case, modifications of I would not affect the control of the loop, but now a disjunction is possible: within the loop, references to the value of I might be to the (possibly altered) current value of I or to the auxiliary variable (held safe from improper modification) and confusing results are guaranteed. For instance, within the loop a reference to element I of an array would likely employ the auxiliary variable (especially if it were held in a machine register), but if I is a parameter to some routine (for instance, a print-statement to reveal its value), it would likely be a reference to the proper variable I instead. It is best to avoid such possibilities.
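Python behaves somewhat like the auxiliary-variable case: the iteration state is held by the iterator, so assigning to the loop variable inside the body does not affect loop control. The sketch below illustrates this for Python only. The function name count_iterations is hypothetical, and n is assumed to be at least 1 so that the loop variable is bound when the function returns.

```python
def count_iterations(n):
    """Overwrite the loop variable in the body and count the iterations anyway.

    Assumes n >= 1. (count_iterations is a hypothetical name.)
    """
    iterations = 0
    for i in range(n):
        i = 7            # overt adjustment; the range iterator is unaffected
        iterations += 1
    # After the loop, i still holds the last value assigned to it.
    return iterations, i


print(count_iterations(5))  # (5, 7)
```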
Distinguishing iterator-based and numeric for loops
A language may offer both an iterator-based for loop and a numeric for loop, each with its own syntax. It may appear that a numeric loop can always be replaced by an appropriate iterator-based loop, but differences can arise. Suppose an array A has elements 1 to N and the language does not offer a simple assignment statement such as A:=3*A; so that some sort of for loop must be devised. Then, in pseudocode, the two versions might be:
for i:=1:N do A(i):=3*A(i); next i;
forall i:=1:N do A(i):=3*A(i);
These appear equivalent, and the differences seem to be mere syntactic trivia, but the semantics are different. The forall version is to be regarded as a mass assignment in which, notionally, all the right-hand side results are evaluated and the left-hand side recipients are located before any assignment is done. The code the compiler devises to effect this is unstated, but the idea is that no item on the left-hand side is changed before all items on the right-hand side are evaluated; every element is assigned exactly once.
Now suppose that the requirement is that each element of A is to be the average of itself and its two neighbours, and for simplicity, suppose that the first and last elements are to be unchanged. Then, the loops become
for i:=2:N-1 do A(i):=[A(i-1) + A(i) + A(i+1)]/3; next i;
forall i:=2:N-1 do A(i):=[A(i-1) + A(i) + A(i+1)]/3;
In this case the results will differ: because the for loop is executed as successive iterations, element A(i-1) will hold the value just calculated by the previous iteration, not the original value, while in the forall version it would not yet have been changed, provided of course that the compiler's implementation of the forall construct does in fact uphold that interpretation. This difference may or may not be important. If not, whichever runs fastest could be chosen. But the difference can be vital, as in certain stages of the LU decomposition algorithm, to give just one example.
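The difference between the two interpretations can be checked numerically. In the Python sketch below, smooth_for updates the array in place as a sequential loop would, while smooth_forall emulates the mass-assignment reading by evaluating every right-hand side from an untouched copy. Both function names are hypothetical, and the copy-based emulation is only one possible way a forall could be realised.

```python
def smooth_for(a):
    """Sequential loop: each step sees the left neighbour already updated."""
    a = list(a)                  # work on a copy so the caller's list is kept
    for i in range(1, len(a) - 1):
        a[i] = (a[i - 1] + a[i] + a[i + 1]) / 3
    return a


def smooth_forall(a):
    """Mass assignment: every right-hand side is evaluated from the original."""
    old = list(a)
    new = list(a)
    for i in range(1, len(a) - 1):
        new[i] = (old[i - 1] + old[i] + old[i + 1]) / 3
    return new


data = [0, 0, 9, 6, 0]
print(smooth_for(data))     # [0, 3.0, 6.0, 4.0, 0]
print(smooth_forall(data))  # [0, 3.0, 5.0, 5.0, 0]
```

The first interior element agrees in both versions because its left neighbour is the unchanged boundary value; the results diverge from the second interior element onwards.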
More complex for loops (especially with conditional branching) will introduce further difficulties unlikely to be accommodated by the syntax of the forall statement. Rather than trying to convert a for statement into a forall statement, the task should be recast in terms of the assignment of one data structure (such as an array) to another of the same type, avoiding complex conditions. For instance, given integer division where 3/4 = 0, a square array A can be initialised to the identity matrix as follows:
forall i:=1:N,j:=1:N do A(i,j):=(i/j)*(j/i); %=1 if i=j, otherwise zero.
Here, the right-hand side is not an actual square array like A, but an expression generating a value for each element nonetheless, which could be regarded as forming a notional array. The forall statement is thus a generalisation of the simpler array-assignment statements, where both sides of the expression have to be actual arrays.
During the preparation of the right-hand side's values, the compiler might deduce that the rather odd integer arithmetic expression generates either zero or one in a simple pattern, and so produce code that does not perform the evaluations, but this is unlikely. Better results would likely be gained by:
A:=0;                          %Mass assignment to an array.
forall i:=1:N do A(i,i):=1;    %Too complex for simple array assignment.
Though a language designed to support array manipulation and especially matrix arithmetic might well allow some syntax that identifies the diagonal of a square array, so that the second statement might become
Diag(A):=1;
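Both ways of building the identity matrix can be sketched in Python, where the // operator provides the integer division assumed above (for positive operands, 3 // 4 is 0). The function names identity_by_division and identity_by_diagonal are hypothetical.

```python
def identity_by_division(n):
    """Build an n-by-n identity matrix from (i/j)*(j/i) in integer arithmetic.

    Indices run from 1 to n, as in the pseudocode, so no division by zero occurs.
    """
    return [[(i // j) * (j // i) for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def identity_by_diagonal(n):
    """Zero the whole array first, then set only the diagonal."""
    a = [[0] * n for _ in range(n)]   # mass assignment A := 0
    for i in range(n):
        a[i][i] = 1                   # A(i,i) := 1
    return a


print(identity_by_division(3))  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(identity_by_division(3) == identity_by_diagonal(3))  # True
```

The second version performs n assignments after the zero fill instead of evaluating two divisions and a multiplication for each of the n*n elements.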
Equivalence with while loops
A for loop can be converted into an equivalent while loop by incrementing a counter variable directly. The following pseudocode illustrates this technique:
factorial := 1
for counter from 1 to 5:
    factorial := factorial * counter
is easily translated into the following while loop:
factorial := 1
counter := 1
while counter <= 5:
    factorial := factorial * counter
    counter := counter + 1
This translation is slightly complicated by languages which allow a statement to jump to the next iteration of the loop (such as the "continue" statement in C). These statements will typically implicitly increment the counter of a for loop, but not the equivalent while loop (since in the latter case the counter is not an integral part of the loop construct). Any translation will have to place all such statements within a block that increments the explicit counter before running the statement.
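The complication caused by continue can be seen in a small Python sketch. In the while version the counter must be advanced by hand before continue; if that line is omitted, the loop never terminates once an even value is reached. The function names sum_odds_for and sum_odds_while are hypothetical.

```python
def sum_odds_for(limit):
    """Sum the odd numbers from 1 to limit using a for loop."""
    total = 0
    for counter in range(1, limit + 1):
        if counter % 2 == 0:
            continue          # the for loop still advances counter
        total += counter
    return total


def sum_odds_while(limit):
    """The same computation translated into a while loop."""
    total = 0
    counter = 1
    while counter <= limit:
        if counter % 2 == 0:
            counter += 1      # must advance explicitly before continuing
            continue
        total += counter
        counter += 1
    return total


print(sum_odds_for(5), sum_odds_while(5))  # 9 9
```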
Syntax
Given an action that must be repeated, for instance, five times, different languages' for loops will be written differently. The syntax for a three-expression for loop is nearly identical in all languages that have it, after accounting for different styles of block termination and so on (the example is the same for C or Java):
for (counter = 1; counter <= 5; counter++)
  //statements;
The numeric-range for loop varies somewhat more. Pascal would write it:
for Counter := 1 to 5 do (*statements*);
Whereas Perl would use:
for ($counter = 1; $counter <= 5; ++$counter) {
  # statements;
}
(Note that for(1..5) { } is really a foreach in Perl.)
Iterator for loops most commonly take a form such as this (example is Python):
for counter in range(1, 6):  # range(1, 6) gives values from 1 inclusive to 6 exclusive
    # statements
The PHP equivalent (virtually never used for simple repetition, but noted here for completeness) is:
foreach (range(1,5) as $i)
  # statements;
Contrary to other languages, in Smalltalk a for loop is not a language construct but defined in the class Number as a method with two parameters, the end value and a closure, using self as start value.
1 to: 5 do: [ :counter | "statements" ]
Timeline of for loop in various programming languages
1966: FORTRAN 66
FORTRAN 66's equivalent of the for loop is the DO loop. The syntax of Fortran's DO loop is:
      DO label counter=start, stop, step
label statements
Example:
! DO loop example
      PROGRAM MAIN
        SUM SQ=0
        DO 101 I=1,9999999
          IF (SUM SQ.GT.1000) GO TO 109
          SUM SQ=SUM SQ+I**2
101     CONTINUE
109     CONTINUE
      END
1968: Algol68
Algol68 has what was considered the universal loop. The full syntax is:
for i from 1 by 1 to 3 while i≠4 do ~ od
There are several unusual aspects of the construct:
- only the do ~ od portion is compulsory, in which case the loop iterates indefinitely.
- thus the clause to 100 do ~ od iterates only 100 times.
- the while "syntactic element" allowed a programmer to break from a for loop early, as in:
int sum sq:=0;
for i
while
  print (("So far:",i, newline));
  sum sq≤1000
do
  sum sq+:=i↑2
od
Subsequent "extensions" to the standard Algol68 allowed the to syntactic element to be replaced with upto and downto to achieve a small optimization. The same compilers also incorporated:
- until - for late loop termination.
- foreach - for working on arrays in parallel.
1977: FORTRAN 77
FORTRAN 77's equivalent of the for loop is the DO loop. The syntax of Fortran's DO loop is:
      DO label, counter=start, stop, step
label statements
Example:
000   PROGRAM MAIN
        SUM SQ=0
        DO 101, I=1,9999999
          IF (SUM SQ.GT.1000) GO TO 109
          SUM SQ=SUM SQ+I**2
101     CONTINUE
109     CONTINUE
      END
1983: Ada 83 and above
procedure Main is
  Sum_Sq : Integer := 0;
begin
  for I in 1 .. 9999999 loop
    if Sum_Sq <= 1000 then
      Sum_Sq := Sum_Sq + I**2;
    end if;
  end loop;
end Main;
External links
For loop implementation in different languages at Wikia:Code