Man or boy test

The man or boy test was proposed by computer scientist Donald Knuth as a means of evaluating implementations of the ALGOL 60 programming language. The aim of the test was to distinguish compilers that correctly implemented "recursion and non-local references" from those that did not.

"There are quite a few ALGOL60 translators in existence which have been designed to handle recursion and non-local references properly, and I thought perhaps a little test-program may be of value. Hence I have written the following simple routine, which may separate the man-compilers from the boy-compilers."

Knuth's example

begin
  real procedure A(k, x1, x2, x3, x4, x5);
  value k; integer k;
  begin
    real procedure B;
    begin k := k - 1;
          B := A := A(k, B, x1, x2, x3, x4);
    end;
    if k <= 0 then A := x4 + x5 else B;
  end;
  outreal(A(10, 1, -1, -1, 1, 0));
end;

This creates a tree of B call frames that refer to each other and to the containing A call frames, each of which has its own copy of k that changes every time the associated B is called. Trying to work it through on paper is probably fruitless, but the correct answer is −67, despite the fact that in the original paper Knuth conjectured it to be −121. The survey paper by Charles H. Lindsey mentioned in the references contains a table for different starting values. The results for k from 0 to 26 are tabulated below (OEIS sequence A132343); even modern machines quickly run out of stack space as k grows.[2]

k     A(k, 1, -1, -1, 1, 0)
0     1
1     0
2     -2
3     0
4     1
5     0
6     1
7     -1
8     -10
9     -30
10    -67
11    -138
12    -291
13    -642
14    -1,446
15    -3,250
16    -7,244
17    -16,065
18    -35,601
19    -78,985
20    -175,416
21    -389,695
22    -865,609
23    -1,922,362
24    -4,268,854
25    -9,479,595
26    -21,051,458
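The first entries of the table can be reproduced with a short Python sketch. It is a transliteration of Knuth's routine rather than his code: the names a, b and man_or_boy are chosen here for illustration, the numeric constants are wrapped in zero-argument lambdas to imitate call by name, and the recursion limit is an assumed value that is comfortably large for k up to 10.

```python
import sys

# Deep recursion is inherent to the test. The limit is an assumption chosen
# to be comfortably sufficient for k up to 10; larger k needs far more stack.
sys.setrecursionlimit(10_000)


def a(k, x1, x2, x3, x4, x5):
    """Python rendering of procedure A; x1..x5 are zero-argument callables."""

    def b():
        nonlocal k  # b mutates the k of the a-activation that created it
        k -= 1
        # b itself is passed as a reference here, not called.
        return a(k, b, x1, x2, x3, x4)

    return x4() + x5() if k <= 0 else b()


def man_or_boy(k):
    """Evaluate A(k, 1, -1, -1, 1, 0) with the constants wrapped as thunks."""
    return a(k, lambda: 1, lambda: -1, lambda: -1, lambda: 1, lambda: 0)


if __name__ == "__main__":
    print([man_or_boy(k) for k in range(11)])
```

Each call to a creates a fresh b bound to that call's own k, which is the property the test exercises.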

Explanation

There are three Algol features used in this program that can be difficult to implement properly in a compiler:

  1. Nested function definitions: Since B is being defined in the local context of A, the body of B has access to symbols that are local to A — most notably k which it modifies, but also x1, x2, x3, x4, and x5. This is straightforward in the Algol descendant Pascal, but not possible in the other major Algol descendant C (without manually simulating the mechanism by using C's address-of operator, passing around pointers to local variables between the functions).
  2. Function references: The B in the recursive call A(k,B,x1,x2,x3,x4) is not a call to B, but a reference to B, which will be called only when it appears as x4 or x5 in the statement A:=x4+x5. This is straightforward in standard Pascal (ISO 7185), and also in C. Some variants of Pascal (e.g. Turbo Pascal) do not support function references, but when the set of functions that may be referenced is known beforehand (in this program it is only B), this can be worked around.
  3. Constant/function dualism: The x1 through x5 parameters of A may be numeric constants or references to the function B — the x4+x5 expression must be prepared to handle both cases as if the formal parameters x4 and x5 had been replaced by the corresponding actual parameter (call by name). This is probably more of a problem in statically typed languages than in dynamically typed languages, but the standard work-around is to reinterpret the constants 1, 0, and −1 in the main call to A as functions without arguments that return these values.
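The work-around described in the third point can be sketched in Python as follows. The names Thunk, const and add_by_name are hypothetical, introduced only for this illustration: every actual parameter becomes a zero-argument function, so the addition can treat a constant and a function reference uniformly.

```python
from typing import Callable

# A thunk is a zero-argument function standing in for a by-name parameter.
Thunk = Callable[[], int]


def const(value: int) -> Thunk:
    """Wrap a numeric constant so it can be passed where a function is expected."""
    return lambda: value


def add_by_name(x4: Thunk, x5: Thunk) -> int:
    """Counterpart of x4 + x5: each operand is evaluated by calling it."""
    return x4() + x5()


def some_function() -> int:
    """Stand-in for a function reference such as B (hypothetical body)."""
    return add_by_name(const(-1), const(-1))


total_constants = add_by_name(const(1), const(0))      # two constants
total_mixed = add_by_name(some_function, const(1))     # a function and a constant
```

The caller of add_by_name does not need to know which operands were constants and which were function references.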

These features are, however, not what the test is about; they are merely prerequisites for the test to be meaningful at all. The test is about whether the different references to B resolve to the correct instance of B: one that has access to the same A-local symbols as the B that created the reference. A "boy" compiler might, for example, compile the program so that B always accesses the topmost A call frame.
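The notion of a "correct instance" can be illustrated with a minimal Python sketch. The function enter_a is a hypothetical stand-in for one activation of A that hands back its own B; it is not part of Knuth's program.

```python
def enter_a(k):
    """Stand-in for one activation of A: returns the B bound to this k."""

    def b():
        nonlocal k  # refers to the k of the activation that created this b
        k -= 1
        return k

    return b


b_outer = enter_a(10)  # B belonging to an earlier activation
b_inner = enter_a(3)   # B belonging to a later activation

# Calling b_inner must not disturb the k seen by b_outer.
results = [b_inner(), b_inner(), b_outer()]  # [2, 1, 9]
```

An implementation that always resolved k in the most recently created activation would presumably give 0 for the last call instead of 9, which is the kind of error the test is designed to expose.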

References

  1. Donald Knuth (July 1964). "Man or boy?". Retrieved Dec 25, 2009.
  2. See Performance and Memory on the Rosetta Code Man or Boy Page