Man or boy test

From Wikipedia, the free encyclopedia

The man or boy test was proposed by computer scientist Donald Knuth as a means of evaluating implementations of the ALGOL 60 programming language. The aim of the test was to distinguish compilers that correctly implemented "recursion and non-local references" from those that did not.

I have written the following simple routine, which may separate the "man-compilers" from the "boy-compilers" - Donald Knuth[1].

Contents

[edit] Knuth's example

begin
  real procedure A (k, x1, x2, x3, x4, x5);
  value k; integer k;
  begin
    real procedure B;
    begin k:= k - 1;
          B:= A := A (k, B, x1, x2, x3, x4);
    end;
    if k <= 0 then A:= x4 + x5 else B;
  end;
  outreal (A (10, 1, -1, -1, 1, 0));
end;

This creates a tree of B call frames that refer to each other and to the containing A call frames, each of which has its own copy of k that changes every time the associated B is called. Trying to work it through on paper is probably fruitless, but the correct answer is −67, despite the fact that in the original paper Knuth conjectured it to be −121. The survey paper by Charles H. Lindsey mentioned in the references contains a table for different starting values. Even modern machines quickly run out of stack space for larger values of k:

k 0 1 2 3 4 5 6 7 8 9 10 11
A 1 0 -2 0 1 0 1 -1 -10 -30 -67 -138

[edit] Other languages

Charles H. Lindsey implemented the algorithm in ALGOL 68, and - as call by name is not necessary - the same algorithm can be implemented in many languages including Pascal and PL/I[2]. Here are implementations in C (which requires simulating closures), Haskell, Lisp, Python, Smalltalk[3], Ruby.

[edit] ALGOL 68

BEGIN
 PROC a = (REAL in k, PROC REAL xl, x2, x3, x4, x5) REAL:
 BEGIN
   REAL k := in k;
   PROC b = REAL:
   BEGIN k := k - 1;
         a(k, b, xl, x2, x3, x4)
   END;
   IF k<=0 THEN x4 + x5 ELSE b FI
 END;
 printf(($+2d.8d$, a(10, REAL:1, REAL:-1, REAL:-1, REAL:1, REAL:0)))
END

[edit] C

Even if closures are not available in a language, their effect can be simulated. This is what happens in the following C implementation:

/* man-or-boy.c */
 #include <stdio.h>
 #include <stdlib.h>
 
 // --- thunks
 typedef struct arg {
   int       (*fn)(struct arg*);
   int        *k;
   struct arg *x1, *x2, *x3, *x4, *x5;
 } ARG;
 
 // --- lambdas
 int f_1 (ARG* _) { return -1; }
 int f0  (ARG* _) { return  0; }
 int f1  (ARG* _) { return  1; }
 
 // --- helper
 int eval(ARG* a) { return a->fn(a); }
 #define ARG(...) (&(ARG){ __VA_ARGS__ })
 #define FUN(...) ARG(B,&k,__VA_ARGS__)
 
 // --- functions
 int B(ARG* a) {
   int A(ARG*);
   int k = *a->k -= 1;
   return A( FUN(a,a->x1,a->x2,a->x3,a->x4) );
 }
 
 int A(ARG* a) {
   return *a->k <= 0 ? eval(a->x4)+eval(a->x5) : B(a);
 }
 
 int main(int argc, char **argv) {
   int k = argc == 2 ? strtol(argv[1],0,0) : 10;
   printf("%d\n", A( FUN(ARG(f1),ARG(f_1),ARG(f_1),ARG(f1),ARG(f0)) ));
 }

[edit] C++

Uses "shared_ptr" smart pointers from Boost / TR1 to automatically deallocate objects. Since we have an object which needs to pass a pointer to itself to another function, we need to use "enable_shared_from_this".

#include <iostream>
#include <tr1/memory>
using std::tr1::shared_ptr;
using std::tr1::enable_shared_from_this;
 
struct Arg {
  virtual int run() = 0;
  virtual ~Arg() { };
};
 
int A(int, shared_ptr<Arg>, shared_ptr<Arg>, shared_ptr<Arg>,
      shared_ptr<Arg>, shared_ptr<Arg>);
 
class B : public Arg, public enable_shared_from_this<B> {
private:
  int k;
  const shared_ptr<Arg> x1, x2, x3, x4;
 
public:
  B(int _k, shared_ptr<Arg> _x1, shared_ptr<Arg> _x2, shared_ptr<Arg> _x3,
    shared_ptr<Arg> _x4)
    : k(_k), x1(_x1), x2(_x2), x3(_x3), x4(_x4) { }
  int run() {
    return A(--k, shared_from_this(), x1, x2, x3, x4);
  }
};
 
class Const : public Arg {
private:
  const int x;
public:
  Const(int _x) : x(_x) { }
  int run () { return x; }
};
 
int A(int k, shared_ptr<Arg> x1, shared_ptr<Arg> x2, shared_ptr<Arg> x3,
      shared_ptr<Arg> x4, shared_ptr<Arg> x5) {
  if (k <= 0)
    return x4->run() + x5->run();
  else {
    shared_ptr<Arg> b(new B(k, x1, x2, x3, x4));
    return b->run();
  }
}
 
int main() {
  std::cout << A(10, shared_ptr<Arg>(new Const(1)),
                 shared_ptr<Arg>(new Const(-1)),
                 shared_ptr<Arg>(new Const(-1)),
                 shared_ptr<Arg>(new Const(1)),
                 shared_ptr<Arg>(new Const(0))) << std::endl;
  return 0;
}

[edit] Haskell

Haskell is a pure language, so the impure effects of updating k must be wrapped in a state monad.

import Control.Monad.ST
import Data.STRef

type S s = ST s Integer

a :: Integer -> S s -> S s -> S s -> S s -> S s -> S s
a k x1 x2 x3 x4 x5 = a' where
  a' | k <= 0    = do { x4' <- x4; x5' <- x5; return (x4' + x5') }
     | otherwise = do { kr <- newSTRef k; b kr }
  b kr = do
    k <- readSTRef kr
    let k' = k - 1
    writeSTRef kr k'
    a k' (b kr) x1 x2 x3 x4

run k =
  runST (a k (return 1) (return (-1)) (return (-1)) (return 1) (return 0))


[edit] Java

We use anonymous classes to represent closures.

public class ManOrBoy {
    interface Arg {
        int run();
    }
 
    static int A(final int k, final Arg x1, final Arg x2, final Arg x3, final Arg x4, final Arg x5) {
        if (k <= 0) return x4.run() + x5.run();
        else {
            Arg b = new Arg() {
                int m = k;
                public int run() {
                    return A(--m, this, x1, x2, x3, x4);
                }
            };
            return b.run();
        }
    }
 
    public static void main(String[] args) {
        System.out.println(A(10, new Arg() { public int run() { return 1; } },
                             new Arg() { public int run() { return -1; } },
                             new Arg() { public int run() { return -1; } },
                             new Arg() { public int run() { return 1; } },
                             new Arg() { public int run() { return 0; } }));
    }
}

[edit] JavaScript

function A( k, x1, x2, x3, x4, x5 ) {
  function B() {
    k -= 1;
    return A( k, B, x1, x2, x3, x4 );
  }
  if( k <= 0 ) return x4() + x5();
  return B();
}
 
function lambda( value ) {
  return function() { return value };
}
 
alert( A( 10, lambda(1), lambda(-1), lambda(-1), lambda(1), lambda(0) ) );


[edit] Lisp

The clarity of the Common Lisp version of the program suffers because Common Lisp differentiates between a functional and a non-functional value for its variables (that's why function and funcall are needed). On the other hand, in Scheme, it being simpler and its invention more strongly influenced by Algol60, the solution is straight-forward.

[edit] Common Lisp

(defun a (k x1 x2 x3 x4 x5)
  (labels ((b ()
              (setq k (- k 1))
              (a k (function b) x1 x2 x3 x4)))
          (if (<= k 0)
              (+ (funcall x4) (funcall x5))
              (b))))
 
(a 10 (lambda () 1) (lambda () -1) (lambda () -1) (lambda () 1) (lambda () 0))

[edit] Scheme

(define (A k x1 x2 x3 x4 x5)
  (define (B)
    (set! k (- k 1))
    (A k B x1 x2 x3 x4))
  (if (<= k 0)
      (+ (x4) (x5))
      (B)))
 
(A 10 (lambda () 1) (lambda () -1) (lambda () -1) (lambda () 1) (lambda () 0))

[edit] Mathematica

This Mathematica code was derived from the Ruby example appearing below.

$RecursionLimit = 1665; (* anything less fails for k0 = 10 *)
 
a[k0_, x1_, x2_, x3_, x4_, x5_] :=
  Module[{k, b},
    k = k0;
    b = (k--; a[k, b, x1, x2, x3, x4]) &;
    If[k <= 0, x4[] + x5[], b[]]]
 
a[10, 1 &, -1 &, -1 &, 1 &, 0 &] (* => -67 *)

[edit] OCaml

OCaml variables are not mutable, so "k" is wrapped in a mutable object, which we access through a reference type called "ref".

let rec a k x1 x2 x3 x4 x5 =
  if k <= 0 then
    x4 () + x5 ()
  else
    let m = ref k in
    let rec b () =
      decr m;
      a !m b x1 x2 x3 x4
    in
    b ()
 
let () =
  Printf.printf "%d\n" (a 10 (fun () -> 1) (fun () -> -1) (fun () -> -1) (fun () -> 1) (fun () -> 0))

[edit] PL/I

morb: proc options (main) reorder;
 dcl sysprint file;

 put skip list(a((10), lambda1, lambda2, lambda3, lambda4, lambda5));

 a: proc(k, x1, x2, x3, x4, x5) returns(fixed bin (31)) recursive;
   dcl k                    fixed bin (31);
   dcl (x1, x2, x3, x4, x5) entry returns(fixed bin (31));

   b: proc returns(fixed bin(31)) recursive;
     k = k - 1;
     return(a((k), b, x1, x2, x3, x4));
   end b;

   if k <= 0 then
     return(x4 + x5);
   else
     return(b);
 end a;

 lambda1: proc returns(fixed bin (31)); return(1);  end lambda1;
 lambda2: proc returns(fixed bin (31)); return(-1); end lambda2;
 lambda3: proc returns(fixed bin (31)); return(-1); end lambda3;
 lambda4: proc returns(fixed bin (31)); return(1);  end lambda4;
 lambda5: proc returns(fixed bin (31)); return(0);  end lambda5;
end morb;

[edit] Python

import sys
sys.setrecursionlimit(1027)
 
def A(lk, x1, x2, x3, x4, x5 ):
    k = [lk]
    def B():
        k[0] -= 1
        return A(k[0], B, x1, x2, x3, x4)
    if k[0] <= 0:
        return x4() + x5()
    return B()
 
print A(10, lambda:1, lambda:-1, lambda:-1, lambda:1, lambda:0)

[edit] Perl

sub A {
    my ($k, $x1, $x2, $x3, $x4, $x5) = @_;
    my $B;
    $B = sub { A(--$k, $B, $x1, $x2, $x3, $x4) };
    if ($k <= 0) {
        return &$x4 + &$x5;
    }
    return &$B;
}
 
print A(10, sub{1}, sub {-1}, sub{-1}, sub{1}, sub{0} ) . "\n";

[edit] Ruby

Note: the lambda call can be replaced with Proc.new and still work.

def a(k, x1, x2, x3, x4, x5)
  b = lambda { k -= 1; a(k, b, x1, x2, x3, x4) }
  k <= 0 ? x4[] + x5[] : b[]
end
 
puts a(10, lambda {1}, lambda {-1}, lambda {-1}, lambda {1}, lambda {0})

[edit] Scala

This implemetation in Scala demonstrates the power of this statically typed language.

def A(in_k: Int, x1: =>Int, x2: =>Int, x3: =>Int, x4: =>Int, x5: =>Int): Int = {
    var k = in_k
    def B: Int = {
        k = k-1
        A(k, B, x1, x2, x3, x4)
    }
    if (k<=0) x4+x5 else B
}
println(A(10, 1, -1, -1, 1, 0))

[edit] Smalltalk

Number>>x1: x1 x2: x2 x3: x3 x4: x4 x5: x5
    | b k |
    k := self.
    b := [ k := k - 1. k x1: b x2: x1 x3: x2 x4: x3 x5: x4 ].
    ^k <= 0 ifTrue: [ x4 value + x5 value ] ifFalse: b
 
 10 x1: [1] x2: [-1] x3: [-1] x4: [1] x5: [0]


[edit] Tcl

There are two nontrivial features in the "man or boy" test. One is that the parameters x1 though x5 are in general going to be function calls that don't get evaluated until their values are needed for the addition in procedure A, which means that these in Tcl are going to be scripts, and therefore it is necessary to introduce a helper procedure C that returns a constant value. The other is that procedure B needs to refer to variables in the local context of its "parent" instance of procedure A. This is precisely what the upvar core command does, but the absolute target level needs to be embedded into the script that performs the delayed call to procedure B (upvar is more often used with relative levels).

proc A {k x1 x2 x3 x4 x5} {
    expr {$k<=0 ? [eval $x4]+[eval $x5] : [B \#[info level]]}
}
proc B {level} {
    upvar $level k k x1 x1 x2 x2 x3 x3 x4 x4
    incr k -1
    A $k [info level 0] $x1 $x2 $x3 $x4
}
proc C {val} {return $val}
interp recursionlimit {} 1157
A 10 {C 1} {C -1} {C -1} {C 1} {C 0}

The [info level 0] here is a sort of "self" idiom; it returns the command (with arguments) that called the current procedure.

Since the values of x1 through x4 are never modified, it is also possible to embed these as parameters of B, thereby slightly purifying the program:

proc AP {k x1 x2 x3 x4 x5} {expr {$k<=0 ? [eval $x4]+[eval $x5] : [BP \#[info level] $x1 $x2 $x3 $x4]}}
proc BP {level x1 x2 x3 x4} {AP [uplevel $level {incr k -1}] [info level 0] $x1 $x2 $x3 $x4}
proc C {val} {return $val}
interp recursionlimit {} 1157
AP 10 {C 1} {C -1} {C -1} {C 1} {C 0}

[edit] See also

[edit] References

  1. ^ Donald Knuth (Jul 1964). Man or boy?. Retrieved on May 2, 2007.
  2. ^ Charles H. Lindsey (Dec 1988). Block Structure and Environments. Retrieved on May 2, 2007.
  3. ^ "Man or boy" test.

[edit] External links