Algorithm examples

From Wikipedia, the free encyclopedia

This article Algorithm examples supplements Algorithm and Algorithm characterizations.

Contents

[edit] An example: Algorithm specification of addition m+n

Choice of machine model:

This problem has not been resolved by the mathematics community. There is no "best", or "preferred" model. The Turing machine, while considered the sine qua non --the indispensible, the ultimate fall-back -- is notoriously opaque when confronted face-to-face. And different problems seem to require different models to study them. Many researchers have observed these problems, for example:

"The principal purpose of this paper is to offer a theory which is closely related to Turing's but is more economical in the basic operations" (Wang (1954) p. 63)
"Certain features of Turng machines have induced later workers to propose alternative devices as embodiments of what is to be meant by effective computability.... a Turing machine has a certain opacity, its workings are known rather than seen. Further a Turing machine is inflexible ... a Turing machine is slow in (hypothetical) operation and, usually complicated. This makes it rather hard to design it, and even harder to investigate such matters as time or storage optimization or a comparison between efficiency of two algorithms." (Melzak (1961) p. 281)
Shepherdson-Sturgis (1963) proposed their register-machine model because "these proofs [using Turing machines] are complicated and tedious to follow for two reasons: (1) A Turing machine has only one head... (2) It has only one tape...." They were in search of "a form of idealized computer which is sufficiently flexible for one to be able to convert an intuitive computational procedure into a program for such a machine" (p. 218).
"I would prefer something along the lines of the random access compuers of Angluin and Valiant [as opposed to the pointer machine of Schönhage]" (Gurivich 1988 p. 6)
"Showing that a function is Turing computable directly...is rather laborious ... we introduce an ostensibly more flexible kind of idealized machine, an abacus machine..." (Boolos-Burgess-Jeffrey 2002 p.45).

About all that one can insist upon is that the algorithm-writer specify in exacting detail (i) the machine model to be used and (ii) its instruction set.

Atomization of the instruction set:

The Turing machine model is primitive, but not as primitive as it can be. As noted in the above quotes this is a source of concern when studying complexity and equivalence of algorithms. Although the observations quoted below concern the Random access machine model -- a Turing-machine equivalent -- the problem remains for any Turing-equivalent model:

"...there hardly exists such a thing as an "innocent" extension of the standard RAM model in the uniform time measure; either one only has additive arithmetic, or one might as well include all multiplicative and/or bitwise Boolean instructions on small operands...." (van Emde Boas (1992) p. 26)
"Since, however, the computational power of a RAM model seems to depend rather sensitively on the scope of its instruction set, we nevertheless will have to go into detail...
"One important principle will be to admit only such instructions which can be said to be of an atomistic nature. We will describe two versions of the so-called successor RAM, with the successor function as the only artihmetic operation....the RAM0 version deserves special attention for its extreme simplicity; its instruction set consists of only a few one letter codes, without any (explicit) addressing." (Schönhage (1980) p.494)

Example #1: The most general (and original) Turing machine -- single-tape with left-end, multi-symbols, 5-tuple instruction format -- can be atomized into the Turing machine of Boolos-Burgess-Jeffrey (2002) -- single-tape with no ends, two symbols { B, | }, 4-tuple instruction format. This model in turn can be further atomized into a Post-Turing machine -- single-tape with no ends, two symbols { B, | }, 0- and 1-parameter instruction set.

Example #2: The RASP can be reduced to a RAM by moving its instructions off the tape and (perhaps with translation) into its finite-state machine "table" of instructions, the RAM stripped of its indirect instruction and reduced to a 2- and 3-operand "abacus" register machine; the abacus in turn can be reduced to the 1- and 2-operand Minsky (1967)/Shepherdson-Sturgis (1963) counter machine, which can be further atomized into the 0- and 1-operand instructions of Schönhage (and even a 0-operand Schönhage-like instruction set is possible!).

Cost of atomization:

Atomization comes at a (usually severe) cost: while the resulting instructions may be "simpler", atomization (usually) creates more instructions and the need for more computational steps. As shown in the following example the increase in computation steps may be significant (i.e. orders of magnitude -- the following example is "tame"), and atomization may (but not always, as in the case of the Post-Turing model) reduce the usability and readability of "the machine code". For more see Turing tarpit.

Example: The single register machine instruction "INC 3" -- increment the contents of register #3, i.e. increase its count by 1 -- can be atomized into the 0-parameter instruction set of Schönhage, but with the equivalent number of steps to accomplish the task increasing to 7; this number is directly related to the register number "n" i.e. 4+n):

Label Instruction mnemonic Contents of register A Contents of register-N Contents of register 3 Description
start:  ?  ? 36
Z 0  ? 36 Clear contents of A-register to Zero
A 1  ? 36 Increment A: Add one to the contents of A-register
A 2  ? 36 Increment A: Add one to the contents of A-register
A 3  ? 36 Increment A: Add one to the contents of A-register
N 3 3 36 Copy contents of A to N
L 36 3 36 Load register A with contents of register x whose address is in contained in N
A 37 3 36 Increment A: Add one to the contents of A-register
S 37 3 37 Store contents of A-register in register whose address is contained in N
done: etc.

More examples can be found at the pages Register machine and Random access machine where the addition of "convenience instructions" CLR h and COPY h1,h1 are shown to reduce the number of steps dramatically. Indirect-addressing is the other significant example.

[edit] Precise specification of Turing-machine algorithm m+n

As described in Algorithm characterizations per the specifications of Boolos-Burgess-Jeffrey (2002) and Sipser (2006), and with a nod to the other characterizations we proceed to specify:

(i) Number format: unary strings separated by single blanks e.g. "2,3" = B11B111B
(ii) Machine type: Turing machine: single-tape left-ended or no-ended, 2-symbol { B, | }, 4-tuple instruction format.
(iii) Head location: See more at "Implementation Description" below. A symbolic representation of the head's location in the tape's symbol string will put the current state to the right of the scanned symbol. Blank squares may be included in this protocol. The state's number will appear with brackets around it, or sub-scripted. The head is shown as state #2, to the right of the second-from-left "1" :
Example: B11[2]1B111B or B1121B111B
(iii) Location of "the program": This program will be in the finite state machine's TABLE and appear there in quadruple (4-tuple) format, suitably coded so the finite state machine can interpret them (see Footnote: Coding instructions for the finite state machine).
(iv) The program: Boolos-Burgess-Jeffrey (2002) do not offer us one, however, we can devise one. We will note that it is not be as efficient as it could be; however, it works. See "Formal Description" below. (A "better" program will be found in the second example.)

Per the discussion of Sipser (2006) (p. 157) we will provide the three levels of description:

High-level description:

The computation m+n adds two blocks of unary marks (strokes) to produce a single block that represents their sum. At the start the blocks are separated by a single blank. The machine will hunt for this blank and fill it in. This solid block of marks constitutes the sum+1. The machine will hunt for the far-right end of the sum and erase one (the extra) far-right mark. Finally, it will proceed to the left end and halt with the head on the left-most mark.

Implementation description:

Implementation specification:
(a) Initial number format: The numbers n and m will be represented as close-packed blocks of unary marks, where number "0" = no mark and no blank, "1" = 1 mark, etc. The numbers n and m will be separated by a single blank, both from each other and any other parameters that happen to be on the tape:
Example: 2+3 = B11B111B
Example: 0+3 = B111B
(b) Inital head location, initial state: Case m>0: Initially the machine will be scanning the leftmost 1 on the tape, e.g. parameter m. Case m=0: If parameter m is 0, or both m and n are 0, then the head will be where a mark would be if there were to be a mark, i.e. to the left of the spacer-blank. The initial state will be "1":
Example: 2+3 = B111B111B
Example: 0+3 = BB1111B
(c) Successful computation -- number format at Halt: The machine will halt on a single block of marks that represent the sum in the conventional sense. This will be true for m+0 and 0+n. In the case of 0+0 the sum will be "null", neither blank nor mark.
Example: 5 = B11111B
Example: 0 = BB
(d) Successful computation -- head location at Halt: The machine will halt scanning the left-most 1 in the block that represents the sum:
Example: 2+3 = 5 = B1k1111B
Example: 2+0 = 2 = B1k1BB
Example: 0+3 = 3 = BB1k11B
Example: 0+0 = 0 = BkB
(e) Unsuccessful computation: The machine will never halt or will halt in some non-standard configuration:
Example: Bk11111B, or B11k111B, or B11111kB, etc.
Implementation description
  • State 1: Test for 0+n and by implication 0+0; if the head starts over a blank square, move the head once right and halt. If the tape is marked move the tape right once and go to state 2.
  • State 2: If the tape is blank then the head has arrived at the space-blank between the two integers. Print a mark, then go to state 3. If the tape is marked, continue the hunt for a blank by moving the head to the right and looping through state 2.
  • State 3: State "2 & B" has previously found the spacer blank and filled it in with a mark. While the tape is marked continue moving the head to the right while hunting for the blank to the right of the right-most number "n". When the head finds this blank to the right of "n" (state 3 & 0), move the tape head left once and go to state 4.
  • State 4: If the scanned square is marked, then the head is at the far right end of the sum. Erase this extra mark, then loop back to 4. The head will now be on a blank so move the head left once and jump to state 5.
  • State 5: Hunt for the left end of the sum: while the scanned square is marked, move the head left and loop to state 5. When the head reaches the far left end it will be on a blank. Move the tape right once to be over the left-most mark and halt.

Formal description:

The state table written in (unreduced) quadruple-form:

state qj Scanned symbol Action next state qk state qn Scanned symbol Action next state qk
1 B R H 1 1 R 2
2 B P 3 2 1 R 2
3 B L 4 3 1 R 3
4 B L 5 4 1 E 4
5 B R H 5 1 L 5

Sample "run" of the Turing-machine program: The program has been turned on its side; time runs down the page. The position of the head is high-lighted yellow. The "complete configuration" or the "total machine/system state" (in the context of Blass-Gurevich's "evolution of the state") is shown in the far-right column -- the instruction-number (finite-state machine's "state") is the number inside the brackets to the right of the scanned symbol (rather than a subscript to the right, for technical reasons):

State qj 1 2 3 4 5
Scanned symbol B B B B B
B-action R P L L R
B-next state qk: H 3 4 5 H
State qj 1 2 3 4 5
Scanned symbol 1 1 1 1 1
1-Action R R R E L
1-next state qk: 2 2 3 4 5
Step Instruction Next Inst# Total state
0 start B B B B 1 1 B 1 1 1 B B B 1 1[1]1B111
1 R B B B B 1 1 B 1 1 1 B B B R 2 11[2]B111
2 R B B B B 1 1 B 1 1 1 B B B R 2 11B[2]111
3 P B B B B 1 1 1 1 1 1 B B B P 3 111[3]111
4 R B B B B 1 1 1 1 1 1 B B B R 3 1111[3]11
5 R B B B B 1 1 1 1 1 1 B B B R 3 11111[3]1
6 R B B B B 1 1 1 1 1 1 B B B R 3 111111[3]
7 R B B B B 1 1 1 1 1 1 B B B R 3 111111B[3]
8 L B B B B 1 1 1 1 1 1 B B B L 4 111111[4]
9 E B B B B 1 1 1 1 1 B B B B E 4 11111B[4]
10 L B B B B 1 1 1 1 1 B B B B L 5 11111[5]
11 L B B B B 1 1 1 1 1 B B B B L 5 1111[5]1
12 L B B B B 1 1 1 1 1 B B B B L 5 111[5]11
13 L B B B B 1 1 1 1 1 B B B B L 5 11[5]111
14 L B B B B 1 1 1 1 1 B B B B L 5 1[5]1111
15 L B B B B 1 1 1 1 1 B B B B L 5 B[5]11111
16 R B B B 0 1 1 1 1 1 B B B B R H 1[H]1111

(NOTE: The reader may find the equivalent Post-Turing machine program easier to read and simulate. In this example the head moves; just reverse L and R for the case when the tape moves):

Instruction #: 1 2 3 4 5 6 7 8 9 10 11 12
Instruction: J0 R J1 P R J1 L E L J1 R H
Jump-to #: 11 2 5 9

[edit] The "quality" of an algorithm: a "better" version of Turing-machine addition m+n

We will see from the example below that the following addition-algorithm is "better" for computing m+n than the one specified above, at least from the quality standpoint of requiring (i) less space (fewer instructions: 4 versus 5 states), and (ii) less time (9 versus 16 steps). Whether or not the algorithm below is better than the above in terms of reliability is another question. Because it does not explore the 2nd number (i.e. "n") it cannot halt in a "non-standard" or fail-to-compute-correctly state.

We conclude from this example that although the algorithms result in the same output (end state) they perform differently. In fact there are many other possible algorithms [need reference here: B-B-J 2002 discuss this somewhere in the text]. However comparisons and measurements of "quality" must be made relative to an a priori standard [need reference, quotes here: Knuth 1972 discusses this briefly].

High-level description #1:

Because the unary numbers to be added, e.g. m+n, are separated by a single blank the machine will hunt for this blank and fill it in. It will then move the head left to the blank at the left of the sum, move the head right once, erase the mark there, move the head right once again and halt. At the beginning a test is provided for the case of 0+n (and by implication, 0+0).


Formal description #3: The state table written in quadruple-form:

state qj Scanned symbol Action next state qk state qn Scanned symbol Action next state qk
1 B R H 1 1 R 2
2 B P 3 2 1 R 2
3 B R 4 3 1 L 3
4 B R H 4 1 E 4

An example run of the algorithm:

State qj 1 2 3 4
Scanned symbol B B B B
B-action R P R R
B-next state qk: H 3 4 H
State qj 1 2 3 4
Scanned symbol 1 1 1 1
1-Action R R L E
1-next state qk: 2 2 3 4
Step Instruction Next Inst# Total state
0 start B B B B 1 1 B 1 1 1 B B B 1 1[1]1B111
1 R B B B B 1 1 B 1 1 1 B B B R 2 11[2]B111
2 R B B B B 1 1 B 1 1 1 B B B R 2 11B[2]111
3 P B B B B 1 1 1 1 1 1 B B B P 3 111[3]111
4 L B B B B 1 1 1 1 1 1 B B B L 3 11[3]1111
5 L B B B B 1 1 1 1 1 1 B B B L 3 1[3]11111
6 L B B B B 1 1 1 1 1 1 B B B L 3 B[3]111111
7 R B B B B 1 1 1 1 1 1 B B B R 4 1[4]11111
8 E B B B B B 1 1 1 1 1 B B B E 4 B[4]11111
9 R B B B B B 1 1 1 1 1 B B B R H 1[H]1111

[edit] Example: Counter-machine versus Turing-machine computation

This example is even more detailed than the above, because here the counter machine is specified to be a simulation in an EXCEL spread sheet. Where the "user" puts the parameters in the spreadsheet does matter.

We also observe that, while superficially similar, the core of the two algorithms is truly different: in this example the counter machine has to increment "m" times its parameter "n". (i.e. to add "m"=three to "n", increment "n" three times). In the Turing machine case the machine just had to fill in the first blank it came to, then find its way "home". It could have halted after filling in if we had specified this as acceptable. But the counter machine algorithm does not have this option.

Thus we conclude that: the algorithm is machine dependent.

(i) Number format: Positive integers (and 0) expressed in conventional decimal notation, one (unbounded) decimal integer per register. Zero is considered to be numeral 0 -- the empty count, null, "blank B". The numerals themselves are simply abbreviations for unary counts within the registers:
"0" = B; "1" = |, "2" = | |, "3" = | | |
(iia) (Virtual, simulated) machine type: The machine to be used in the computation as a counter machine is equipped with only 3 registers (alternately: "Only three registers #0, #1, #2 will be used"). The instruction set will be the following three Minsky (1967) instructions:
INC [ r ] ; INCrement contents of register h, proceed to next instruction in sequence: [ r ] +1 → r , & [ state-machine instruction counter ]+1 → state-machine instruction counter
DEC [ r ] ;DECrement contents of register h, proceed to next instruction in sequence: [ r ] -1 → r , & [ state-machine instruction counter ] +1 → [ state-machine instruction counter
JZ [ h, z ] ;Jump if contents of register r is Zero then go to instruction z else proceed to next instruction in sequence.
H ; HALT
(ib) Target (actual) machine: An EXCEL spreadsheet run on an Intel Pentium processor decodes/interpretates the instructions and "runs" the (simulated) counter machine. The instructions -- one per cell -- are represented by symbol strings, for example, "I N C" or "i n c"; the instruction-operands are represented by decimal numeral symbols "3", "14", etc.
(iii) Head Location: No tape head is required. The instructions' operands -- one per cell -- are explicit names of the registers upon which the action will be taken and/or explicit jump-to addresses.
(iv) Location of the Program: The (virtual-) counter-machine program will be in its finite-state TABLE of instructions: In the EXCEL simulation this virtual program is located in the cells that represent the state-machine TABLE of instructions.
(iv) The program: A formal program listing is shown below.

High-level description:

Repeat until register #2 is 0:
While counting down the contents of register #2 to zero, count up the contents of register #1. The sum will accumulate in register #1.

Implementation description :

Implementation specification:
(a) Input format: The two arguments (m, n) of function "m+n" (e.g. "sum(m, n)") will be represented as integers expressed in conventional decimal notation, one (unbounded) decimal numeral per register. Unused registers will be considered to be blank/empty/of zero-count. Register #0 is reserved to hold the numeral “0” (blank, empty, zero-count). The following symbolism [ 5 ] means "contents of register 5":
Example: integer 5 in register "3", [ 3 ] = symbol "5" = | | | | |
(b) Initial location of arguments: The first argument "m" will be register #1, the second argument "n" in register #2.
Example: 2+3, [ 0 ]=0, [ 1 ] =m, [ 2 ] =n
(c) Halt upon successful computation: : If the function m+n successfully assigns a value Σ to the arguments (m, n) that are represented initially on the tape, then the machine will eventually halt with the numeral Σ, represented in conventional decimal symbols, in a single register and otherwise-empty registers:
Example: m+n=∑, [ 0 ] =0, [ 1 ] =∑, [ 2 ] =0
(d) Halt upon successful computation: register specified to hold output:In this case [c] the machine will halt with the computation (sum Σ) in register #1:
Example: 3+2=5, [ 0 ] =0, [ 1 ] =5, [ 2 ] =0
(e) Unsuccessful computation: If the function that is to be computed assigns no value to the arguments that are represented initially in the registers, then the machine either will never halt, or will halt in some nonstandard configuration:
Example: [ 0 ] =3, [ 1 ] =0, [ 2 ] = 42, etc.

Implementation description:

  • steps 1, 5: Test for n+0 and by implication 0+0: if register #2 is zero, then goto step 5 ’’done’’ else continue:
  • steps 2, 3: Decrement register #2 and increment register #1; register #1 is accumulationg counts that will, upon halt, represent the sum.
  • step 4: Unconditionally jump back to step 1 as follows: Test constant-register #0 for 0 (i.e. [ 0 ] =0) and upon (always-)successful test jump back to step 1.

Formal description:

The program is written in “assembly code” mnemonics to be interpreted by an EXCEL program. Either upper- or lower-case may be used, but the spellings are mandatory. The format of how and where to put the code will be seen below and in the example:

Conventional assembly code listing:

"m+n":
1 JZ (2,5) ;[ 2 ]=m: If [ 2 ]=0 then instruction 5 else continue
2 DEC (2) ;Decrement count in register #2: [ 2 ] -1 → 2
3 INC (1) ;[ 1 ] =n or sum: Increment count in register #1: [ 1 ] +1 → 1
4 JZ (0,1) ; IF [ 0 =0 then jump to instruction 1 else continue.
5 HALT

The instruction "operation-codes" (assembly mnemonics) are to appear from left to right below their instruction numbers. The instruction "operands" (numerals) are to appear from left to right in separate cells below their mnemonics, as shown:

Instruction #: 1 2 3 4 5
Instruction: JZ DEC INC JZ H
register #: 2 2 1 0
Jump-to #: 5 1

An example of the run of the program; time runs down the page:

Instruction #: 1 2 3 4 5
Instruction: JZ DEC INC JZ H
register #: 2 2 1 0
Jump-to #: 5 1
Next Inst#
Instruction register # Jump-to # reg 0 reg 1 reg 2
start 0 2 3 1
JZ 2 5 0 2 3 JZ 2
DEC 2 0 0 2 2 DEC 3
INC 1 0 0 3 2 U 4
JZ 0 1 0 3 2 JZ 1
JZ 2 5 0 3 2 JZ 2
DEC 2 0 0 3 1 DEC 3
INC 1 0 0 4 1 INC 4
JZ 0 1 0 4 1 JZ 1
JZ 2 5 0 4 1 JZ 2
DEC 2 0 0 4 0 DEC 3
INC 1 0 0 5 0 INC 4
JZ 0 1 0 5 0 JZ 1
JZ 2 5 0 5 0 JZ 5
H 0 0 0 5 0 H 5

[edit] Footnotes

  • Footnote: Coding instructions for the finite state machine (FSM):

Models of machine computation such as the Turing machine and counter-machine are considered to be, in a sense, real machines. That is, their finite cousins have to be buildable if somone were so inclined. In the literature, examples of coding for the models usually stop at the tabular level (tables of 4- or 5-tuples) for Turing machines and at the mnemonic level for the various Register machine models

Example for the Post-Turing machine model: { L, R, E, P, J0 xxx; J1 xxx; H }.

In his paper (1936) Turing went a level further and encoded his tabular 5-tuples into a string of numbers chosen from 1-7. But this encoding was for the instructions of his "universal machine" to be put on the tape. He did not go into much detail with regards to encoding instructions for the finite state machine.

The standard methods of bottom-level encoding are of only a few types and always use numbers, because symbols other than numbers can be easily translated into number-equivalents (cf Boolos-Burgess-Jeffrey):

  • Unary, e.g. "3" = | | |, or more likely if "0" = |, "3" = | | | |
  • One of n: where each column is similar to "a wire" with a condition of "1" or "0 = empty". A number is thus represented as an ordinal.
number 1st 2nd 3rd 4th 5th ... n+1th
0 1
1 1
2 1
3 1
... 1 ....
n 1
  • polynomial: base-n, e.g. binary, decimal
binary: n0*20 + n0*21 + n2*22 + ... + nm*2m
decimal: n0*100 + n0*101 + n2*102 + ... + nm*10m


The solution will depend on the nature of the "memory" of the FSM. If it is built from a typical binary Random access memory then the instruction "address" (number) will have to be encoded as binary. The instruction code can be chosen from whatever method is most easy to decode and "fits in the width of the memory".

This is not a trivial problem, and appears in the work of authors who are encoding for purposes of comparing algorithms [example?]. What they tend to do is use e.g. two consecutive 8-bit memory "cells" for every instruction (or e.g. 2 x 16 bit). Thus they might encode an instruction set as follows:

FSM encoding for up to 128 registers and 256 instructions with an 8-bit random-access memory with two consecutive 8-bit bytes:
*0hhhhhhh: the identifier-number/location/identifier of register h expressed in binary (0 through 127). Example: register #5 = 00000101
*xxxxxxxx: the jump-to address expressed in binary (0 through 255)
  • INC h = 00000001, 0hhhhhhh
  • DEC h = 00000010, 0hhhhhhh
  • JZ h,x = 1hhhhhhh, xxxxxxxx
  • HALT = 111----- are both expressed in binary

Thus the first byte always specifies the instruction type. And this type specifies where the 7-bit register number hhhhhhh and the jump-to address will be. But such coding complicates the finite state machine -- it must be smart enough to know (i) interleave and (ii) that if 011hhhhh occurs then the next instruction will be a jump-to address. And if a mistake is made, the machine can begin to "execute" operands instead of instructions.

[edit] References

References can be found at Algorithm