Talk:Three address code

From Wikipedia, the free encyclopedia

[edit] Example

The current example:

      i := 0                  ; assignment
L1:   if i < 10 goto L2       ; conditional jump
      goto L3                 ; unconditional jump
L2:   t0 := i*i
      t1 := &b                ; address-of operation
      t2 := t1 + i            ; t2 holds the address of b[i]
      *t2 := t0               ; store through pointer
      i := i + 1
      goto L1
L3:

Two areas of possible contention:

  • Line 5 explicitly computes the "address of" the array b, even though the array-name itself would suffice in C, because lvalues of type int[10] decay to int* in contexts such as this. I would prefer to keep the explicit address calculation, because after all TAC is not C, and we shouldn't assume that our TAC interpreter (or code generator, or whatever) follows the same rules as C when it comes to lvalue decay. The same argument would apply if we were converting an int to a double; the conversion should probably be explicit in the TAC representation. Also consider that readers of this article should not be expected to know the finer points of C. More pedagogical is better.
  • Line 6 does not multiply i by sizeof(int) before adding it to t1. I believe this is reasonable, for two reasons. One: pedagogical clarity. Readers should again not need to know the meaning of the sizeof operator in C. Adding a multiply-by-4 would also lengthen the example by one line, without contributing any new concepts. Two: real-world practicality. All common machines are word-addressible, so if this were the IR for a real compiler, there wouldn't be any "multiply" visible in the generated code. If we're obsessing over technical correctness, we can make the same argument that I rejected in line 5: in C, addition to a pointer is scaled by the size of the pointed-to element.*

I like the new example better than the old one; thanks, Ceroklis!

* - I would not object if the expression &b were changed to &b[0] for total type-correctness, but that's a pretty formidable-looking expression for something that's supposed to look simple. :) --Quuxplusone 04:33, 13 June 2007 (UTC)


I agree with your first point. TAC is not C and making explicit the fact that we are dealing with an address is clearer (even though b by itself would have no meaning).

However I cannot agree with your second point. The expression b[i] is an abbreviation for *(b + i) (pointer addition) which itself is an abbreviation for *(b + sizeof(int) * i) (integer addition). And this is exactly what a compiler will produce at this stage. More specifically:

- The fact that you have to take the size of array elements into account when computing their address is not particularly difficult to understand.

- The meaning of sizeof could be given in a comment, or an equivalent notation could be used.

- The address calculation should be explicit. As per you first point no C abbreviations should be used in TAC.

- You cannot assume that array elements will always be word sized (the current example would break with a char array), and interpreting addresses in TAC as referring to words instead of bytes can only lead to confusion and is fundamentally wrong (see my last point).

- Pedagogical value should not be at the expense of correctness.

- If using arrays renders the example too complicated (which I dispute) we may want to do without them.

- This change adds only one line to the example, which is not much. I also think that examples should be informative as a whole, and that maximizing the number of different constructs per line is not essential. After all the i := i+1 line doesn't bring much either.

- I don't know where you got the idea that modern machines are word addressible. The memory controller only deal with byte addresses, not word addresses. Sure you can avoid doing the multiplication explicitly on machines supporting indexed addressing but this is an optimization that is deferred until machine code generation and doesn't appear in the TAC, which will use the correct full address (which by your convention is a multiple of four).

- I agree that there is a risk for the reader to get lost in the technicalities of C, and that we should avoid it. But C is technical, and definitely not a beginner's language (although it is often wrongly taught to them), which is why it is arguably not a good language to write examples of general concepts in.

That's all, folks! Ceroklis 21:08, 13 June 2007 (UTC)

"I don't know where you got the idea that modern machines are word addressible." — All the machines with which I'm even vaguely familiar (PowerPC, ARM, x86, MIPS) are word-addressible. They also provide instructions to access the individual bytes and half-words of a word, and some of them even allow you to access misaligned word-sized objects.
If you feel that "pedagogical value should not be at the expense of correctness", then yet another solution would be to change the type of b to unsigned char[10], so that i wouldn't need to be scaled. Personally, I think a "technique of deliberate lying" (as Knuth puts it[1]) is a helpful teaching aid now and then. If the technical details aren't important, it's okay to leave them out at first and fill in the gaps later. IMHO, pretty much all of Wikipedia falls into the "at first" category. --Quuxplusone 03:17, 14 June 2007 (UTC)