Index checking

From Wikipedia, the free encyclopedia

In computer programming, much use is made of simple variables given names such as X, I, Enough, etc. A compiler, in generating the machine code, will have some scheme for assigning computer storage locations to hold the values of such variables. Whenever a name is mentioned, the compiler generates code that refers to the corresponding address, so no human need remember the various addresses, of which there may be confusingly many hundreds. Instead, one works with well-chosen names as an aid to memory.

As programmes become more complex, circumstances soon arise in which one creates a collection of similarly named variables and does much the same with each. Suppose that you find yourself dealing with X1, X2, X3; Y1, Y2, Y3; and A1, A2, A3 somewhat as follows:

     A1:=X1 + Y1;
     A2:=X2 + Y2;
     A3:=Y3 + X3;

Repetition is boring and error-prone. A formalism, indexing, is therefore introduced, whereby an array of elements is defined, with individual elements being selected by an index, usually an integer. In the above example there would be three arrays, X, Y and A, each declared to have three elements, and the code might become

     A(1):=X(1) + Y(1);
     A(2):=X(2) + Y(2);
     A(3):=Y(3) + X(3);

This is of course even more typing, but there are alternatives, such as

     for i:=1:3 do
      A(i):=X(i) + Y(i);
     next i;

This performs all three additions in the same order (X + Y), which is acceptable because addition is commutative: Y(3) + X(3) = X(3) + Y(3).

Index checking means that, wherever an expression indexes an array, the index value is first checked against the bounds that were established when the array was defined. Should the index be out of bounds, further execution is suspended via some sort of error.
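A minimal sketch of the idea in Python, assuming a hypothetical `CheckedArray` class (not part of any library) that records the declared bounds and compares every index against them. Python's own lists already check indices, but they accept negative values as counting from the end, so an explicit check shows the principle more plainly.

```python
class CheckedArray:
    """Array with declared bounds lower..upper; every access is checked."""

    def __init__(self, lower, upper, fill=0):
        if upper < lower:
            raise ValueError("upper bound is below lower bound")
        self.lower = lower
        self.upper = upper
        self._cells = [fill] * (upper - lower + 1)

    def _offset(self, index):
        # The index check: compare against the bounds fixed at definition.
        if not (self.lower <= index <= self.upper):
            raise IndexError(
                f"index {index} outside bounds {self.lower}..{self.upper}"
            )
        return index - self.lower

    def __getitem__(self, index):
        return self._cells[self._offset(index)]

    def __setitem__(self, index, value):
        self._cells[self._offset(index)] = value
```

With `x = CheckedArray(1, 3)`, storing into `x[3]` succeeds, whereas any reference to `x[4]` or `x[0]` raises `IndexError` instead of touching storage outside the array.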

In the example it would suffice to check the value of i once per iteration (because all the arrays have the same bounds). Better still would be to note that its values are determined by the loop control and to check that the loop limits are within bounds. In this case they are and, being constant, they could be checked by the compiler so that no check need be performed when the programme is run. Alas, compilers are likely to generate code to check the bounds for every usage of the index variable, bloating the code and slowing the run. Alternatively, you could be fortunate enough to be dealing with a system such as the Burroughs 6700, where all indexing was checked automatically by the hardware, since every INDX operation worked off a memory pointer that contained both a start point and a length.
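The single up-front check can be sketched as follows in Python. The function name `add_checked_once` and the 1-based convention are assumptions made for illustration; the point is only that the loop limits are compared with the array bounds once, before the loop starts, rather than at every use of `i`. (Python still checks each list access internally, so this shows the logic rather than any saving in run time.)

```python
def add_checked_once(x, y, first, last):
    """Return a where a(i) = x(i) + y(i) for i in first..last (1-based).

    The loop limits are validated once against the array size, so the
    loop body needs no per-element check of its own.
    """
    n = len(x)
    if len(y) != n:
        raise ValueError("arrays have different sizes")
    # One check covers every value i can take, because i only ever
    # runs from first to last. An empty range touches nothing.
    if first <= last and not (1 <= first and last <= n):
        raise IndexError(f"loop range {first}..{last} outside bounds 1..{n}")
    a = [0] * n
    for i in range(first, last + 1):
        a[i - 1] = x[i - 1] + y[i - 1]  # i is known to be in bounds here
    return a
```

Calling `add_checked_once([1, 2, 3], [10, 20, 30], 1, 3)` gives `[11, 22, 33]`, while a range of 1 to 4 is rejected before any element is touched.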

Although a loop such as the above is simple, mistakes are possible, especially when the programme is to be altered. Perhaps the array size is to be changed so that there are only two elements. If so, and the loop statement is not altered correspondingly, referring to X(3) will access storage belonging to something else, and storing into A(3) will overwrite storage assigned to something else with arbitrary consequences. For instance, the location corresponding to A(3) might be where the value 3 is stored, and replacing that value with some bit pattern that comes out as (say) 31415 might mean that the loop continues, thus overwriting a large area of memory with surely unhelpful values.
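The runaway loop can be imitated in Python by modelling storage as one flat list. This is a simulation under stated assumptions, not a description of any real machine: the layout (array A in the first two cells, the loop limit in the cell immediately after) and the name `run_unchecked` are invented for illustration, and the loop is stopped at the end of the simulated memory only so that the demonstration terminates.

```python
def run_unchecked(memory_size=16):
    """Simulate an unchecked loop that overruns a two-element array."""
    memory = [0] * memory_size
    a_base = 0          # A(1) and A(2) occupy cells 0 and 1 (assumed layout)
    limit_cell = 2      # the loop limit happens to be stored just after A
    memory[limit_cell] = 3   # stale limit: A was shrunk, the loop was not

    i = 1
    writes = 0
    # No index check: the only guard is the end of the simulated memory.
    while i <= memory[limit_cell] and a_base + i - 1 < memory_size:
        memory[a_base + i - 1] = 31415   # "A(i) := 31415"
        writes += 1
        i += 1
    return writes, memory
```

The third store lands on the cell holding the limit and replaces 3 with 31415, so the loop keeps going: with the default size, all 16 cells are overwritten rather than the intended two.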

A very useful technique is to "parameterise" your programme: instead of literal constants such as 3 scattered in many locations (where the array is defined, where it is used), a helpful computer language will enable you to define a name such as Nelements which has a constant value 3, and which you use wherever there might be a 3 pertaining to that array. Other occurrences of 3 might pertain to other matters entirely: a good choice of such names is helpful. Any subsequent need to change the size of the arrays requires merely adjusting the constant value specified, though it is still up to you to be sure that your programme will behave properly for the different size.
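A small sketch of such parameterisation in Python, where `N_ELEMENTS` stands in for the Nelements of the text and the helper names `make_array` and `add_arrays` are hypothetical. Python has no true named constants, so the upper-case name is only a convention signalling that the value should not be reassigned.

```python
N_ELEMENTS = 3  # the single place where the array size is stated


def make_array(fill=0):
    """Create an array whose size follows the named constant."""
    return [fill] * N_ELEMENTS


def add_arrays(x, y):
    """Add two arrays element by element, looping up to the same constant."""
    a = make_array()
    for i in range(N_ELEMENTS):  # the loop limit follows the constant too
        a[i] = x[i] + y[i]
    return a
```

Changing the arrays to two elements then means editing one line, since the definition and the loop both follow `N_ELEMENTS`.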

Better still is to be blessed with a computer language that can manipulate whole arrays, leaving no nitpicking detail to remember, as in

     A:=X + Y;

Here the compiler generates the necessary loops, equipped of course with certain knowledge of the sizes of all the arrays, and if the sizes are incompatible it will generate an error message.
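Python's built-in lists do not offer this directly (for lists, `+` means concatenation), but the behaviour can be approximated with a small helper. The name `array_add` is hypothetical, and the size check here happens when the programme runs rather than at compile time, so this is only a sketch of the idea.

```python
def array_add(x, y):
    """Element-wise sum of two equal-sized sequences, as in A := X + Y."""
    # The sizes are known here, so incompatibility is reported at once
    # instead of surfacing as a stray out-of-bounds reference later.
    if len(x) != len(y):
        raise ValueError(f"incompatible sizes: {len(x)} and {len(y)}")
    return [xi + yi for xi, yi in zip(x, y)]
```

The caller writes `a = array_add(x, y)` and never mentions an index at all, so there is no index to get wrong.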

Attending to these petty details is a simple clerical task. Computers excel at simple clerical tasks. Why not let the computer do it?