Bounds checking

From Wikipedia, the free encyclopedia

Computer programming

In computer programming, bounds checking is any method of detecting whether a given index lies within the limits of an array. For example, accessing index 25 of an array of size 10 would be caught by bounds checking as invalid, because the index lies outside the array's declared size.
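The check described above can be sketched in a few lines of Python. The function names `in_bounds` and `checked_get` are hypothetical, chosen for illustration, and the sketch assumes a zero-based array. It deliberately rejects negative indices, unlike Python's built-in list indexing, which treats them as offsets from the end.

```python
def in_bounds(index: int, length: int) -> bool:
    """Return True if index is a valid position in an array of the given length."""
    # Valid indices for a zero-based array are 0, 1, ..., length - 1.
    return 0 <= index < length


def checked_get(array: list, index: int):
    """Return array[index], refusing any index outside 0..len(array)-1."""
    if not in_bounds(index, len(array)):
        raise IndexError(
            f"index {index} out of bounds for array of size {len(array)}"
        )
    return array[index]
```

With an array of size 10, `in_bounds(25, 10)` is false, so `checked_get` refuses the access instead of reading past the end of the array.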

A failed bounds check usually raises an exception.
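As one illustration, Python's built-in lists raise `IndexError` when an index is out of range, and the caller can catch it. The helper `read_or_default` below is a hypothetical name used only for this sketch. Note that Python accepts negative indices as offsets from the end, so only indices beyond either end of the list trigger the exception.

```python
def read_or_default(array, index, default=None):
    """Return array[index], or default if the index fails the bounds check."""
    try:
        return array[index]
    except IndexError:
        # The language runtime detected an out-of-range index.
        return default
```

Calling `read_or_default([0] * 10, 25, "missing")` returns `"missing"` rather than terminating the program.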

Because checking bounds on every array access adds run-time overhead, which some applications cannot accept, compilers can eliminate the checks in many common cases where they are provably unnecessary; see bounds checking elimination.
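The reasoning behind removing a redundant check can be sketched by hand. In the hypothetical functions below, the first version tests the index on every access, while the second makes one test before the loop that covers every access. This only illustrates the kind of argument a compiler makes; it is not how any particular compiler implements it, and Python itself still performs its own internal check on each `array[i]`.

```python
def sum_prefix_checked(array, n):
    """Sum the first n elements, checking the index on every access."""
    total = 0
    for i in range(n):
        if not 0 <= i < len(array):  # per-access bounds check
            raise IndexError(i)
        total += array[i]
    return total


def sum_prefix_hoisted(array, n):
    """Same result, with a single check made before the loop."""
    # i ranges over 0..n-1, so i >= 0 always holds, and i < len(array)
    # holds for every i exactly when n <= len(array).
    if n > len(array):
        raise IndexError(len(array))
    total = 0
    for i in range(n):
        total += array[i]
    return total
```

Both versions return the same sum for valid input and raise `IndexError` when `n` exceeds the array length; the second does so with one comparison instead of `n`.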

Many programming languages, such as C, never perform automatic bounds checking, in the interest of efficiency. This has been a source of innumerable errors, especially off-by-one errors and buffer overflows. Although omitting bounds checks is necessary in some scenarios, many programmers believe these languages sacrifice too much safety in pursuit of fast execution. In his 1980 Turing Award lecture, C. Antony R. Hoare described his experience designing an implementation of Algol 60 that included bounds checking. He said:

A consequence of this principle is that every occurrence of every subscript of every subscripted variable was on every occasion checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interest of efficiency on production runs. Unanimously, they urged us not to - they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980, language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law.

Mainstream languages that enforce run-time bounds checking include Ada, Visual Basic, Java and C#. The D programming language has run-time bounds checking that can be enabled or disabled with a compiler switch. C# also supports unsafe regions: sections of code that (among other things) suspend bounds checking in the interest of efficiency. This is useful for speeding up small time-critical bottlenecks without sacrificing the safety of the entire program.


Data quality

In the context of data collection and data quality, bounds checking means verifying that each value lies within a plausible range and so is not trivially invalid. For example, a percentage measurement must be in the range 0 to 100; the height of an adult person must be in the range 0 to 3 meters.
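A range check of this kind might look like the following sketch. The limits come from the two examples above; the field names `percentage` and `adult_height_m` and the function `out_of_bounds_fields` are hypothetical, introduced only for illustration.

```python
# Plausible ranges from the examples in the text; field names are assumptions.
LIMITS = {
    "percentage": (0, 100),
    "adult_height_m": (0, 3),
}


def out_of_bounds_fields(record: dict) -> list:
    """Return the names of fields whose values fall outside their limits."""
    bad = []
    for field, (low, high) in LIMITS.items():
        value = record.get(field)
        # Missing fields are skipped; only present values are range-checked.
        if value is not None and not (low <= value <= high):
            bad.append(field)
    return bad
```

A record with a percentage of 120 would be flagged, while a height of 1.75 meters would pass. Such a check catches only trivially invalid values: a height of 2.9 meters passes even though it is almost certainly wrong.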

Reference

  • C. A. R. Hoare, "The Emperor's Old Clothes", The 1980 ACM Turing Award Lecture, Communications of the ACM, volume 24, number 2, February 1981, pp. 75-83.