Talk:Switch statement
From Wikipedia, the free encyclopedia
Just out of curiosity does anyone actually know the mechanics behind a switch statement? In other words how does a switch statement look in assembly? is it a jump, or a continuous set of conditionals?
I suppose this illustrates what im asking
int foo (int i) {
switch (i) { case 1: return 1; case 2: return 2; ... case n: return n; default: return 0; }
}
look the same as a whole lot of:
if else if else if
or would it jump immediately to the correct code? (i.e. actually switch) if it does switch, could someone explain how this works?
155.232.128.10 10:19, 24 January 2007 (UTC)
- Naturally, it's not dictated by the language. (Or at least not by C). It is certainly correct for a compiler to emit a long chain of if, else if, else if... but it is possible to do much better. For example, the compiler can use the control variable as an index into a jump table. Of course in the example you cite above, the programmer would do much better simply to return n; without using a switch at all. ;) Psyno 02:00, 8 March 2007 (UTC)
- It's a good question, and actually I think the article could use a section on that from an expert. The reason I think it's worth mentioning is that it's possible that it affects the performance of the program. When you are able to use a switch statement, is that better performance-wise than simply using a chain of "if" statements? I think there should be a discussion on that in the article by someone knowledgeable. Even if the answer is no, it is still worth mentioning in the article that the switch statement exists for reasons like code readability and there is no performance difference.
- Mbarbier 05:14, 25 March 2007 (UTC)
- In most BASIC's, a 'case label' is 'evaluated' the same way an IF/ELSE condition is evaluated. Historically, this meant that the 'evaluation' routine was called on the expression. This allows you to put functions or variables as both the switch variable and the case variable: SELECT fn1(x), CASE fn2(y). This may then be extended to allow a range or sequence as the case 'label', and allows dynamic (data dependent) logic expressions, sometimes implemented in a very restricted form in other languages by the keyword "IN", as <IF 'a' IN myset>. In PASCAL, a 'case label' is normally implemented as a mapped constant. This allows the branching logic to be evaluated at compilation. This is equivilant to allowing IF statements only with comparison to a constant: <IF (a==5)>, which is a well-known optimisation. Since all programming languages are Turing Complete, any program can be written using either approach.150.101.166.15 (talk) 03:08, 20 November 2007 (UTC)
- In 2006, I had two examples in branch table showing how the equivalent of a switch statement worked in IBM Assembler. It was deleted by NicM unknown to me. I have just added a comment about the deletion in his Talk page.
Unfortunately the Switch statement suffers from a limitation in ANSI C because you can only 'branch' to labels within the function itself. This can be overcome to some extent by placing labels within the function that then call an external function. Sadly, there is a performance penalty for so doing because of the implied push/pop of stacks required for the function call. My first example showed that only 5 machine instructions were required to branch to a label within 4k of the branch table. The 2nd example showed how this could be extended to within 64K (actually 256k if labels were forced onto word boundaries) whilst at the same time halving the size of the branch table by using two byte offsets instead of actual branch instructions. By using 4 byte offsets (or addresses) extends this to a branch range of 2 gigabytes (or 8 gigabytes if aligned on word boundary). I repeat the deleted examples here:- The much longer example below assumes that the input consists of a random letter of the alphabet between A and Z in V1 and two numeric values V2 and V3. It performs the following steps:
* V1 is checked for for the values "A" (add, index 0), "S" (subtract, index 1), "M" (multiply, index 2) and "D" (divide, index 3); all other inputs are considered invalid. * a branch is made into the branch table to jump to a seperate code section for each operation.
The following is an IBM System/360 example. The instruction length is 4 for the branch instruction.
SR R1,R1 CLEAR REGISTER used for input value. LOOP IC R1,V1 INSERT CHARACTER "V1" into the cleared reg. IC R1,TABLE2(R1) "CONVERT" value (to 0,4,8,12 etc) B TABLE1(R1) GO TO APPROPRIATE SECTION OF PROGRAM * this is where the (4 byte) branch instructions start TABLE1 DS 0H *** BRANCH TABLE *** (Keep in order) B INVALID 0 (Any invalid character EG E,P,X,F)… B ADD 4 GO ADD V2 to V1 B SUBTRACT 8 GO SUBTRACT V2 FROM V1 B DIVIDE 12 GO and DIVIDE V1 BY V2 B MULTIPLY 16 GO and MULTIPLY V1 BY V2 * end of branch table ADD DS 0H *** ADD section of program *** * section of code to perform addition AP V2,V3 perform addition B FINISH end program SUBTRACT DS 0H * *** SUBTRACT section of program *** MULTIPLY DS 0H * *** MULTIPLY section of program *** DIVIDE DS 0H * *** DIVIDE section of program *** INVALID DS 0H *** HERE on error *** * TABLE2 DC 256AL1(0) TABLE of index values (set to "Invalid" = 00) ORG TABLE1+C'A' DC Al1(04) VALUE FOR "Add" to replace "A" ORG TABLE1+C'S' DC Al1(08) VALUE for "Subtract" to replace "S" ORG TABLE+C'D' DC Al1(12) VALUE FOR "Divide" to replace "D" ORG TABLE+C'M' DC Al1(16) VALUE FOR "Multiply" to replace "M" * V1 DS C Input value 1 V2 DS PL6 Input value 2 V3 DS PL2 Input Value 3 END
The above program achieves validation and direct branching to the processing routine without any conditional instructions in just 5 instructions all of which are "fast" instructions.
(The index values in "TABLE2" could have been defined as "self-validating" (at assembly time) and not requiring the branch instructions to be in any particular order but has been shown as above for simplicity).
Similar indexing techniques can be used to handle all manner of efficient processing using coded values initially derived from input streams.
As an alternative to TABLE1 containing actual machine instructions, it could instead have consisted of 2, 3 or 4 byte offsets or absolute addresses of locations in this (or other seperately compiled) program. If the program is small enough, even a one byte offset might suffice for each table entry and represent the offset from the first processing routine to the each of the others (However, a two byte offset will normally provide the best trade off in terms of "expandability" versus table size, permitting 64K of processing code between the first and last entry point.
SR R1,R1 CLEAR REGISTER used for input value. IC R1,V1 INSERT CHARACTER "V1" into the cleared reg IC R1,TABLE1(R1) "CONVERT" value (to 0,2,4,6 etc) ICM R1,3,TABLE3(R1) "CONVERT" value (to offset from Add * B ROUTINES(R1) GO TO APPROPRIATE SECTION OF PROGRAM * this is where the (4 byte) branch instructions start ROUTINES DS 0H *** Start of processing routines *** * Add follows ADD DS 0H *** ADD section of program *** * section of code to perform addition AP V2,V3 perform addition B FINISH end program SUBTRACT DS 0H * *** SUBTRACT section of program *** MULTIPLY DS 0H * *** MULTIPLY section of program *** DIVIDE DS 0H * *** DIVIDE section of program *** MULTIPLY DS 0H * *** MULTIPLY section of program *** INVALID DS 0H *** HERE on error *** * TABLE2 DC 256AL1(0) TABLE of index values (set to "Invalid" = 00) ORG TABLE1+C'A' DC Al1(02) VALUE FOR "Add" to replace "A" ORG TABLE1+C'S' DC Al1(04) VALUE for "Subtract" to replace "S" ORG TABLE+C'D' DC Al1(06) VALUE FOR "Divide" to replace "D" ORG TABLE+C'M' DC Al1(08) VALUE FOR "Multiply" to replace "M" * TABLE3 DC AL2(INVALID-ROUTINES) 00 *offset to invalid routine* DC AL2(ADD-ROUTINES) 02 DC AL2(SUBTRACT-ROUTINES) 04 DC AL2(DIVIDE-ROUTINES) 06 DC AL2(MULTIPLY-ROUTINES) 08 * V1 DS C Input value 1 V2 DS PL6 Input value 2 V3 DS PL2 Input Value 3 END
No extra instruction has been needed here (it still requires just 5 instructions to get to the entry point of the appropriate routine) but the branch table is replaced by a new table (TABLE3) containing two byte offsets that are indexed by the value extracted from TABLE2 (but now containing 0,2,4,6,or 8 instead of 0,4,8,12 or 16 in previous example).
In this example, "TABLE3" is therefore only 50% of the size of the original branch table (TABLE1) although in this case saving a mere 2 x 4 =16 bytes! For a larger range of possible input values requiring a larger number of routines, this saving might be more significant.
[edit] History
The conversion of "raw data" input into coded values stored by the computer was commonplace in the early days of computing but has rapidly declined with the advent of cheap memory, cheap hard discs and cheap processing.
Its advantages are many and efficiencies great. Once a "value" has been converted to an index, it can be used to retrieve the value efficiently from a table or database at any time in the future. For example, there are a relatively few Countries in the world and so can easily be represented by a country code (as used in the international public telephone system). This code can be stored simply as a country number in computer records rather than a text string such as "Central African Republic" resulting in massive savings in storage and processing time. Numeric comparisons are significantly faster than text string comparisons and indexed retrievals significantly faster than string searches or retrieval from a file using a (much longer) "key".
Unfortunately, the Millenium bug has, in part, discredited the coding of raw data. This is a pity since the issue of small real memory is almost redundant but efficient processing is always an advantage. Classifying natural things into a much smaller number of meaningful categories has always been a goal of mankind since the dawn of history. An recent excellent example of this is coding for one of any of 16 million colours using a simple code of just 3 x one byte for each primary colour in range 0-255. " Kdakin3420 (talk) 06:32, 4 February 2008 (UTC)
[edit] Wikicode
Does wiki-language have a switch statement useble within a template definition? I need to convert numbers to words an also I need to convert numbers like 4 to 04. Thanks. O'RyanW (☺ ₪) 23:33, 21 April 2007 (UTC)