Strength reduction
From Wikipedia, the free encyclopedia
Strength reduction is a compiler optimization where a function of some systematically changing variable is calculated more efficiently by using previous values of the function. In a procedural programming language this would apply to an expression involving a loop variable and in a declarative language it would apply to the argument of a recursive function. E.g.
- f x = ... (2**x) ... (f (x+1)) ...
becomes
- f x = f' x (2**x)
where
- f ' x z = ... z ... (f' (x+1) 2*z) ...
Here the expensive operation (2**x) has been replaced by the cheaper 2*z in the recursive function f'. This maintains the invariant that z = 2**x for any call to f'.
[edit] Strength reduction in loop address computation
One of the most important uses of the strength reduction is computing memory addresses inside a loop. For example a naive translation of the following loop would multiply y by 320 and add it to x in each iteration of the inner loop:
/* Initialize 320x200 array to zeros */ void foo(int *img) { for(int y=0; y<320; y++) for(int x=0; x<200; x++) img[y*320 + x] = 0; }
This is indeed the case with GCC with optimizations turned off:
.LFB3: pushl %ebp .LCFI0: movl %esp, %ebp .LCFI1: subl $8, %esp .LCFI2: movl $0, -4(%ebp) .L2: cmpl $319, -4(%ebp) jle .L5 jmp .L1 .L5: movl $0, -8(%ebp) .L6: cmpl $199, -8(%ebp) jle .L9 jmp .L4 # .L9 is the inner loop iteration, the code takes y (-4(%ebp)) and x (-8(%ebp)) # and computes the index based on them .L9: movl -4(%ebp), %edx movl %edx, %eax sall $2, %eax addl %edx, %eax sall $6, %eax addl -8(%ebp), %eax leal 0(,%eax,4), %edx movl 8(%ebp), %eax movl $0, (%eax,%edx) leal -8(%ebp), %eax incl (%eax) jmp .L6 .L4: leal -4(%ebp), %eax incl (%eax) jmp .L2 .L1: leave ret
With optimizations turned on the code gets compiled to much more efficient form:
.LFB3: pushl %ebp .LCFI0: xorl %ecx, %ecx movl %esp, %ebp .LCFI1: pushl %esi .LCFI2: movl 8(%ebp), %esi pushl %ebx .LCFI3: xorl %ebx, %ebx .L11: leal (%esi,%ecx), %eax movl $199, %edx .p2align 4,,15 # Inner loop is just the next four instructions # It computes address by adding 4 to the previous one (1 times size of integer) .L10: movl $0, (%eax) addl $4, %eax decl %edx jns .L10 incl %ebx # And to compute the address in the outer loop it doesn't use a multiplication, # but adds 1280 (320 * size of integer) to the previous address addl $1280, %ecx cmpl $319, %ebx jle .L11 popl %ebx popl %esi popl %ebp ret
This article was originally based on material from the Free On-line Dictionary of Computing, which is licensed under the GFDL.