                                 Compiled with        Compiled with
                                 -Ospace, -O3         -Ospace, -O3, --thumb
Execution Time (microseconds)    2.467                3.050
Program Size* (bytes)            134                  104

*Program size reflects object file that includes main and all associated data, but not startup code.

Table 3: Checksum simulation comparisons using --thumb.

bar and main compiled with -O1:

bar
        B       foo
main
        STR     r14, [r13, #-0x0004]!
        BL      bar
        MOV     r0, #0x00000000
        LDR     r14, [r13], #0x0004
        BX      r14
Interprocedural optimizations [10] differ from other compiler optimizations in that they are performed by analyzing the entire program or file rather than individual functions or blocks of code. One example already mentioned is inline function expansion [8]. Other general examples include removing code that never executes or is redundant (also known as “dead code elimination”), simplifying loops, and using memory more efficiently.
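As a small illustration of dead code elimination, the following C sketch places a function containing an unused computation and an unreachable branch next to the form an optimizer would be allowed to reduce it to. The names DEBUG_ENABLED, scale_as_written and scale_reduced are hypothetical and do not come from the original note, and whether a particular compiler performs exactly this reduction depends on the compiler and the options used.

```c
/* Hypothetical example: all names here are illustrative only. */

#define DEBUG_ENABLED 0   /* compile-time constant, so the branch below never runs */

/* The function as a programmer might write it. */
int scale_as_written(int x)
{
    int unused = x * 37;      /* dead store: the result is never read */
    (void)unused;             /* only silences an unused-variable warning */

    if (DEBUG_ENABLED) {      /* always false: the body is unreachable */
        x = x + 1000;
    }
    return x * 2;
}

/* What is left once the dead store and the unreachable branch are removed.
   The observable behaviour is the same for every input. */
int scale_reduced(int x)
{
    return x * 2;
}
```

Both functions return the same value for any argument, which is the property that makes the removal safe.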
Many interprocedural optimizations have to do with optimizing function calls. A function call can be considered to be a “tail-call” if it is the last meaningful thing done by the calling function (other than the function’s return instruction). Tail-call optimization can eliminate unnecessary returns and stack accesses in function hierarchies. Consider the following C code:
int foo() {
    return 1;
}

int bar() {
    int x = foo();
    return x;
}

The first noticeable difference between the outputs is that, at -O0, the compiler sometimes performs insignificant and wasteful operations in order to follow the procedure call standard mentioned before. Understanding this is not important here, but more information can be found by exploring the standard [1]. The tail-call optimization attempts to replace branches that carry return-address overhead with direct branches that do not need to return. Examining the first output shows that the call to foo in bar requires the return address register (r14) to be pushed on the stack first so that the BL instruction can safely overwrite it. This is not needed with a direct branch, which can be used because there is no need to return to bar from foo. All of this saves stack and code space and decreases execution time. Also of note are the stack instructions using r13 (the register defined by the procedure call standard as the stack pointer) and the “!” signifier. The “!” indicates that the stack pointer is updated with the address calculated for the access.
The results of the simulation can be seen in Table 4.
The function foo still returns normally so that other functions can call it and return correctly. The transformations done by the tail-call optimization only apply to the calling function. Again, the other differences seen are the result of the compiler adhering to the procedure call standard.
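To make the definition of a tail call concrete, the C sketch below contrasts a call that is in tail position with one that is not. The function names add_one, tail_caller and non_tail_caller are hypothetical and are not part of the original example; the comments describe what a compiler may do, not what any specific compiler is guaranteed to emit.

```c
/* Hypothetical helper; the name is illustrative only. */
static int add_one(int v)
{
    return v + 1;
}

/* Tail call: calling add_one is the last meaningful thing this function
   does, and its result is returned unchanged. A compiler may replace the
   call with a direct branch, so add_one returns straight to our caller. */
int tail_caller(int v)
{
    return add_one(v);
}

/* Not a tail call: the multiplication still has to happen after add_one
   returns, so this function must keep its own return address and regain
   control before returning. */
int non_tail_caller(int v)
{
    return add_one(v) * 2;
}
```

In both cases add_one itself is unchanged and returns normally; only the calling function differs in whether it can hand over control with a plain branch.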
int main() {
    int y = bar();
    return 0;
}
Tail-call optimization is only done by the ARM C compiler when using -O1 or higher. When compiling this with -O0, the compiler produces the following assembly:
foo
        MOV     r0, #0x0000001
        BX      r14
bar
        STR     r14, [r13, #-0x0004]!
        BL      foo
        MOV     r1, r0
        MOV     r0, r1
        LDR     r14, [r13], #0x0004
        BX      r14
main
        STR     r14, [r13, #-0x0004]!
        BL      bar
        MOV     r2, r0
        MOV     r0, #0x00000000
        LDR     r14, [r13], #0x0004
        BX      r14
Compare this with the output using -O1:
foo
        MOV     r0, #0x0000001
        BX      r14
                                 Compiled with -O0    Compiled with -O1
Execution Time (microseconds)    0.217                0.050
Program Size* (bytes)            56                   32

*Program size reflects object file that includes main and all associated data, but not startup code.

Table 4: Bar simulation comparisons.
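The figures in Table 4 can be cross-checked against the assembly listings. Counting instructions gives 14 at -O0 (2 in foo, 6 in bar, 6 in main) and 8 at -O1 (2 in foo, 1 in bar, 5 in main). Assuming every instruction is a 4-byte ARM-state instruction and that the object file holds no additional data, these counts appear consistent with the reported 56 and 32 bytes. The helper names in the following sketch are hypothetical and serve only to spell out the arithmetic.

```c
/* Assumption: ARM state, where every instruction is 4 bytes wide. */
#define ARM_INSTRUCTION_BYTES 4u

/* Code size in bytes of a listing that contains only ARM-state
   instructions and no literal data. */
unsigned code_size_bytes(unsigned instruction_count)
{
    return instruction_count * ARM_INSTRUCTION_BYTES;
}

/* Ratio of two execution times given in the same unit, for example the
   -O0 and -O1 times from Table 4. */
double speedup(double time_before, double time_after)
{
    return time_before / time_after;
}
```

With the Table 4 values, code_size_bytes(14) and code_size_bytes(8) reproduce the two program sizes, and speedup(0.217, 0.050) comes to roughly 4.3, so the -O1 build runs in under a quarter of the -O0 time for this small example.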
It is important to find the best combination of options for a specific application during the development process. The following methodology is one way to do this; similar approaches can also be taken. The first step is to analyze the goals of the option selection process and decide which criteria matter most in meeting those goals. Some criteria include:
Usually measured in MIPS or microseconds, it is the speed at which the processor can execute instructions, and it depends on a number of hardware and software factors.
The application code size is the amount of memory required to hold the entire application code, including all associated data and
References: