
SPO600 Lab 2 - Comparing different compilation options

This lab compares the code generated by the GCC compiler under different sets of compiler options, as well as with small changes to the source code. The requirements for this lab, as well as the differences between the files/compilations, can be found here

### Change 1

File size change: The file size went from 25 KB to 10 MB.

Change in headers: Running the command "file" against the generated files shows that the first one is "Dynamically linked", while the second one is "Statically linked".

Function call: the <main> function in the second case calls a subroutine called **\_IO\_printf**, which is located inside the file. In the original file, the function called is **printf@plt**, which jumps to a subroutine called **printf@GLIBC_2.2.5** that is not inside the file.

### Change 2

Without the -fno-builtin parameter, the function called was not "printf", but "puts". With builtins enabled, GCC is allowed to replace a printf call whose format string has no format specifiers and ends in a newline with the simpler puts.

### Change 3

Without the -g compiler option, the generated file does not have the flag "with debug_info" in the headers. It is also slightly smaller in size. There is a noticeable difference in the disassembly output: in the original file, we can see the source lines of the main program interleaved with the assembly code.

```
000000000000068a <main>:
#include <stdio.h>

int main() {
 68a:   55                      push   %rbp
 68b:   48 89 e5                mov    %rsp,%rbp
    printf("Hello World!\n");
 68e:   48 8d 3d 9f 00 00 00    lea    0x9f(%rip),%rdi        # 734 <_IO_stdin_used+0x4>
 695:   b8 00 00 00 00          mov    $0x0,%eax
 69a:   e8 c1 fe ff ff          callq  560 <printf@plt>
 69f:   b8 00 00 00 00          mov    $0x0,%eax
}
 6a4:   5d                      pop    %rbp
 6a5:   c3                      retq
 6a6:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
 6ad:   00 00 00
```

While the other file only has the assembly code:

```
000000000000068a <main>:
 68a:   55                      push   %rbp
 68b:   48 89 e5                mov    %rsp,%rbp
 68e:   48 8d 3d 9f 00 00 00    lea    0x9f(%rip),%rdi        # 734 <_IO_stdin_used+0x4>
 695:   b8 00 00 00 00          mov    $0x0,%eax
 69a:   e8 c1 fe ff ff          callq  560 <printf@plt>
 69f:   b8 00 00 00 00          mov    $0x0,%eax
 6a4:   5d                      pop    %rbp
 6a5:   c3                      retq
 6a6:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
 6ad:   00 00 00
```

### Change 4

My CPU is an x86_64, so you can take a look at the registers for my machine in this link.

The format string is the first argument to printf and goes in **rdi**, as in the listing for Change 3. The first integer argument was moved into the **esi** register; the second and third were moved into the **edx** and **ecx** registers; the fourth and fifth were moved into **r8d** and **r9d**; and the remaining arguments were pushed directly onto the stack with **pushq**. This behaviour confirms what was described in the page I linked: *First six arguments are in rdi, rsi, rdx, rcx, r8d, r9d; remaining arguments are on the stack.*

```
 68e:   48 83 ec 08             sub    $0x8,%rsp
 692:   6a 00                   pushq  $0x0
 694:   6a 09                   pushq  $0x9
 696:   6a 08                   pushq  $0x8
 698:   6a 07                   pushq  $0x7
 69a:   6a 06                   pushq  $0x6
 69c:   41 b9 05 00 00 00       mov    $0x5,%r9d
 6a2:   41 b8 04 00 00 00       mov    $0x4,%r8d
 6a8:   b9 03 00 00 00          mov    $0x3,%ecx
 6ad:   ba 02 00 00 00          mov    $0x2,%edx
 6b2:   be 01 00 00 00          mov    $0x1,%esi
```

After some research, I found out why **rsp** is first decreased by 8 (the **sub $0x8,%rsp** line). Each push moves **rsp** by 8 bytes, and the stack has to be aligned to a multiple of 16 bytes when a function is called. We pushed 5 values, which is 40 bytes and not a multiple of 16, so **rsp** is moved by an extra 8 bytes to bring the total to 48.

### Change 5

The changes in the disassembly are very simple. In the *<main>* section, instead of a call to the *printf* function, we have a call to the *output* function:

```
--- Original:
 68b:   48 89 e5                mov    %rsp,%rbp
    printf("Hello World! %d\n",
 68e:   be 01 00 00 00          mov    $0x1,%esi

--- Changed:
 6a3:   48 89 e5                mov    %rsp,%rbp
    output();
 6a6:   b8 00 00 00 00          mov    $0x0,%eax
```

Additionally, we have a subroutine called *<output>*:

```
000000000000068a <output>:
#include <stdio.h>

void output() {
 68a:   55                      push   %rbp
 68b:   48 89 e5                mov    %rsp,%rbp
    printf("Hello World!\n");
 68e:   48 8d 3d af 00 00 00    lea    0xaf(%rip),%rdi        # 744 <_IO_stdin_used+0x4>
 695:   b8 00 00 00 00          mov    $0x0,%eax
 69a:   e8 c1 fe ff ff          callq  560 <printf@plt>
}
 69f:   90                      nop
 6a0:   5d                      pop    %rbp
 6a1:   c3                      retq
```

### Change 6

The total size of the file is slightly larger, but according to the documentation, the **O3** option is a tradeoff between file size and speed: we may end up with a file that is larger, but the code should run faster.

One interesting change in the main function is this line:

```
 695:   b8 00 00 00 00          mov    $0x0,%eax
```

being replaced with this line:

```
 58b:   31 c0                   xor    %eax,%eax
```

The XOR form is shorter (2 bytes instead of 5, as the two lines above show) and is at least as fast as the MOV, and in this case they do the same thing. The first case erases everything in register %eax (replaces it with 0) by moving the number 0 into it. The second case runs the value against itself on an XOR gate, which also results in the register being cleared. Here is why.

This is the truth table of an XOR gate:

```
A B Output
0 0 0
1 0 1
0 1 1
1 1 0
```

Basically: if the two input bits are the same, the output bit is 0. Suppose we have the value 01001 in the register. If we run this value against itself in an XOR gate, this is what we get:

```
01001
01001
-----
00000
```

This is a very fast way to clear a register.