Henry S. Coelho

Loop in assembly for x86_64 and AArch64

In this post, I will show a loop in assembly for x86_64 and AArch64, and highlight some differences between the two platforms.

The program prints a message containing the loop index on every iteration. For the first 10 iterations (0 to 9), the leading zero is suppressed. Here is a sample of the output:

Loop: 0
Loop: 1
Loop: 2
...
Loop: 10
Loop: 11
Loop: 12
...
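
As a reference for what the assembly has to compute, here is a minimal C sketch of the same digit logic. It assumes the `msg` template from the listings ("Loop: " followed by two digit slots at offsets 6 and 7, then a newline). The helper name `format_loop_line` is hypothetical and is not part of the original programs.

```c
#include <assert.h>
#include <string.h>

/* Same layout as `msg` in the assembly listings: digit slots at 6 and 7. */
#define LOOP_TEMPLATE "Loop:   \n"

/* Hypothetical helper: writes the line for counter n (0..99) into out,
   which must hold at least 10 bytes (9 characters plus the NUL). */
void format_loop_line(unsigned n, char out[10])
{
    assert(n < 100);                      /* only two digit slots exist */

    char left  = (char)('0' + n / 10);    /* quotient  -> tens digit  */
    char right = (char)('0' + n % 10);    /* remainder -> units digit */

    if (n <= 9) {                         /* single digit: move it left, */
        left  = right;                    /* and pad the right slot      */
        right = ' ';                      /* with a space (ASCII 32)     */
    }

    memcpy(out, LOOP_TEMPLATE, sizeof LOOP_TEMPLATE);  /* includes the NUL */
    out[6] = left;
    out[7] = right;
}
```

Calling `format_loop_line(i, buf)` for `i` from 0 to 30 and printing `buf` each time should reproduce the output of the assembly programs, assuming they use the template above.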

Full source code

Below is the source code for both programs.

x86_64

.text

.global _start

start = 0
max = 31
sto = 1

_start:
    /* holds loop counter */
    mov    $start,%r15

loop:
    /* getting the quotient and remainder */
    /* into registers 11 (left) and 12 (right) */
    mov    $48,%r11
    mov    $48,%r12
    mov    $0,%rdx
    mov    $10,%r13
    mov    %r15,%rax
    div    %r13
    add    %rdx,%r12
    add    %rax,%r11

    /* shifting characters if < 10 */
    cmp    $9,%r15
    jg     continue
    mov    %r12,%r11
    mov    $32,%r12

continue:
    /* recording the numbers in the string */
    mov    %r11b,pld
    mov    %r12b,prd

    /* printing the string */
    mov    $len,%rdx
    mov    $msg,%rsi
    mov    $sto,%rdi
    mov    $1,%rax
    syscall

    /* incrementing and looping */
    inc    %r15
    cmp    $max,%r15
    jne    loop

    /* exit 0 */
    mov    $0,%rdi
    mov    $60,%rax
    syscall

.data
msg: .ascii "Loop:   \n"
.set len, . - msg
.set pld, msg + 6 /* address of left digit */
.set prd, msg + 7 /* address of right digit */

AArch64

.text

.global _start

start = 0
max = 31
sto = 1

_start:
    /* holds loop counter */
    mov    x9, start

loop:
    /* getting the quotient and remainder */
    /* into registers 11 (left) and 12 (right) */
    mov    x13,10
    udiv   x11,x9,x13
    msub   x12,x11,x13,x9
    add    x11,x11,48
    add    x12,x12,48

    /* shifting characters if < 10 */
    cmp    x9,9
    b.gt   continue
    mov    x11,x12
    mov    x12,32

continue:
    /* recording the numbers in the string */
    adr    x13,msg
    strb   w11,[x13,6]
    strb   w12,[x13,7]

    /* printing the string */
    mov    x2,len
    adr    x1,msg
    mov    x0,sto
    mov    x8,64
    svc    0

    /* incrementing and looping */
    add    x9,x9,1
    mov    x10,max
    cmp    x9,x10
    b.ne   loop

    /* exit 0 */
    mov    x0,0
    mov    x8,93
    svc    0

.data
msg: .ascii "Loop:   \n"
.set len, . - msg

Differences

The general structure of both programs is much the same: they use the same directives, and the syntax is very similar. The register names and instruction mnemonics, however, differ considerably.

A notable difference is the order of the operands. In the AT&T syntax used here for x86_64, an add instruction stores its result in the register on the right, while on AArch64 the result goes to the register on the left. AArch64 also lets us store the result of an operation in a third register, while x86_64 stores it "in place", overwriting one of the operands:

x86_64: adds r10 to r11 and stores the result in r11
add %r10,%r11

AArch64: adds x1 to x2 and stores the result in x0
add x0,x1,x2

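The system calls, despite the differences in instructions and registers, still take the same arguments in the same order (file descriptor, buffer address, length), since the interface is defined by the operating system. The system call numbers themselves do differ: write is 1 on x86_64 and 64 on AArch64.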

x86_64
    mov    $len,%rdx
    mov    $msg,%rsi
    mov    $sto,%rdi
    mov    $1,%rax
    syscall

AArch64
    mov    x2,len
    adr    x1,msg
    mov    x0,sto
    mov    x8,64
    svc    0
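
The same point can be seen from C. This is a small sketch, assuming Linux and glibc: the generic `syscall()` wrapper takes the same three arguments on both platforms, and `SYS_write` from `<sys/syscall.h>` expands to the number for whichever architecture you compile for. The helper name `raw_write` is hypothetical.

```c
#define _GNU_SOURCE          /* exposes syscall() in glibc's <unistd.h> */
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>     /* SYS_write, SYS_exit: per-architecture numbers */
#include <unistd.h>

/* Hypothetical helper: write(2) through the raw system call interface.
   The argument order (fd, buffer, length) matches rdi/rsi/rdx on x86_64
   and x0/x1/x2 on AArch64. Returns the byte count written, or -1. */
long raw_write(int fd, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    return syscall(SYS_write, fd, buf, len);
}
```

For example, `raw_write(1, "Loop\n", 5)` prints the text to standard output on either architecture without any change to the source.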

Some functionality is present on one platform but absent from the other. The div instruction on x86_64, for example, leaves the quotient in rax and the remainder in rdx. The udiv instruction on AArch64 produces only the quotient, so to get the remainder we follow it with msub, which multiplies the quotient by the divisor and subtracts the product from the original number.
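
The arithmetic behind the udiv and msub pair can be sketched in C as follows. The struct and function names are hypothetical, and the comments map each line to the instruction from the AArch64 listing that does the same job.

```c
#include <assert.h>

/* Hypothetical result type: quotient and remainder of n / d. */
struct divmod {
    unsigned long quot;
    unsigned long rem;
};

/* Computes the remainder the way the AArch64 listing does, without
   relying on a hardware remainder output. */
struct divmod divmod_msub(unsigned long n, unsigned long d)
{
    assert(d != 0);               /* the sketch does not handle a zero divisor */

    struct divmod r;
    r.quot = n / d;               /* udiv x11, x9, x13      */
    r.rem  = n - r.quot * d;      /* msub x12, x11, x13, x9 */
    return r;
}
```

For the counter value 27 and divisor 10, this gives a quotient of 2 and a remainder of 7, which become the left and right digits.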

To me, the overall picture of both platforms seems to be the same. Yes, they have different sets of instructions, but the instructions do the same things: math, moving values, system calls, jumping, etc. There are differences, but to me, they are very superficial.

That said, I think I enjoyed programming for x86_64 a bit more - the instructions were simpler, and I could memorize them easily. However, I only wrote a little loop, so I would not say this opinion is based on much experience. Picking a side at this point would not be very wise.

Experience with programming in assembly

The idea of programming in a language like assembly has always fascinated me. Yes, debugging at this level is painful, and you need 50 lines of code just to make a loop that could be done in 3 lines of C, but writing instructions that are sent directly (or very nearly so) to the processor is way too interesting to be hated. Of course, this was just a simple loop, and I would probably hate to write an operating system in assembly, but working with relatively small snippets of code is like solving a puzzle, and I enjoy puzzles.