SPO600 Lab 4 - Multiarch

This is the second part of my Lab 4 for the SPO600 course. In this post, I am going to briefly explain the multiarch mechanism and how **libc** uses it.

Because architectures and operating systems differ, we cannot always use the same source code for every build of a piece of software: code written for a specific platform will probably not work on others. This is usually not a problem for high-level languages such as Java, where the program runs on any machine that has a JVM installed. It is a problem when the code is so low-level that a different CPU renders it useless, assembly code being the obvious example.

One solution is to maintain a separate source tree for every machine. That does not scale: with dozens, hundreds, or thousands of different platforms, it becomes a mess very quickly. A better solution is to bundle the platform-specific files with the source code and have the right ones selected while the program is being built. This is what *multiarch* does: we can build the software from a single source tree, targeting different platforms, with cross dependencies.

libc is a multiarch library, designed to be portable across many different machines. This is how it is done:

The algorithm that picks the right files gathers some information about the target system: base operating system, manufacturer, CPU type, and operating system, in this order. It then joins this information into a directory hierarchy. For example, if we are using Linux, the base operating system is `unix/sysv`; if our machine is described as `i686-linux-gnu`, the directory hierarchy is `unix/sysv/linux/i386/i686`. The algorithm then tries to find the required file there. If the file is not found, it moves up one directory and tries again. It also tries dropping the trailing, period-separated parts of a name in order to test less specific version numbers.

For libc, all these platform-specific files are located in a directory called `sysdeps`.
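
The fallback part of that search can be sketched in a few lines of Python. This is a simplified model of the description above, not glibc's actual configure logic: the function names and the `available` set (a stand-in for the files that exist under `sysdeps`) are hypothetical, and the version-number truncation is left out.

```python
def candidate_dirs(hierarchy):
    """List the hierarchy and each of its parents, most specific first."""
    parts = hierarchy.split("/")
    return ["/".join(parts[:n]) for n in range(len(parts), 0, -1)]


def find_sysdep_file(hierarchy, filename, available):
    """Return the most specific path for `filename`, or None if there is none.

    `available` is a hypothetical set of paths, relative to sysdeps/,
    that stands in for checking the real filesystem.
    """
    for directory in candidate_dirs(hierarchy):
        candidate = f"{directory}/{filename}"
        if candidate in available:
            return candidate
    return None


# Hypothetical tree: a Linux-specific file and a generic Unix one.
available = {"unix/sysv/linux/strlen.S", "unix/strlen.S"}
print(find_sysdep_file("unix/sysv/linux/i386/i686", "strlen.S", available))
```

With this made-up tree, the search misses in `i686` and `i386` and settles on the copy in `unix/sysv/linux`, which is more specific than the one in `unix`.
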
The `sysdeps` directory is located at the top level of the source tree. Inside it, this is what we have:

```
$ ls
aarch64  gnu   ieee754     microblaze  posix    sh     wordsize-32
alpha    hppa  init_array  mips        powerpc  sparc  wordsize-64
arm      i386  m68k        nios2       pthread  tile   x86
generic  ia64  mach        nptl        s390     unix   x86_64
```

Each of these entries is the root of one branch of the hierarchy. Most are CPU architectures (`aarch64`, `arm`, `powerpc`, and so on), some are operating system families (`unix`, `mach`, `posix`), and a few hold code shared across platforms (`generic`, `ieee754`, `wordsize-32`).

I dug a little into these directories and found two implementations of **strlen** for two different platforms, PowerPC and ARM. These are their paths:

- sysdeps/powerpc/powerpc64/power7/strlen.S
- sysdeps/arm/strlen.S

These files contain the assembly code for the function on each platform. If you want to see their contents, here they are:

# PowerPC

```
/* Optimized strlen implementation for PowerPC64/POWER7 using cmpb insn.
   Copyright (C) 2010-2017 Free Software Foundation, Inc.
   Contributed by Luis Machado.
   This file is part of the GNU C Library.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, see
   <http://www.gnu.org/licenses/>.  */

#include <sysdep.h>

/* int [r3] strlen (char *s [r3])  */

#ifndef STRLEN
# define STRLEN strlen
#endif
        .machine  power7
ENTRY_TOCLESS (STRLEN)
        CALL_MCOUNT 1
        dcbt    0,r3
        clrrdi  r4,r3,3       /* Align the address to doubleword boundary.  */
        rlwinm  r6,r3,3,26,28 /* Calculate padding.  */
        li      r0,0          /* Doubleword with null chars to use
                                 with cmpb.  */
        li      r5,-1         /* MASK = 0xffffffffffffffff.  */
        ld      r12,0(r4)     /* Load doubleword from memory.  */
#ifdef __LITTLE_ENDIAN__
        sld     r5,r5,r6
#else
        srd     r5,r5,r6      /* MASK = MASK >> padding.  */
#endif
        orc     r9,r12,r5     /* Mask bits that are not part of the string.  */
        cmpb    r10,r9,r0     /* Check for null bytes in DWORD1.  */
        cmpdi   cr7,r10,0     /* If r10 == 0, no null's have been found.  */
        bne     cr7,L(done)

        mtcrf   0x01,r4

        /* Are we now aligned to a quadword boundary?  If so, skip to
           the main loop.  Otherwise, go through the alignment code.  */

        bt      28,L(loop)

        /* Handle DWORD2 of pair.  */

        ldu     r12,8(r4)
        cmpb    r10,r12,r0
        cmpdi   cr7,r10,0
        bne     cr7,L(done)

        /* Main loop to look for the end of the string.  Since it's a
           small loop (< 8 instructions), align it to 32-bytes.  */
        .p2align  5
L(loop):
        /* Load two doublewords, compare and merge in a
           single register for speed.  This is an attempt
           to speed up the null-checking process for bigger strings.  */

        ld      r12, 8(r4)
        ldu     r11, 16(r4)
        cmpb    r10,r12,r0
        cmpb    r9,r11,r0
        or      r8,r9,r10     /* Merge everything in one doubleword.  */
        cmpdi   cr7,r8,0
        beq     cr7,L(loop)

        /* OK, one (or both) of the doublewords contains a null byte.  Check
           the first doubleword and decrement the address in case the first
           doubleword really contains a null byte.  */

        cmpdi   cr6,r10,0
        addi    r4,r4,-8
        bne     cr6,L(done)

        /* The null byte must be in the second doubleword.  Adjust the address
           again and move the result of cmpb to r10 so we can calculate the
           length.  */

        mr      r10,r9
        addi    r4,r4,8

        /* r10 has the output of the cmpb instruction, that is, it contains
           0xff in the same position as the null byte in the original
           doubleword from the string.  Use that to calculate the length.  */
L(done):
#ifdef __LITTLE_ENDIAN__
        addi    r9, r10, -1   /* Form a mask from trailing zeros.  */
        andc    r9, r9, r10
        popcntd r0, r9        /* Count the bits in the mask.  */
#else
        cntlzd  r0,r10        /* Count leading zeros before the match.  */
#endif
        subf    r5,r3,r4
        srdi    r0,r0,3       /* Convert leading/trailing zeros to bytes.  */
        add     r3,r5,r0      /* Compute final length.  */
        blr
END (STRLEN)
libc_hidden_builtin_def (strlen)
```

# ARM

```
/* Copyright (C) 1998-2017 Free Software Foundation, Inc.
   This file is part of the GNU C Library.

   Code contributed by Matthew Wilcox

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library.  If not, see
   <http://www.gnu.org/licenses/>.  */

/* Thumb requires excessive IT insns here.  */
#define NO_THUMB
#include <sysdep.h>

/* size_t strlen(const char *S)
 * entry: r0 -> string
 * exit: r0 = len
 */

        .syntax unified
        .text

ENTRY(strlen)
        bic     r1, r0, $3              @ addr of word containing first byte
        ldr     r2, [r1], $4            @ get the first word
        ands    r3, r0, $3              @ how many bytes are duff?
        rsb     r0, r3, $0              @ get - that number into counter.
        beq     Laligned                @ skip into main check routine if no
                                        @ more
#ifdef __ARMEB__
        orr     r2, r2, $0xff000000     @ set this byte to non-zero
        subs    r3, r3, $1              @ any more to do?
        orrgt   r2, r2, $0x00ff0000     @ if so, set this byte
        subs    r3, r3, $1              @ more?
        orrgt   r2, r2, $0x0000ff00     @ then set.
#else
        orr     r2, r2, $0x000000ff     @ set this byte to non-zero
        subs    r3, r3, $1              @ any more to do?
        orrgt   r2, r2, $0x0000ff00     @ if so, set this byte
        subs    r3, r3, $1              @ more?
        orrgt   r2, r2, $0x00ff0000     @ then set.
#endif
Laligned:                               @ here, we have a word in r2.  Does it
        tst     r2, $0x000000ff         @ contain any zeroes?
        tstne   r2, $0x0000ff00         @
        tstne   r2, $0x00ff0000         @
        tstne   r2, $0xff000000         @
        addne   r0, r0, $4              @ if not, the string is 4 bytes longer
        ldrne   r2, [r1], $4            @ and we continue to the next word
        bne     Laligned                @
Llastword:                              @ drop through to here once we find a
#ifdef __ARMEB__
        tst     r2, $0xff000000         @ word that has a zero byte in it
        addne   r0, r0, $1              @
        tstne   r2, $0x00ff0000         @ and add up to 3 bytes on to it
        addne   r0, r0, $1              @
        tstne   r2, $0x0000ff00         @ (if first three all non-zero, 4th
        addne   r0, r0, $1              @  must be zero)
#else
        tst     r2, $0x000000ff         @ word that has a zero byte in it
        addne   r0, r0, $1              @
        tstne   r2, $0x0000ff00         @ and add up to 3 bytes on to it
        addne   r0, r0, $1              @
        tstne   r2, $0x00ff0000         @ (if first three all non-zero, 4th
        addne   r0, r0, $1              @  must be zero)
#endif
        DO_RET(lr)
END(strlen)
libc_hidden_builtin_def (strlen)
```
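
Both listings avoid walking the string one byte at a time: they round the start address down to an aligned word (4 bytes in the ARM file, 8 in the POWER7 one), ignore the bytes that come before the string, and then examine one whole word per iteration. The Python sketch below models that idea only. It is not a translation of either file, and `strlen_by_words` and its parameters are hypothetical names.

```python
def strlen_by_words(memory: bytes, start: int, word_size: int = 4) -> int:
    """Length of the NUL-terminated string that begins at memory[start].

    Assumes `memory` contains a NUL byte at or after `start`.
    """
    # Round the start address down to a word boundary.
    addr = start - start % word_size
    # Bytes of the first word that sit before the string must be ignored.
    skip = start - addr
    while True:
        word = memory[addr:addr + word_size]
        # Look for a zero byte among the bytes of this word that count.
        for offset in range(skip, word_size):
            if word[offset] == 0:
                return addr + offset - start
        # No zero byte here: move on to the next aligned word.
        addr += word_size
        skip = 0


# The string "hello" starts at offset 2, which is not word-aligned.
print(strlen_by_words(b"xxhello\x00", 2))  # prints 5
```

The real routines check a whole word with a handful of instructions (`cmpb` on POWER7, a chain of conditional `tst` on ARM) instead of an inner loop, which is where the speedup comes from.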