Henry S. Coelho

SPO600 Lab 4 - Multiarch

This is the second part of my Lab 4 for the SPO600 course. In this post, I am going to briefly explain the multiarch mechanism and how libc uses it.

Because architectures and operating systems differ, we cannot always use the same source code on every platform: software built for a specific platform will probably not work on others. This is usually not a problem for high-level languages such as Java, where the code runs anywhere a JVM is installed. It is a problem when your code is so low-level that a different CPU renders it useless, as with assembly code.

One solution is to maintain a separate source tree for every machine. This is not a good solution, though: with dozens, hundreds, or thousands of different platforms, it becomes a mess very quickly. A better solution is to bundle platform-specific files with your source code and select the right ones when the program is built. This is what multiarch does: with multiarch, we can build the software from a single source tree, targeting different platforms, with cross dependencies.

LibC is a multiarch library, designed to be portable across many different machines. This is how it is done:

The algorithm that picks the right files gathers some information about our system: base operating system, manufacturer, CPU type, and operating system, in this order. It joins this information into a directory hierarchy. For example, if we are using Linux, the base operating system is unix/sysv, and if our machine is described as 'i686-linux-gnu', the directory hierarchy is unix/sysv/linux/i386/i686. The algorithm then tries to find the required file there. If the file is not found, it moves up one directory and tries again. It also tries stripping the trailing period-separated parts of a name, in order to match less specific version numbers.
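To make the search order concrete, here is a minimal Python sketch of the "try the most specific directory, then move up" idea. It is only an illustration under my own assumptions, not glibc's actual build logic: the names candidate_dirs and find_source are hypothetical, the exists parameter is there just so the search can be tried without a real source tree, and the version-stripping step is left out.

```python
import os
import posixpath


def candidate_dirs(hierarchy):
    """Yield the hierarchy from most specific to least specific.

    For "unix/sysv/linux/i386/i686" this yields the full path first,
    then each parent directory in turn.
    """
    parts = hierarchy.split("/")
    while parts:
        yield "/".join(parts)
        parts.pop()  # drop the most specific component and retry


def find_source(sysdeps_root, hierarchy, filename, exists=os.path.isfile):
    """Return the first candidate path for which exists() is true, or None.

    `exists` defaults to a real file-system check; passing a different
    predicate lets the search be exercised against a fake file list.
    """
    for directory in candidate_dirs(hierarchy):
        path = posixpath.join(sysdeps_root, directory, filename)
        if exists(path):
            return path
    return None
```

For example, if a hypothetical file list only contained sysdeps/unix/sysv/linux/strlen.S, then find_source("sysdeps", "unix/sysv/linux/i386/i686", "strlen.S", exists=file_list.__contains__) would skip the i686 and i386 directories and return that path.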

For LibC, all these platform-specific files are located in a directory called sysdeps, at the top level of the source tree.

Inside this directory, this is what we have:

$ ls
aarch64  gnu   ieee754     microblaze  posix    sh     wordsize-32
alpha    hppa  init_array  mips        powerpc  sparc  wordsize-64
arm      i386  m68k        nios2       pthread  tile   x86
generic  ia64  mach        nptl        s390     unix   x86_64

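These entries are a mix of CPU architectures (such as aarch64, arm, and powerpc), operating system families (such as unix, mach, and posix), and shared feature directories (such as ieee754 and wordsize-64). I dug a little into these directories and found two implementations of strlen for two different platforms: PowerPC and ARM. These are their paths: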

These files contain the assembly code for strlen on each platform. If you want to see their contents, here they are:

PowerPC

/* Optimized strlen implementation for PowerPC64/POWER7 using cmpb insn.
   Copyright (C) 2010-2017 Free Software Foundation, Inc.
   Contributed by Luis Machado <luisgpm@br.ibm.com>.
   This file is part of the GNU C Library.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, see
   <http://www.gnu.org/licenses/>.  */

#include <sysdep.h>

/* int [r3] strlen (char *s [r3])  */

#ifndef STRLEN
# define STRLEN strlen
#endif
    .machine  power7
ENTRY_TOCLESS (STRLEN)
    CALL_MCOUNT 1
    dcbt        0,r3
    clrrdi      r4,r3,3       /* Align the address to doubleword boundary.  */
    rlwinm      r6,r3,3,26,28 /* Calculate padding.  */
    li          r0,0          /* Doubleword with null chars to use with cmpb.  */
    li          r5,-1         /* MASK = 0xffffffffffffffff.  */
    ld          r12,0(r4)     /* Load doubleword from memory.  */
#ifdef __LITTLE_ENDIAN__
    sld         r5,r5,r6
#else
    srd         r5,r5,r6      /* MASK = MASK >> padding.  */
#endif
    orc         r9,r12,r5     /* Mask bits that are not part of the string.  */
    cmpb        r10,r9,r0     /* Check for null bytes in DWORD1.  */
    cmpdi       cr7,r10,0     /* If r10 == 0, no null's have been found.  */
    bne         cr7,L(done)

    mtcrf       0x01,r4

    /* Are we now aligned to a quadword boundary?  If so, skip to
       the main loop.  Otherwise, go through the alignment code.  */

    bt          28,L(loop)

    /* Handle DWORD2 of pair.  */
    ldu         r12,8(r4)
    cmpb        r10,r12,r0
    cmpdi       cr7,r10,0
    bne         cr7,L(done)

    /* Main loop to look for the end of the string.  Since it's a
       small loop (< 8 instructions), align it to 32-bytes.  */
    .p2align    5
L(loop):
    /* Load two doublewords, compare and merge in a
       single register for speed.  This is an attempt
       to speed up the null-checking process for bigger strings.  */

    ld          r12, 8(r4)
    ldu         r11, 16(r4)
    cmpb        r10,r12,r0
    cmpb        r9,r11,r0
    or          r8,r9,r10     /* Merge everything in one doubleword.  */
    cmpdi       cr7,r8,0
    beq         cr7,L(loop)

    /* OK, one (or both) of the doublewords contains a null byte.  Check
       the first doubleword and decrement the address in case the first
       doubleword really contains a null byte.  */

    cmpdi       cr6,r10,0
    addi        r4,r4,-8
    bne         cr6,L(done)

    /* The null byte must be in the second doubleword.  Adjust the address
       again and move the result of cmpb to r10 so we can calculate the
       length.  */

    mr          r10,r9
    addi        r4,r4,8

    /* r10 has the output of the cmpb instruction, that is, it contains
       0xff in the same position as the null byte in the original
       doubleword from the string.  Use that to calculate the length.  */
L(done):
#ifdef __LITTLE_ENDIAN__
    addi        r9, r10, -1   /* Form a mask from trailing zeros.  */
    andc        r9, r9, r10
    popcntd     r0, r9        /* Count the bits in the mask.  */
#else
    cntlzd      r0,r10        /* Count leading zeros before the match.  */
#endif
    subf        r5,r3,r4
    srdi        r0,r0,3       /* Convert leading/trailing zeros to bytes.  */
    add         r3,r5,r0      /* Compute final length.  */
    blr
END (STRLEN)
libc_hidden_builtin_def (strlen)
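
To help read the PowerPC listing above, here is a rough Python model of its main idea: align the address down to 8 bytes, force the bytes before the string to be non-zero, and use a cmpb-style comparison to find the first null byte in a whole doubleword at once. This is my own simplified sketch, not a translation of the assembly: it only models the little-endian path, checks one doubleword per iteration instead of two, and the names cmpb_zero and strlen_by_doublewords are hypothetical. The memory argument is a bytes object standing in for RAM and is assumed to contain a null terminator.

```python
def cmpb_zero(word):
    """Mimic `cmpb` against zero: 0xff in every byte position where
    the 64-bit word holds a 0x00 byte, 0x00 elsewhere."""
    result = 0
    for i in range(8):
        if (word >> (8 * i)) & 0xFF == 0:
            result |= 0xFF << (8 * i)
    return result


def strlen_by_doublewords(memory, start):
    """Length of the NUL-terminated string beginning at memory[start]."""
    addr = start & ~7                # align down to a doubleword boundary
    padding_bits = (start & 7) * 8   # bits occupied by bytes before the string
    word = int.from_bytes(memory[addr:addr + 8], "little")
    word |= (1 << padding_bits) - 1  # make the bytes before the string non-zero
    match = cmpb_zero(word)
    while match == 0:                # no NUL in this doubleword, load the next
        addr += 8
        word = int.from_bytes(memory[addr:addr + 8], "little")
        match = cmpb_zero(word)
    # The lowest set bit marks the first NUL; 8 bits per byte gives its offset.
    trailing_zeros = (match & -match).bit_length() - 1
    return (addr - start) + trailing_zeros // 8
```

The last two lines play the role of the addi/andc/popcntd sequence followed by srdi in the listing: they turn the position of the first 0xff byte in the cmpb result into a byte offset.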

ARM

/* Copyright (C) 1998-2017 Free Software Foundation, Inc.
   This file is part of the GNU C Library.
   Code contributed by Matthew Wilcox <willy@odie.barnet.ac.uk>

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library.  If not, see
   <http://www.gnu.org/licenses/>.  */

/* Thumb requires excessive IT insns here.  */
#define NO_THUMB
#include <sysdep.h>

/* size_t strlen(const char *S)
 * entry: r0 -> string
 * exit: r0 = len
 */

    .syntax unified
    .text

ENTRY(strlen)
    bic     r1, r0, $3              @ addr of word containing first byte
    ldr     r2, [r1], $4            @ get the first word
    ands    r3, r0, $3              @ how many bytes are duff?
    rsb     r0, r3, $0              @ get - that number into counter.
    beq     Laligned                @ skip into main check routine if no
                    @ more
#ifdef __ARMEB__
    orr     r2, r2, $0xff000000     @ set this byte to non-zero
    subs    r3, r3, $1              @ any more to do?
    orrgt   r2, r2, $0x00ff0000     @ if so, set this byte
    subs    r3, r3, $1              @ more?
    orrgt   r2, r2, $0x0000ff00     @ then set.
#else
    orr     r2, r2, $0x000000ff     @ set this byte to non-zero
    subs    r3, r3, $1              @ any more to do?
    orrgt   r2, r2, $0x0000ff00     @ if so, set this byte
    subs    r3, r3, $1              @ more?
    orrgt   r2, r2, $0x00ff0000     @ then set.
#endif
Laligned:                @ here, we have a word in r2.  Does it
    tst     r2, $0x000000ff         @ contain any zeroes?
    tstne   r2, $0x0000ff00         @
    tstne   r2, $0x00ff0000         @
    tstne   r2, $0xff000000         @
    addne   r0, r0, $4              @ if not, the string is 4 bytes longer
    ldrne   r2, [r1], $4            @ and we continue to the next word
    bne     Laligned                @
Llastword:                @ drop through to here once we find a
#ifdef __ARMEB__
    tst     r2, $0xff000000         @ word that has a zero byte in it
    addne   r0, r0, $1              @
    tstne   r2, $0x00ff0000         @ and add up to 3 bytes on to it
    addne   r0, r0, $1              @
    tstne   r2, $0x0000ff00         @ (if first three all non-zero, 4th
    addne   r0, r0, $1              @  must be zero)
#else
    tst     r2, $0x000000ff         @ word that has a zero byte in it
    addne   r0, r0, $1              @
    tstne   r2, $0x0000ff00         @ and add up to 3 bytes on to it
    addne   r0, r0, $1              @
    tstne   r2, $0x00ff0000         @ (if first three all non-zero, 4th
    addne   r0, r0, $1              @  must be zero)
#endif
    DO_RET(lr)
END(strlen)
libc_hidden_builtin_def (strlen)
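
The ARM version takes a different route, which the following Python model tries to capture: it works on 4-byte words, starts the counter at minus the number of bytes that precede the string in the first word, and tests the four bytes of each word one by one. Again, this is a sketch under my own assumptions rather than a faithful translation: it only models the little-endian branch, the name strlen_by_words is hypothetical, and memory is a bytes object standing in for RAM that is assumed to contain a null terminator.

```python
def strlen_by_words(memory, start):
    """Length of the NUL-terminated string beginning at memory[start]."""
    addr = start & ~3              # address of the word containing the first byte
    skipped = start & 3            # bytes in that word that precede the string
    word = int.from_bytes(memory[addr:addr + 4], "little")
    addr += 4                      # post-increment, like `ldr r2, [r1], $4`
    length = -skipped              # negative start cancels the skipped bytes
    for i in range(skipped):
        word |= 0xFF << (8 * i)    # make the skipped bytes non-zero

    # Advance a whole word at a time while all four bytes are non-zero.
    while all((word >> (8 * i)) & 0xFF for i in range(4)):
        length += 4
        word = int.from_bytes(memory[addr:addr + 4], "little")
        addr += 4

    # The last word contains a zero byte: add the 0 to 3 bytes before it.
    for i in range(3):
        if (word >> (8 * i)) & 0xFF == 0:
            break
        length += 1
    return length
```

Compared with the PowerPC approach, there is no single instruction that locates the null byte, so the final loop corresponds to the chain of tst/addne instructions after Llastword.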