Random failures while loading iwmmxt compiled binaries

Rex Ashbaugh rexa at xeratech.com
Tue Jun 22 13:13:11 EDT 2010


Do you have IWMMXT support enabled in your kernel config?

Rex

On Tue, Jun 22, 2010 at 9:15 AM, Enrico Scholz <enrico.scholz at sigma-chemnitz.de> wrote:

> Hi,
>
> I have the problem that program loading segfaults sometimes or aborts
> with
>
> | Inconsistency detected by ld.so: ../elf/dl-sysdep.c: 465: _dl_important_hwcaps: Assertion `m == cnt' failed!
>
> There are similar reports in the glibc[1] and gentoo[2] bugtrackers;
> during normal usage it happens very rarely.
>
> I have partially analyzed it and can reproduce it in <10 minutes on PXA270
> and PXA320 platforms, but I have not found a solution yet.  These
> platforms were running with kernel 2.6.34 (PXA270) and 2.6.31 (PXA320).
>
>
> Conclusions:
> ============
>
> * it is not a compiler bug
>
> * it is not an eglibc bug
>
> * it can only be the kernel (iwmmxt state not restored properly on
>  task switch?) or a bug in the silicon
>
>
> Steps to reproduce it:
> ======================
>
> 1. Build (e)glibc and busybox with -march=iwmmxt -mcpu=iwmmxt -mtune=iwmmxt;
>   I placed binaries at [3] which are based on OpenEmbedded's gcc 4.3.4 and
>   eglibc 2.12.
>
> 2. Place the binaries into a tmpfs (that's important for triggering the
>   bug in a short time; NFS or MTD takes longer)
>
>    mount -t tmpfs -o size=8m none /tmp
>    mkdir /tmp/bin /tmp/lib
>    cp /bin/busybox /tmp/bin/
>    cp /lib/ld[.-]* /tmp/lib
>    cp /lib/lib[cm][.-]* /tmp/lib
>    ln -s busybox /tmp/bin/sh
>    ln -s busybox /tmp/bin/sed
>
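> 3. Create a test script like the following (the quoted 'EOF' keeps the
>   shell from expanding the backticks and $! while the file is written)
>
>    cat << 'EOF' > /tmp/x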
>    #! /bin/sh
>
>    while test `echo abcdefghijkl |
>          sed 's!.$!!' | sed 's!.$!!' | sed 's!.$!!' | \
>          sed 's!.$!!' | sed 's!.$!!' | sed 's!.$!!' | \
>          sed 's!.$!!' | sed 's!^.!!' | sed 's!^.!!'` = cde; do
>           :
>    done
>    EOF
>
> 4. Chroot into /tmp and execute the script
>
>    chroot /tmp
>    sh /x
>
>
> After some time, you will get either the assertion or a segfault.
>
>
>
> Analysis
> ========
>
> With 'user_debug=-1' on the kernel command line, the segfault gives
>
> [ 2280.392473] sed: unhandled page fault (11) at 0x40026200, code 0x017
> [ 2280.392500] pgd = c7bf0000
> [ 2280.395191] [40026200] *pgd=87b57031, *pte=00000000, *ppte=00000000
> [ 2280.401529]
> [ 2280.408893] Pid: 8831, comm:                  sed
> [ 2280.418649] CPU: 0    Not tainted  (2.6.31.13 #1)
> [ 2280.424497] PC is at 0x40016a84
> [ 2280.427610] LR is at 0x40014b0c
> [ 2280.430724] pc : [<40016a84>]    lr : [<40014b0c>]    psr: 20000010
> [ 2280.430733] sp : befff8f8  ip : befff910  fp : befff99c
> [ 2280.460747] r10: 00000000  r9 : 6474e552  r8 : 00000004
> [ 2280.466537] r7 : 00000000  r6 : 00000012  r5 : 00000201  r4 : 40026202
> [ 2280.473452] r3 : befff8f8  r2 : 00000009  r1 : 40026200  r0 : 40026202
> [ 2280.479937] Flags: nzCv  IRQs on  FIQs on  Mode USER_32  ISA ARM  Segment user
> [ 2280.487354] Control: 0400397f  Table: 87bf0018  DAC: 00000015
>
>  Register -> Variable mapping:
>  -----------------------------
>
>    r6,r7:   'masked'    -->  strange: it seems to be always 0x12
>    r8:      'm'
>    r5:      'n'
>    r2:      'masked' >> 'n & 0xff'  (r3 destroyed by code calling strlen())
>
>
> The PC is in strlen(), which was called from _dl_important_hwcaps() (LR):
>
> 00016a80 <strlen>:
>   16a80:       e3c01003        bic     r1, r0, #3      ; 0x3
>   16a84:       e4912004        ldr     r2, [r1], #4
>
> 000147ec <_dl_important_hwcaps>:
>   14aa4:       e1961007        orrs    r1, r6, r7      <<<< the two 32 bit words of 'masked'
>   14aa8:       0a000023        beq     14b3c <_dl_important_hwcaps+0x350>
>   14aac:       e51b4068        ldr     r4, [fp, #-104]
>   14ab0:       e51b206c        ldr     r2, [fp, #-108]
>   14ab4:       e3a00001        mov     r0, #1  ; 0x1
>   14ab8:       e3a01000        mov     r1, #0  ; 0x0
>   14abc:       e084e002        add     lr, r4, r2
>   14ac0:       e28e4050        add     r4, lr, #80     ; 0x50
>   14ac4:       e3a05000        mov     r5, #0  ; 0x0   <<<< this is 'n'
>   14ac8:       ec41000a        tmcrr   wr10, r0, r1
>   14acc:       ea000003        b       14ae0 <_dl_important_hwcaps+0x2f4>
>   14ad0:       e1961007        orrs    r1, r6, r7      <<<< start of loop
>   14ad4:       e284400a        add     r4, r4, #10     ; 0xa
>   14ad8:       0a000017        beq     14b3c <_dl_important_hwcaps+0x350>
>   14adc:       e2855001        add     r5, r5, #1      ; 0x1
>   14ae0:       ec476004        tmcrr   wr4, r6, r7     <<<< move 'masked' into cp
>   14ae4:       ee085110        tmcr    wcgr0, r5
>   14ae8:       eee40148        wsrldg  wr0, wr4, wcgr0 <<<< right shift of 'masked' for n & 0xff bits
>   14aec:       ec532000        mra     r2, r3, acc0    <<<< this is tricky... acc0 is wr0
>   14af0:       e2020001        and     r0, r2, #1      ; 0x1
>   14af4:       e3500000        cmp     r0, #0  ; 0x0   <<<< that's the 'if (...'
>   14af8:       0afffff4        beq     14ad0 <_dl_important_hwcaps+0x2e4>
>   14afc:       e51b3044        ldr     r3, [fp, #-68]
>   14b00:       e1a00004        mov     r0, r4
>   14b04:       e7834188        str     r4, [r3, r8, lsl #3]
> >> 14b08:       eb0007dc        bl      16a80 <strlen>
>   14b0c:       e51b1044        ldr     r1, [fp, #-68]
>   14b10:       ee095110        tmcr    wcgr1, r5       <<<< move 'n' into cp
>   14b14:       eeda5149        wslldg  wr5, wr10, wcgr1 <<<< left shift of 1 for n & 0xff bits
>   14b18:       e081c188        add     ip, r1, r8, lsl #3
>   14b1c:       e58c0004        str     r0, [ip, #4]
>   14b20:       ec510005        tmrrc   r0, r1, wr5     <<<< split 64 bit word
>   14b24:       e0266000        eor     r6, r6, r0      <<<< and do the ^=
>   14b28:       e0277001        eor     r7, r7, r1
>   14b2c:       e1961007        orrs    r1, r6, r7
>   14b30:       e2888001        add     r8, r8, #1      <<<< this is 'm'
>   14b34:       e284400a        add     r4, r4, #10
>   14b38:       1affffe7        bne     14adc <_dl_important_hwcaps+0x2f0>
>
> The corresponding code is
>
> ------------
>  uint64_t masked = GLRO(dl_hwcap) & GLRO(dl_hwcap_mask);
>
>  for (n = 0; masked != 0; ++n)
>    if ((masked & (1ULL << n)) != 0)
>      {
>        temp[m].str = _dl_hwcap_string (n);
>        temp[m].len = strlen (temp[m].str);
>        masked ^= 1ULL << n;
>        ++m;
>      }
> ------------
>
>
> The crash happens because 'n' (stored in r5) becomes 0x201, so memory
> outside of the mapped region is accessed.
>
> The value of 0x201 means that the loop has missed its end condition twice
> (the first time for 0x01 and the second time for 0x101).
>
> The assertion probably fires when the loop exits in the second round
> (0x101).
>
>
> The generated assembly code looks sane to me and I do not see how 'n' can
> become >64 (or I looked too long at it and missed the obvious).  Within
> the loop, only strlen() is called and this alters r0-r3 only.
>
> As I wrote above, 'masked' seems to be always '0x12' which means that
> always the same bits have not been cleared.
>
>
> Does anybody have other ideas or a solution?
>
>
>
> Enrico
>
> Footnotes:
> [1]  http://sourceware.org/bugzilla/show_bug.cgi?id=6729
> [2]  http://bugs.gentoo.org/show_bug.cgi?id=194973
> [3]  https://www.cvg.de/people/ensc/iwmmxt-ld.tar.gz
>
>