Random failures while loading iwmmxt compiled binaries

Marek Vasut marek.vasut at gmail.com
Tue Jun 22 14:24:43 EDT 2010


On Tue, 22 June 2010 18:15:33, Enrico Scholz wrote:
> Hi,
> 
> I have the problem that program loading segfaults sometimes or aborts
> with

CCed interested parties

> 
> | Inconsistency detected by ld.so: ../elf/dl-sysdep.c: 465:
> | _dl_important_hwcaps: Assertion `m == cnt' failed!
> 
> There are similar reports in the glibc[1] and gentoo[2] bugtrackers;
> the failure happens very rarely during normal usage.
> 
> I analyzed it partially and can reproduce it in <10 minutes on PXA270
> and PXA320 platforms, but I am unable to find a solution yet.  These
> platforms were running with kernel 2.6.34 (PXA270) and 2.6.31 (PXA320).
> 
> 
> Conclusions:
> ============
> 
> * it is not a compiler bug
> 
> * it is not an eglibc bug
> 
> * it can only be the kernel (iwmmxt state not restored properly between
>   task switches?) or a bug in the silicon
> 
> 
> Steps to reproduce it:
> ======================
> 
> 1. Build (e)glibc and busybox with -march=iwmmxt -mcpu=iwmmxt -mtune=iwmmxt;
>    I placed binaries at [3] which are based on OpenEmbedded's gcc 4.3.4
>    and eglibc 2.12.
> 
> 2. Place the binaries into a tmpfs (that's important for triggering the bug
>    in a short time; NFS or MTD takes longer)
> 
>     mount -t tmpfs -o size=8m none /tmp
>     mkdir /tmp/bin /tmp/lib
>     cp /bin/busybox /tmp/bin/
>     cp /lib/ld[.-]* /tmp/lib
>     cp /lib/lib[cm][.-]* /tmp/lib
>     ln -s busybox /tmp/bin/sh
>     ln -s busybox /tmp/bin/sed
> 
> 3. Create a test script like
> 
>     cat << 'EOF' > /tmp/x
>     #! /bin/sh
> 
>     # strip 7 trailing and 2 leading characters: abcdefghijkl -> cde
>     while test `echo abcdefghijkl |
>           sed 's!.$!!' | sed 's!.$!!' | sed 's!.$!!' | \
>           sed 's!.$!!' | sed 's!.$!!' | sed 's!.$!!' | \
>           sed 's!.$!!' | sed 's!^.!!' | sed 's!^.!!'` = cde; do
>         :
>     done
>     EOF
> 
> 4. Chroot into /tmp and execute the script
> 
>     chroot /tmp
>     sh /x
> 
> 
> After some time, you will get either the assertion or a segfault.
> 
> 
> 
> Analysis
> ========
> 
> A segfault with 'user_debug=-1' on the kernel command line gives
> 
> [ 2280.392473] sed: unhandled page fault (11) at 0x40026200, code 0x017
> [ 2280.392500] pgd = c7bf0000
> [ 2280.395191] [40026200] *pgd=87b57031, *pte=00000000, *ppte=00000000
> [ 2280.401529]
> [ 2280.408893] Pid: 8831, comm:                  sed
> [ 2280.418649] CPU: 0    Not tainted  (2.6.31.13 #1)
> [ 2280.424497] PC is at 0x40016a84
> [ 2280.427610] LR is at 0x40014b0c
> [ 2280.430724] pc : [<40016a84>]    lr : [<40014b0c>]    psr: 20000010
> [ 2280.430733] sp : befff8f8  ip : befff910  fp : befff99c
> [ 2280.460747] r10: 00000000  r9 : 6474e552  r8 : 00000004
> [ 2280.466537] r7 : 00000000  r6 : 00000012  r5 : 00000201  r4 : 40026202
> [ 2280.473452] r3 : befff8f8  r2 : 00000009  r1 : 40026200  r0 : 40026202
> [ 2280.479937] Flags: nzCv  IRQs on  FIQs on  Mode USER_32  ISA ARM  Segment user
> [ 2280.487354] Control: 0400397f  Table: 87bf0018  DAC: 00000015
> 
>   Register -> Variable mapping:
>   -----------------------------
> 
>     r6,r7:   'masked'    -->  strange: it seems to be always 0x12
>     r8:      'm'
>     r5:      'n'
>     r2:      'masked' >> 'n & 0xff'  (r3 destroyed by code calling strlen())
> 
> 
> The PC is in strlen(), which was called from _dl_important_hwcaps() (LR):
> 
> 00016a80 <strlen>:
>    16a80:       e3c01003        bic     r1, r0, #3      ; 0x3
>    16a84:       e4912004        ldr     r2, [r1], #4
> 
> 000147ec <_dl_important_hwcaps>:
>    14aa4:       e1961007        orrs    r1, r6, r7      <<<< the two 32 bit words of 'masked'
>    14aa8:       0a000023        beq     14b3c <_dl_important_hwcaps+0x350>
>    14aac:       e51b4068        ldr     r4, [fp, #-104]
>    14ab0:       e51b206c        ldr     r2, [fp, #-108]
>    14ab4:       e3a00001        mov     r0, #1  ; 0x1
>    14ab8:       e3a01000        mov     r1, #0  ; 0x0
>    14abc:       e084e002        add     lr, r4, r2
>    14ac0:       e28e4050        add     r4, lr, #80     ; 0x50
>    14ac4:       e3a05000        mov     r5, #0  ; 0x0   <<<< this is 'n'
>    14ac8:       ec41000a        tmcrr   wr10, r0, r1
>    14acc:       ea000003        b       14ae0 <_dl_important_hwcaps+0x2f4>
>    14ad0:       e1961007        orrs    r1, r6, r7      <<<< start of loop
>    14ad4:       e284400a        add     r4, r4, #10     ; 0xa
>    14ad8:       0a000017        beq     14b3c <_dl_important_hwcaps+0x350>
>    14adc:       e2855001        add     r5, r5, #1      ; 0x1
>    14ae0:       ec476004        tmcrr   wr4, r6, r7     <<<< move 'masked' into cp
>    14ae4:       ee085110        tmcr    wcgr0, r5
>    14ae8:       eee40148        wsrldg  wr0, wr4, wcgr0 <<<< right shift of 'masked' for n & 0xff bits
>    14aec:       ec532000        mra     r2, r3, acc0    <<< this is tricky... acc0 is wr0
>    14af0:       e2020001        and     r0, r2, #1      ; 0x1
>    14af4:       e3500000        cmp     r0, #0  ; 0x0   <<<< that's the 'if (...'
>    14af8:       0afffff4        beq     14ad0 <_dl_important_hwcaps+0x2e4>
>    14afc:       e51b3044        ldr     r3, [fp, #-68]
>    14b00:       e1a00004        mov     r0, r4
>    14b04:       e7834188        str     r4, [r3, r8, lsl #3]
> >> 14b08:       eb0007dc        bl      16a80 <strlen>
>    14b0c:       e51b1044        ldr     r1, [fp, #-68]
>    14b10:       ee095110        tmcr    wcgr1, r5       <<<< move 'n' into cp
>    14b14:       eeda5149        wslldg  wr5, wr10, wcgr1 <<< left shift of 1 for n & 0xff bits
>    14b18:       e081c188        add     ip, r1, r8, lsl #3
>    14b1c:       e58c0004        str     r0, [ip, #4]
>    14b20:       ec510005        tmrrc   r0, r1, wr5     <<<< split 64 bit word
>    14b24:       e0266000        eor     r6, r6, r0      <<<< and do the ^=
>    14b28:       e0277001        eor     r7, r7, r1
>    14b2c:       e1961007        orrs    r1, r6, r7
>    14b30:       e2888001        add     r8, r8, #1      <<<< this is 'm'
>    14b34:       e284400a        add     r4, r4, #10
>    14b38:       1affffe7        bne     14adc <_dl_important_hwcaps+0x2f0>
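
The register dump and the disassembly can be cross-checked with a little address arithmetic. The sketch below is only an illustration: the constants are copied from the dump and the listing above, while the helper names (bic3, load_base, check_dump) are made up for this example, and the load address of ld.so is an assumption derived from PC minus the offset of the faulting ldr.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the kernel register dump. */
#define FAULT_ADDR UINT32_C(0x40026200)
#define REG_PC     UINT32_C(0x40016a84)
#define REG_LR     UINT32_C(0x40014b0c)
#define REG_R0     UINT32_C(0x40026202)

/* Offsets copied from the ld.so disassembly. */
#define OFF_STRLEN_LDR UINT32_C(0x16a84) /* ldr r2, [r1], #4 */
#define OFF_BL_STRLEN  UINT32_C(0x14b08) /* bl 16a80 <strlen> */

/* "bic r1, r0, #3": clear the two low bits to word-align the pointer. */
uint32_t bic3(uint32_t r0)
{
    return r0 & ~UINT32_C(3);
}

/* Assumed load address of ld.so: runtime PC minus the listing offset. */
uint32_t load_base(void)
{
    return REG_PC - OFF_STRLEN_LDR;
}

/* Consistency checks between the dump and the disassembly. */
void check_dump(void)
{
    /* The faulting address is the word-aligned strlen() argument. */
    assert(bic3(REG_R0) == FAULT_ADDR);
    /* Under the assumption above, ld.so sits at 0x40000000. */
    assert(load_base() == UINT32_C(0x40000000));
    /* LR points at the instruction following "bl strlen". */
    assert(REG_LR - load_base() == OFF_BL_STRLEN + 4);
}
```

If check_dump() returns without tripping an assertion, the fault is the first word load in strlen(), on the string pointer that _dl_important_hwcaps() passed in r0.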
> 
> The corresponding code is
> 
> ------------
>   uint64_t masked = GLRO(dl_hwcap) & GLRO(dl_hwcap_mask);
> 
>   for (n = 0; masked != 0; ++n)
>     if ((masked & (1ULL << n)) != 0)
>       {
> 	temp[m].str = _dl_hwcap_string (n);
> 	temp[m].len = strlen (temp[m].str);
> 	masked ^= 1ULL << n;
> 	++m;
>       }
> ------------
> 
> 
> The crash happens because 'n' (stored in r5) becomes 0x201 and an array
> element outside of the mapped memory is accessed.
> 
> The value of 0x201 means that the loop has missed its end condition twice
> (first time for 0x01 and second time for 0x101).
> 
> The assertion probably triggers when the loop exits in the second round
> (0x101).
> 
> 
> The generated assembly code looks sane to me and I do not see how 'n' can
> become >64 (or I looked too long at it and missed the obvious).  Within
> the loop, only strlen() is called and this alters r0-r3 only.
> 
> As I wrote above, 'masked' seems to be always '0x12', which means that
> the same bits are always left uncleared.
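
To see whether the dumped values fit together, the loop can be modelled in plain C. This is a sketch under stated assumptions, not a diagnosis: the shift helpers assume that the coprocessor shifts use only the low 8 bits of the count (as the listing comments say) and return 0 for counts of 64 or more, and the 'one' parameter is a hypothetical stand-in for the constant kept in wr10. Passing one = 0 models just one possible way for the "^=" to have no effect; the names walk, shr_model and shl_model are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

struct walk_result {
    unsigned n;      /* loop counter when the walk stopped */
    unsigned m;      /* number of entries stored so far */
    uint64_t masked; /* bits still set in the mask */
};

/* Assumed shift model: low 8 bits of the count, 0 for counts >= 64. */
static uint64_t shr_model(uint64_t v, unsigned n)
{
    unsigned count = n & 0xff;
    return count >= 64 ? 0 : v >> count;
}

static uint64_t shl_model(uint64_t v, unsigned n)
{
    unsigned count = n & 0xff;
    return count >= 64 ? 0 : v << count;
}

/* Walk the loop until 'masked' is empty or 'n' reaches n_limit.
 * 'one' is the value shifted left for the "^="; it should be 1. */
struct walk_result walk(uint64_t masked, uint64_t one, unsigned n_limit)
{
    struct walk_result r = { 0, 0, masked };

    for (r.n = 0; r.masked != 0 && r.n < n_limit; ++r.n)
        if (shr_model(r.masked, r.n) & 1) {
            r.masked ^= shl_model(one, r.n);
            ++r.m;
        }
    return r;
}

/* Compare both variants with the register dump (r5, r8, r6/r7). */
void check_against_dump(void)
{
    /* Intact constant: bits 1 and 4 are cleared, the loop ends early. */
    struct walk_result good = walk(0x12, 1, 0x201);
    assert(good.n == 5 && good.m == 2 && good.masked == 0);

    /* Lost constant: nothing is cleared, bits 1 and 4 are seen again
     * at n = 0x101 and 0x104 before n reaches 0x201. */
    struct walk_result bad = walk(0x12, 0, 0x201);
    assert(bad.n == 0x201 && bad.m == 4 && bad.masked == 0x12);
}
```

Under these assumptions the second variant arrives at n = 0x201 with m = 4 and masked = 0x12, which matches r5, r8 and r6/r7 in the dump. That is consistent with the left-shifted value being wrong in both rounds, but it does not show which coprocessor register was affected.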
> 
> 
> Does somebody have other ideas or a solution?
> 
> 
> 
> Enrico
> 
> Footnotes:
> [1]  http://sourceware.org/bugzilla/show_bug.cgi?id=6729
> [2]  http://bugs.gentoo.org/show_bug.cgi?id=194973
> [3]  https://www.cvg.de/people/ensc/iwmmxt-ld.tar.gz
> 
> 
> _______________________________________________
> linux-arm mailing list
> linux-arm at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm


