Random failures while loading iwmmxt compiled binaries
Marek Vasut
marek.vasut at gmail.com
Tue Jun 22 14:24:43 EDT 2010
On Tuesday 22 June 2010 18:15:33 Enrico Scholz wrote:
> Hi,
>
> I have the problem that program loading segfaults sometimes or aborts
> with
CCed interested parties
>
> | Inconsistency detected by ld.so: ../elf/dl-sysdep.c: 465:
> | _dl_important_hwcaps: Assertion `m == cnt' failed!
>
> Similar reports exist in the glibc[1] and Gentoo[2] bug trackers; the
> failure happens very seldom during normal usage.
>
> I analyzed it partially and can reproduce it in <10 minutes on PXA270
> and PXA320 platforms, but I have not been able to find a solution yet.
> These platforms were running kernel 2.6.34 (PXA270) and 2.6.31 (PXA320).
>
>
> Conclusions:
> ============
>
> * it is not a compiler bug
>
> * it is not an eglibc bug
>
> * it can only be the kernel (iwmmxt state not restored properly on
> task switch?) or a bug in the silicon
>
>
> Steps to reproduce it:
> ======================
>
> 1. Build (e)glibc and busybox with -march=iwmmxt -mcpu=iwmmxt
> -mtune=iwmmxt; I placed binaries at [3] which are based on OpenEmbedded's
> gcc 4.3.4 and eglibc 2.12.
>
> 2. Place the binaries into a tmpfs (that's important for triggering the
> bug in a short time; NFS or MTD takes longer)
>
> mount -t tmpfs -o size=8m none /tmp
> mkdir /tmp/bin /tmp/lib
> cp /bin/busybox /tmp/bin/
> cp /lib/ld[.-]* /tmp/lib
> cp /lib/lib[cm][.-]* /tmp/lib
> ln -s busybox /tmp/bin/sh
> ln -s busybox /tmp/bin/sed
>
> 3. Create a testscript like
>
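> # 'EOF' is quoted so that the backticks and $ are written to the script
> # literally instead of being expanded while the file is created
> cat << 'EOF' > /tmp/x
> #! /bin/sh
>
> while test `echo abcdefghijkl |
> sed 's!.$!!' | sed 's!.$!!' | sed 's!.$!!' | \
> sed 's!.$!!' | sed 's!.$!!' | sed 's!.$!!' | \
> sed 's!.$!!' | sed 's!^.!!' | sed 's!^.!!'` = cde; do
> : # empty body; all the work happens in the loop condition
> done
> EOF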
>
> 4. Chroot into /tmp and execute the script
>
> chroot /tmp
> sh /x
>
>
> After some time, you will get either the assertion or a segfault.
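The same kind of stress loop can also be driven from C, which makes it easy to tell a segfault from a non-zero exit. The sketch below is not part of the original report: `count_abnormal_exits` is a hypothetical helper that repeatedly forks and executes a program and counts the runs that were killed by a signal or returned a non-zero status.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original report): execute argv[0]
 * `iterations` times and count the runs that did not finish cleanly,
 * i.e. were killed by a signal (such as a segfault) or exited with a
 * non-zero status. Returns -1 if fork() or waitpid() itself fails.
 */
int count_abnormal_exits(char *const argv[], int iterations)
{
    int failures = 0;

    assert(argv != NULL && argv[0] != NULL);

    for (int i = 0; i < iterations; ++i) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            execv(argv[0], argv);
            _exit(127);                 /* only reached if exec failed */
        }

        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;

        if (WIFSIGNALED(status)) {
            fprintf(stderr, "run %d: killed by signal %d\n",
                    i, WTERMSIG(status));
            ++failures;
        } else if (WIFEXITED(status) && WEXITSTATUS(status) != 0) {
            fprintf(stderr, "run %d: exit status %d\n",
                    i, WEXITSTATUS(status));
            ++failures;
        }
    }
    return failures;
}
```

Called with, for example, the argument vector { "/bin/sed", "s!.$!!", "/dev/null", NULL } and a large iteration count inside the chroot, it would report each run that failed to load or crashed; the exact command is an assumption and should be adapted to the binaries under test.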
>
>
>
> Analysis
> ========
>
> With 'user_debug=-1' on the kernel command line, the segfault gives
>
> [ 2280.392473] sed: unhandled page fault (11) at 0x40026200, code 0x017
> [ 2280.392500] pgd = c7bf0000
> [ 2280.395191] [40026200] *pgd=87b57031, *pte=00000000, *ppte=00000000
> [ 2280.401529]
> [ 2280.408893] Pid: 8831, comm: sed
> [ 2280.418649] CPU: 0 Not tainted (2.6.31.13 #1)
> [ 2280.424497] PC is at 0x40016a84
> [ 2280.427610] LR is at 0x40014b0c
> [ 2280.430724] pc : [<40016a84>] lr : [<40014b0c>] psr: 20000010
> [ 2280.430733] sp : befff8f8 ip : befff910 fp : befff99c
> [ 2280.460747] r10: 00000000 r9 : 6474e552 r8 : 00000004
> [ 2280.466537] r7 : 00000000 r6 : 00000012 r5 : 00000201 r4 : 40026202
> [ 2280.473452] r3 : befff8f8 r2 : 00000009 r1 : 40026200 r0 : 40026202
> [ 2280.479937] Flags: nzCv IRQs on FIQs on Mode USER_32 ISA ARM Segment user
> [ 2280.487354] Control: 0400397f Table: 87bf0018 DAC: 00000015
>
> Register -> Variable mapping:
> -----------------------------
>
> r6,r7: 'masked' --> strange: it seems to be always 0x12
> r8: 'm'
> r5: 'n'
> r2: 'masked' >> 'n & 0xff' (r3 destroyed by code calling strlen())
>
>
> The PC is in strlen(), which was called from _dl_important_hwcaps() (LR):
>
> 00016a80 <strlen>:
> 16a80: e3c01003 bic r1, r0, #3 ; 0x3
> 16a84: e4912004 ldr r2, [r1], #4
>
> 000147ec <_dl_important_hwcaps>:
>    14aa4: e1961007  orrs   r1, r6, r7          <<<< the two 32 bit words of 'masked'
>    14aa8: 0a000023  beq    14b3c <_dl_important_hwcaps+0x350>
>    14aac: e51b4068  ldr    r4, [fp, #-104]
>    14ab0: e51b206c  ldr    r2, [fp, #-108]
>    14ab4: e3a00001  mov    r0, #1 ; 0x1
>    14ab8: e3a01000  mov    r1, #0 ; 0x0
>    14abc: e084e002  add    lr, r4, r2
>    14ac0: e28e4050  add    r4, lr, #80 ; 0x50
>    14ac4: e3a05000  mov    r5, #0 ; 0x0        <<<< this is 'n'
>    14ac8: ec41000a  tmcrr  wr10, r0, r1
>    14acc: ea000003  b      14ae0 <_dl_important_hwcaps+0x2f4>
>    14ad0: e1961007  orrs   r1, r6, r7          <<<< start of loop
>    14ad4: e284400a  add    r4, r4, #10 ; 0xa
>    14ad8: 0a000017  beq    14b3c <_dl_important_hwcaps+0x350>
>    14adc: e2855001  add    r5, r5, #1 ; 0x1
>    14ae0: ec476004  tmcrr  wr4, r6, r7         <<<< move 'masked' into cp
>    14ae4: ee085110  tmcr   wcgr0, r5
>    14ae8: eee40148  wsrldg wr0, wr4, wcgr0     <<<< right shift of 'masked' for n & 0xff bits
>    14aec: ec532000  mra    r2, r3, acc0        <<< this is tricky... acc0 is wr0
>    14af0: e2020001  and    r0, r2, #1 ; 0x1
>    14af4: e3500000  cmp    r0, #0 ; 0x0        <<<< that's the 'if (...'
>    14af8: 0afffff4  beq    14ad0 <_dl_important_hwcaps+0x2e4>
>    14afc: e51b3044  ldr    r3, [fp, #-68]
>    14b00: e1a00004  mov    r0, r4
>    14b04: e7834188  str    r4, [r3, r8, lsl #3]
> >> 14b08: eb0007dc  bl     16a80 <strlen>
>    14b0c: e51b1044  ldr    r1, [fp, #-68]
>    14b10: ee095110  tmcr   wcgr1, r5           <<<< move 'n' into cp
>    14b14: eeda5149  wslldg wr5, wr10, wcgr1    <<< left shift of 1 for n & 0xff bits
>    14b18: e081c188  add    ip, r1, r8, lsl #3
>    14b1c: e58c0004  str    r0, [ip, #4]
>    14b20: ec510005  tmrrc  r0, r1, wr5         <<<< split 64 bit word
>    14b24: e0266000  eor    r6, r6, r0          <<<< and do the ^=
>    14b28: e0277001  eor    r7, r7, r1
>    14b2c: e1961007  orrs   r1, r6, r7
>    14b30: e2888001  add    r8, r8, #1          <<<< this is 'm'
>    14b34: e284400a  add    r4, r4, #10
>    14b38: 1affffe7  bne    14adc <_dl_important_hwcaps+0x2f0>
>
> The corresponding code is
>
> ------------
> uint64_t masked = GLRO(dl_hwcap) & GLRO(dl_hwcap_mask);
>
> for (n = 0; masked != 0; ++n)
>   if ((masked & (1ULL << n)) != 0)
>     {
>       temp[m].str = _dl_hwcap_string (n);
>       temp[m].len = strlen (temp[m].str);
>       masked ^= 1ULL << n;
>       ++m;
>     }
> ------------
>
>
> The crash happens because 'n' (stored in r5) becomes 0x201 and an array
> outside of the mapped memory is accessed.
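The register dump can be cross-checked with a little arithmetic. The sketch below rests on one reading of the listing that the report does not state explicitly: r4 (the string pointer handed to strlen()) advances by 10 each time 'n' in r5 advances by 1. The helper names are hypothetical; the constants are copied from the dump.

```c
#include <assert.h>
#include <stdint.h>

/* strlen() starts with "bic r1, r0, #3": round the pointer down to a
 * 4-byte boundary before the first word load. */
uint32_t word_align(uint32_t ptr)
{
    return ptr & ~UINT32_C(3);
}

/* Assumption from reading the listing: r4 grows by 10 per step of n,
 * so the pointer that belonged to n == 0 is r4 - 10 * n. */
uint32_t string_base(uint32_t r4, uint32_t n)
{
    return r4 - UINT32_C(10) * n;
}

/* Consistency checks against the values in the register dump. */
void check_register_dump(void)
{
    /* r0 = 0x40026202 aligns to r1 = 0x40026200, the faulting address. */
    assert(word_align(UINT32_C(0x40026202)) == UINT32_C(0x40026200));

    /* With n = 0x201 the pointer is 10 * 0x201 = 0x140a bytes past the
     * address used for n == 0. */
    assert(string_base(UINT32_C(0x40026202), UINT32_C(0x201))
           == UINT32_C(0x40024df8));
}
```

Under that assumption the faulting pointer lies 0x140a bytes beyond the start of the string table, which fits the observation that the access falls outside the mapped memory.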
>
> The value of 0x201 means that the loop has missed its end condition
> twice (the first time at 0x01 and the second time at 0x101).
>
> The assertion probably happens when the loop exits in the second round
> (0x101).
>
>
> The generated assembly code looks sane to me and I do not see how 'n'
> can become >64 (or I looked at it too long and missed the obvious).
> Within the loop, only strlen() is called and this alters r0-r3 only.
>
> As I wrote above, 'masked' always seems to be 0x12, which means that
> it is always the same bits that have not been cleared.
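One way to explore this observation is a small C model of the loop. It is only a sketch and relies on assumptions that the report does not establish: the shift count is taken as n & 0xff, counts of 64 or more produce 0, and the constant 1 that the code keeps in wr10 is represented by the hypothetical parameter `one`. The names `shr8`, `shl8` and `run_loop` are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed shift semantics: only the low 8 bits of the count are used,
 * and a count of 64 or more yields 0. */
static uint64_t shr8(uint64_t v, unsigned n)
{
    n &= 0xff;
    return n >= 64 ? 0 : v >> n;
}

static uint64_t shl8(uint64_t v, unsigned n)
{
    n &= 0xff;
    return n >= 64 ? 0 : v << n;
}

/*
 * Model of the loop in _dl_important_hwcaps(). `one` stands for the
 * value held in wr10 (1 when everything works). Every n at which the
 * bit test succeeds is stored in hits[]; the model stops when masked
 * becomes 0 or after max_hits hits. Returns the number of hits ('m').
 */
unsigned run_loop(uint64_t masked, uint64_t one,
                  unsigned *hits, unsigned max_hits)
{
    unsigned m = 0;

    assert(hits != NULL);

    for (unsigned n = 0; masked != 0 && m < max_hits; ++n) {
        if (shr8(masked, n) & 1) {
            hits[m++] = n;
            masked ^= shl8(one, n);   /* clears the bit only if one == 1 */
        }
    }
    return m;
}
```

With masked = 0x12 and one = 1 the model records hits at n = 1 and n = 4 and then terminates. With one = 0 (the constant lost, for whatever reason) 'masked' stays at 0x12 and the fifth hit occurs at n = 0x201 with m = 4, which matches r5 and r8 in the dump. That agreement is suggestive only; it does not show which coprocessor register, if any, is actually corrupted.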
>
>
> Does anybody have other ideas or a solution?
>
>
>
> Enrico
>
> Footnotes:
> [1] http://sourceware.org/bugzilla/show_bug.cgi?id=6729
> [2] http://bugs.gentoo.org/show_bug.cgi?id=194973
> [3] https://www.cvg.de/people/ensc/iwmmxt-ld.tar.gz
>
>
> _______________________________________________
> linux-arm mailing list
> linux-arm at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm