Weirdo gcc 4.3.x behaviour

Russell King - ARM Linux linux at arm.linux.org.uk
Sun Nov 21 14:11:48 EST 2010


Has anyone else seen weird behaviour from the gcc 4.3 branch?

I've tried gcc-4.3.2 + patch and also gcc-4.3.5, both of which behave
differently with this change in the kernel:

(asm-generic/pgtable-nopud.h)
static inline int pgd_none(pgd_t pgd)           { return 0; }
static inline int pgd_bad(pgd_t pgd)            { return 0; }
static inline int pgd_present(pgd_t pgd)        { return 1; }

and subsequently adding:
+#define pgd_none(pgd)          0
+#define pgd_bad(pgd)           0
+#define pgd_present(pgd)       1
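
For anyone wanting to poke at this outside the kernel tree, here is a
minimal standalone sketch of the two forms side by side. It is not
kernel code: pgd_t is a stand-in typedef, the _fn/_def suffixes are
made up so both forms can live in one file, and forms_agree() and
check_forms() are hypothetical helpers. It only shows that the two
forms return the same values; it does not reproduce the code size
differences.

```c
#include <assert.h>

/* Stand-in for the kernel's pgd_t; the real type is arch-specific. */
typedef struct { unsigned long pgd; } pgd_t;

/* Form 1: static inline helpers (the _fn suffix is hypothetical). */
static inline int pgd_none_fn(pgd_t pgd)    { (void)pgd; return 0; }
static inline int pgd_bad_fn(pgd_t pgd)     { (void)pgd; return 0; }
static inline int pgd_present_fn(pgd_t pgd) { (void)pgd; return 1; }

/* Form 2: macros (the _def suffix is hypothetical). */
#define pgd_none_def(pgd)       0
#define pgd_bad_def(pgd)        0
#define pgd_present_def(pgd)    1

/* Returns 1 when both forms give the same answers for this entry. */
int forms_agree(pgd_t pgd)
{
	return pgd_none_fn(pgd) == pgd_none_def(pgd) &&
	       pgd_bad_fn(pgd) == pgd_bad_def(pgd) &&
	       pgd_present_fn(pgd) == pgd_present_def(pgd);
}

/* Aborts if the two forms ever disagree for this entry. */
void check_forms(pgd_t pgd)
{
	assert(forms_agree(pgd));
}
```

One difference between the forms is that the macros never evaluate
their argument, whereas the inline functions take it by value.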

This change causes completely unrelated functions to be optimized
differently, as can be seen via the bloat-o-meter:

add/remove: 0/0 grow/shrink: 3/1 up/down: 20/-4 (16)
function                                     old     new   delta
shmem_file_aio_read                          820     832     +12
select_task_rq_fair                         1824    1828      +4
kswapd                                      1816    1820      +4
elf_core_dump                               3264    3260      -4

The strange thing is that (e.g.) elf_core_dump() doesn't use any of
these macros - and the changes in its code seem to be completely
unrelated too:

c00f8cc8:       ea00018c        b       c00f9300 <elf_core_dump+0xc38>
c00f8ccc:       e3a05000        mov     r5, #0  ; 0x0
c00f8cd0:       e50b508c        str     r5, [fp, #-140]
c00f8cd4:       e50b5088        str     r5, [fp, #-136]
c00f8cd8:       ea000167        b       c00f927c <elf_core_dump+0xbb4>

vs

c00f8cb4:       ea00018d        b       c00f92f0 <elf_core_dump+0xc3c>
c00f8cb8:       e3a00000        mov     r0, #0  ; 0x0
c00f8cbc:       e1a05000        mov     r5, r0
c00f8cc0:       e50b008c        str     r0, [fp, #-140]
c00f8cc4:       e50b0088        str     r0, [fp, #-136]
c00f8cc8:       ea000167        b       c00f926c <elf_core_dump+0xbb8>

and:

c00f8d18:       e1a08000        mov     r8, r0
c00f8d1c:       e2580000        subs    r0, r8, #0      ; 0x0
c00f8d20:       e50b008c        str     r0, [fp, #-140]
c00f8d24:       050b0088        streq   r0, [fp, #-136]
c00f8d28:       0a00013e        beq     c00f9228 <elf_core_dump+0xb60>

vs

c00f8d08:       e1a08000        mov     r8, r0
c00f8d0c:       e2581000        subs    r1, r8, #0      ; 0x0
c00f8d10:       e50b108c        str     r1, [fp, #-140]
c00f8d14:       050b1088        streq   r1, [fp, #-136]
c00f8d18:       0a00013e        beq     c00f9218 <elf_core_dump+0xb64>

The changes seem to be minor register allocation differences, but what
worries me is why this should change at all when these functions don't
go anywhere near the page table accessors.

I've had these kinds of changes happen in snd_pcm_hw_rule_ratdens(),
which definitely has nothing to do with page tables, as a result of
changing the way page tables are accessed.

The issue here is that with this noise created by the compiler, it's
not possible to properly evaluate the effect of these kinds of changes
to the page tables - and I'm wondering if this could be related to the
mysterious crashes and MMC read corruption which have been appearing
and disappearing on me recently on Versatile due to the L2x0 changes.

What worries me is this: if unrelated functions can influence the
optimization of following functions, is it possible that some
optimizations are mis-applied, leading to incorrect code generation?
That would make gcc 4.3 an untrustworthy compiler.


