gcc miscompiles csum_tcpudp_magic() on ARMv5

Russell King - ARM Linux linux at arm.linux.org.uk
Thu Dec 12 10:00:27 EST 2013


On Thu, Dec 12, 2013 at 03:52:43PM +0100, Maxime Bizon wrote:
> 
> On Thu, 2013-12-12 at 14:42 +0000, Måns Rullgård wrote:
> > 
> > Again, that's an optimisation that does not alter the semantics of the
> > code. Although the generated code looks very different, it does the
> > same thing.
> > 
> It cannot do the same thing, as there is possibly nothing to do after
> inlining.
> 
> 
> static __attribute__((noinline)) unsigned int do_nothing(unsigned char foo)
> {
>         foo += 42;
>         return 0;
> }
> 
> int func(int a)
> {
>         return do_nothing(a);
> }
> 
> 00000000 <do_nothing>:
>    0:	e3a00000 	mov	r0, #0
>    4:	e12fff1e 	bx	lr
> 
> 00000008 <func>:
>    8:	e52de004 	push	{lr}		; (str lr, [sp, #-4]!)
>    c:	e24dd004 	sub	sp, sp, #4
>   10:	e20000ff 	and	r0, r0, #255	; 0xff
>   14:	ebfffff9 	bl	0 <do_nothing>
>   18:	e28dd004 	add	sp, sp, #4
>   1c:	e8bd8000 	ldmfd	sp!, {pc}
> 
> 
> static inline unsigned int do_nothing(unsigned char foo)
> {
>         foo += 42;
>         return 0;
> }
> 
> int func(int a)
> {
>         return do_nothing(a);
> }
> 
> 
> 00000000 <func>:
>    0:	e3a00000 	mov	r0, #0
>    4:	e12fff1e 	bx	lr
> 
> 
> In the first case, the compiler narrows "int a" to char and calls the
> uninlined function.
> 
> In the second case, there is absolutely no generated code to push any
> arguments as the function that does nothing is inlined into func().

This is different - the compiler has recognised in both cases that the
addition of 42 to foo is useless as the result is not used, and therefore
has optimised the addition away.  In the second case, it has also realised
that the narrowed value which 42 would have been added to is not used
either, and it has optimised the narrowing cast away as well.

A better test case would be to do this:

	foo += 42;
	return foo;

so that "foo" is actually used.  Or, if you don't feel happy with that:

extern void use_result(unsigned int);

	foo += 42;
	use_result(foo);
	return 0;

so that the compiler can't decide that 'foo' is never used.
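
Put together, a complete test file along those lines might look roughly
like the sketch below.  The function names (add_noinline, add_inline,
sink_inline and the func_* wrappers) and the last_seen variable are made
up for illustration.  use_result() is defined in the same file only so
that the example links on its own; in a real test it would more likely
live in a separate translation unit, as the extern declaration above
implies, so that the compiler cannot see into it.

```c
/* Hypothetical sink: stores the value so the compiler must compute it. */
volatile unsigned int last_seen;

__attribute__((noinline)) void use_result(unsigned int v)
{
	last_seen = v;
}

/* Out-of-line variant: the narrowed, incremented value is returned. */
static __attribute__((noinline)) unsigned int add_noinline(unsigned char foo)
{
	foo += 42;
	return foo;
}

/* Inlinable variant with the same body. */
static inline unsigned int add_inline(unsigned char foo)
{
	foo += 42;
	return foo;
}

/* Inlinable variant that hands the value to use_result() instead. */
static inline unsigned int sink_inline(unsigned char foo)
{
	foo += 42;
	use_result(foo);
	return 0;
}

/* Callers passing an int, so the narrowing to unsigned char matters. */
int func_noinline(int a)
{
	return add_noinline(a);
}

int func_inline(int a)
{
	return add_inline(a);
}

int func_sink(int a)
{
	return sink_inline(a);
}
```

Because "foo" is now used in every variant, the narrowing to unsigned char
has to survive in the generated code whether or not the callee is inlined,
so comparing the disassembly of the func_* wrappers should show whether
the compiler treats the two cases consistently.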


