Build failure with v4.9-rc1 and GCC trunk -- compiler weirdness

Will Deacon will.deacon at arm.com
Wed Oct 19 06:35:00 PDT 2016


Hi Ard,

On Mon, Oct 17, 2016 at 08:43:19PM +0100, Ard Biesheuvel wrote:
> On 17 October 2016 at 19:38, Will Deacon <will.deacon at arm.com> wrote:
> > I'm seeing an arm64 build failure with -rc1 and GCC trunk, although I
> > believe that the new compiler behaviour at the heart of the problem
> > has the potential to affect other architectures and other pieces of
> > kernel code relying on dead-code elimination to remove deliberately
> > undefined functions.
> >
> > The failure looks like:
> >
> >   | drivers/built-in.o: In function `armada_3700_add_composite_clk':
> >   |
> >   | linux/drivers/clk/mvebu/armada-37xx-periph.c:351:
> >   | undefined reference to `____ilog2_NaN'
> >   |
> >   | linux/drivers/clk/mvebu/armada-37xx-periph.c:351:(.text+0xc72e0):
> >   | relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol
> >   | `____ilog2_NaN'
> >   |
> >   | make: *** [vmlinux] Error 1
> >
> > and if we look at the source for armada_3700_add_composite_clk, we see
> > that this is caused by:
> >
> >   int table_size = 0;
> >
> >   rate->reg = reg + (u64)rate->reg;
> >   for (clkt = rate->table; clkt->div; clkt++)
> >          table_size++;
> >   rate->width = order_base_2(table_size);
> >
> > order_base_2 calls ilog2, which has the ____ilog2_NaN call:
> >
> > #define ilog2(n)                                \
> > (                                               \
> >         __builtin_constant_p(n) ? (             \
> >                 (n) < 1 ? ____ilog2_NaN() :     \
> >
> > This is because we're in a curious case where GCC has emitted a
> > special-cased version of armada_3700_add_composite_clk, with table_size
> > effectively constant-folded as 0. Whilst we shouldn't see this in a
> > non-buggy kernel (hence the deliberate call to the undefined function
> > ____ilog2_NaN), it means that the final link fails because we have a
> > ____ilog2_NaN in the code, with a runtime check on table_size.
> >
> 
> This is indeed an unintended side effect, but I would not call it
> weird behaviour at all. The code in its current form does not handle
> the case where it could end up passing 0 into order_base_2(), and we
> simply need to handle that case.

The reasons I think it's weird are:

  (1) The optimisation doesn't generate better code in this case --
      optimising for the table_size == 0 case is uninformed, particularly
      as that *cannot* happen at runtime (GCC probably can't tell, due
      to things like container_of, but all the clock data is static).

  (2) __builtin_constant_p(n) could be interpreted by a developer as
      "this code will execute with a constant n at runtime". With this
      issue, GCC could (in theory) generate a specialisation for every
      possible value of a variable, and return __builtin_constant_p as
      true for all of them, which somewhat undermines the point of the
      builtin.
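
For illustration only, here is a minimal userspace sketch of the pattern
under discussion: a macro that takes a compile-time path when
__builtin_constant_p(n) is true and a runtime path otherwise. The names
sketch_ilog2, sketch_ilog2_nan and sketch_ilog2_runtime are hypothetical,
and unlike the kernel's ____ilog2_NaN the "NaN" function is given a
definition here (an assumption made purely so that the example links
whatever the compiler decides to specialise).

```c
/* Userspace sketch only; not the kernel's ilog2(). All names are hypothetical. */

/*
 * In the kernel the equivalent function is deliberately left undefined,
 * so that a call surviving dead-code elimination becomes a link error.
 * It is defined here (returning -1) so that the sketch always links.
 */
static int sketch_ilog2_nan(void)
{
	return -1;
}

/* Runtime path: floor(log2(n)). The caller must pass n >= 1. */
static int sketch_ilog2_runtime(unsigned long n)
{
	return (int)(8 * sizeof(unsigned long)) - 1 - __builtin_clzl(n);
}

/*
 * If the compiler considers n constant, reject n < 1 via the "NaN"
 * function; otherwise fall through to the runtime computation.
 */
#define sketch_ilog2(n)						\
(								\
	__builtin_constant_p(n) ? (				\
		(n) < 1 ? sketch_ilog2_nan() :			\
		sketch_ilog2_runtime(n)				\
	) :							\
	sketch_ilog2_runtime(n)					\
)
```

With a literal such as sketch_ilog2(8) the constant branch is chosen; with
a value the compiler cannot prove constant, the runtime branch is. The
concern in (2) is that a specialised copy of a function can make a variable
such as table_size look constant in that copy, even though no caller
visibly passes a constant.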

> If order_base_2() is not defined for input 0, it should BUG() in that
> case, and the associated __builtin_unreachable() should prevent the
> special version from being emitted. If order_base_2() is defined for input
> 0, it should not invoke ilog2() with that argument, and the problem should
> go away as well.

I don't think it should necessarily BUG() if it's not defined for input
0; things like __ffs don't do that, and we'd be introducing conditional
checks for cases that should not happen. The comment above order_base_2
does suggest that order_base_2(0) should return 0, but it can actually
end up invoking ilog2(-1), which is obviously wrong.
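
One way to picture the second option (order_base_2() defined for 0 and
never reaching ilog2() with 0 or -1) is the userspace sketch below. It is
an assumption about what such a definition could look like, not the
kernel's implementation; sketch_order_base_2 and sketch_floor_log2 are
hypothetical names.

```c
/* Userspace sketch only; names are hypothetical, not kernel APIs. */

/* floor(log2(n)). The caller must guarantee n >= 1. */
static int sketch_floor_log2(unsigned long n)
{
	return (int)(8 * sizeof(unsigned long)) - 1 - __builtin_clzl(n);
}

/*
 * ceil(log2(n)) for n >= 1, with 0 mapped to 0 explicitly so that the
 * log helper never sees 0 or (unsigned long)-1.
 */
static int sketch_order_base_2(unsigned long n)
{
	if (n <= 1)
		return 0;
	return sketch_floor_log2(n - 1) + 1;
}
```

This costs one comparison, but every input then has a defined result and
there is no path on which the log helper is handed an out-of-range value.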

I could update the comment, but that doesn't fix the build issue.

Will



More information about the linux-arm-kernel mailing list