[PATCH v2 07/13] asm-generic: unaligned always use struct helpers

Arnd Bergmann arnd at kernel.org
Tue May 18 08:41:00 PDT 2021


On Tue, May 18, 2021 at 4:56 PM Linus Torvalds
<torvalds at linux-foundation.org> wrote:
>
> On Tue, May 18, 2021 at 12:27 AM Arnd Bergmann <arnd at kernel.org> wrote:
> > >
> > > I wonder if the kernel should do the same, or whether there are still cases
> > > where memcpy() isn't compiled optimally.  armv6/7 used to be one such case, but
> > > it was fixed in gcc 6.
> >
> > It would have to be memmove(), not memcpy() in this case, right?
>
> No, it would simply be something like
>
>   #define __get_unaligned_t(type, ptr) \
>         ({ type __val; memcpy(&__val, ptr, sizeof(type)); __val; })
>
>   #define get_unaligned(ptr) \
>         __get_unaligned_t(typeof(*(ptr)), ptr)
>
> but honestly, the likelihood that the compiler generates something
> horrible (possibly because of KASAN etc) is uncomfortably high.
>
> I'd prefer the __packed thing. We don't actually use -O3, and it's
> considered a bad idea, and the gcc bug is as such less likely than
> just  the above generating unacceptable code (we have several cases
> where "bad code generation" ends up being an actual bug, since we
> depend on inlining and depend on some code sequences not generating
> calls etc).

I think the important question is whether we know that the bug that Eric
pointed to can only happen with -O3, or whether it is something in
gcc-10.1 that got triggered by -O3 -msse on x86-64 but could equally
well get triggered on some other architecture without -O3 or
vector instructions enabled.

>From the gcc fix at
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9fa5b473b5b8e289b
it looks like this code path is entered when compiling with
-ftree-loop-vectorize, which is documented as

'-ftree-loop-vectorize'
     Perform loop vectorization on trees.  This flag is enabled by
     default at '-O3' and by '-ftree-vectorize', '-fprofile-use', and
     '-fauto-profile'.

-ftree-vectorize is set in arch/arm/lib/xor-neon.c
-O3 is set for the lz4 and zstd compression helpers and for wireguard.

To be on the safe side, we could pass -fno-tree-loop-vectorize along
with -O3 on the affected gcc versions, or use a bigger hammer
(not use -O3 at all, always set -fno-tree-loop-vectorize, ...).

        Arnd



More information about the linux-arm-kernel mailing list