[RFC PATCH] arm64: compat: Implement misalignment fixups for multiword loads

Arnd Bergmann arnd at arndb.de
Tue Jun 28 06:22:49 PDT 2022


On Thu, Jun 23, 2022 at 7:24 PM Ard Biesheuvel <ardb at kernel.org> wrote:
>
> The 32-bit ARM kernel implements fixups on behalf of user space when
> using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
> aligned. This is not something that is supported by the architecture,
> but was done anyway to increase compatibility with user space software,
> which mostly targeted x86 at the time and did not care about aligned
> accesses.
>
> This feature is one of the remaining impediments to being able to switch
> to 64-bit kernels on 64-bit capable hardware running 32-bit user space,
> so let's implement it for the arm64 compat layer as well.
>
> Note that the intent is to implement the exact same handling of
> misaligned multi-word loads and stores as the 32-bit kernel does,
> including what appears to be missing support for user space programs
> that rely on SETEND to switch to a different byte order and back. Also,
> like the 32-bit ARM version, we rely on the faulting address reported by
> the CPU to infer the memory address, instead of decoding the instruction
> fully to obtain this information.
>
> This implementation is taken from the 32-bit ARM tree, with all pieces
> removed that deal with instructions other than LDRD/STRD and LDM/STM, or
> that deal with alignment exceptions taken in kernel mode.
>
> Signed-off-by: Ard Biesheuvel <ardb at kernel.org>

Thanks a lot for implementing this! I know this has been a problem in
particular for Debian, as the lack of this emulation has prevented them
from migrating the build environment to modern 64-bit machines while
the remaining 32-bit boxes in the build infrastructure are showing
their age.

I've added the debian-arm list and some developers who are probably
interested in it.

> +config COMPAT_ALIGNMENT_FIXUPS
> +       bool "Fix up misaligned multi-word loads and stores in user space"
> +       default y

My initial thought was that we probably want both compile-time and
runtime switches for this, the same way that ARMV8_DEPRECATED
does, but keeping it simple with just a compile-time option is totally
fine as well, as far as I'm concerned.

If we end up wanting a runtime switch after all, it should probably
follow the interface from Documentation/arm64/legacy_instructions.rst,
though it's not obvious how best to do that, since the instructions are
otherwise available for aligned data as designed, and not deprecated
at all.
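
Something along these lines might be enough, as a plain runtime opt-out
rather than the full legacy_instructions machinery (completely untested
sketch; the /proc/sys/abi/compat_alignment_fixup name and the flag are
made up):

        /* needs <linux/sysctl.h> and <linux/init.h> */
        static int compat_alignment_fixup_enabled = 1;

        static struct ctl_table compat_alignment_sysctl[] = {
                {
                        .procname     = "compat_alignment_fixup",
                        .data         = &compat_alignment_fixup_enabled,
                        .maxlen       = sizeof(int),
                        .mode         = 0644,
                        .proc_handler = proc_dointvec_minmax,
                        .extra1       = SYSCTL_ZERO,
                        .extra2       = SYSCTL_ONE,
                },
                { }
        };

        static int __init compat_alignment_sysctl_init(void)
        {
                register_sysctl("abi", compat_alignment_sysctl);
                return 0;
        }
        core_initcall(compat_alignment_sysctl_init);

plus an early bail-out in the fixup handler when the flag is clear, so
the faulting process just gets the SIGBUS it would get today.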

Adding calls to trace_instruction_emulation() might be helpful
to allow tracing the fixups the same way we do for setend, swp
and the cp15 barriers.
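
E.g. something like this, with a made-up label string, and assuming the
tracepoint from arch/arm64/kernel/trace-events-emulation.h can be shared
outside armv8_deprecated.c:

        trace_instruction_emulation("compat alignment fixup", regs->pc);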

> + * *** NOTE ***
> + * This code is not portable to processors with late data abort handling.
> + */

I see the comment is copied from the arm32 version. Is it actually relevant
for arm64 though?

> +static void *
> +do_alignment_t32_to_handler(u32 *pinstr, struct pt_regs *regs,
> +                           union offset_union *poffset)
> +{
> +       u32 instr = *pinstr;
> +       u16 tinst1 = (instr >> 16) & 0xffff;
> +       u16 tinst2 = instr & 0xffff;
> +
> +       switch (tinst1 & 0xffe0) {
> +       /* A6.3.5 Load/Store multiple */
> +       case 0xe880:            /* STM/STMIA/STMEA,LDM/LDMIA, PUSH/POP T2 */
> +       case 0xe8a0:            /* ...above writeback version */
> +       case 0xe900:            /* STMDB/STMFD, LDMDB/LDMEA */
> +       case 0xe920:            /* ...above writeback version */
> +               /* no need offset decision since handler calculates it */
> +               return do_alignment_ldmstm;
> +
> +       case 0xf840:            /* POP/PUSH T3 (single register) */
> +               if (RN_BITS(instr) == 13 && (tinst2 & 0x09ff) == 0x0904) {
> +                       u32 L = !!(LDST_L_BIT(instr));
> +                       const u32 subset[2] = {
> +                               0xe92d0000,     /* STMDB sp!,{registers} */
> +                               0xe8bd0000,     /* LDMIA sp!,{registers} */
> +                       };
> +                       *pinstr = subset[L] | (1<<RD_BITS(instr));
> +                       return do_alignment_ldmstm;
> +               }

The code clearly shows its age here; I doubt we'd do the function pointer
handling the same way these days, but I think you made the right call in
keeping close to the original version while removing most of the irrelevant
cases.
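
Purely as an illustration of what I mean (names invented, not a request
to change anything), I'd expect a modern version to return an enum and
let the caller dispatch in a switch rather than jumping through a void *:

        enum align_fixup { FIXUP_NONE, FIXUP_LDMSTM, FIXUP_LDRDSTRD };

        static enum align_fixup classify_t32(u32 instr)
        {
                switch ((instr >> 16) & 0xffe0) {
                case 0xe880:            /* STM/LDM T2 */
                case 0xe8a0:            /* ... writeback variant */
                case 0xe900:            /* STMDB/LDMDB */
                case 0xe920:            /* ... writeback variant */
                        return FIXUP_LDMSTM;
                default:
                        return FIXUP_NONE;
                }
        }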

> +static int alignment_get_arm(struct pt_regs *regs, u32 __user *ip, u32 *inst)
> +{
> +       u32 instr = 0;
> +       int fault;
> +
> +       fault = get_user(instr, ip);
> +       if (fault)
> +               return fault;
> +
> +       *inst = __le32_to_cpu(instr);
> +       return 0;
> +}
> +
> +static int alignment_get_thumb(struct pt_regs *regs, u16 __user *ip, u16 *inst)
> +{
> +       u16 instr = 0;

I think the types need to be adjusted, e.g. s/u32/__le32/ to avoid sparse
warnings.
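
Something like this for the ARM variant, and the same idea with __le16
for the thumb one:

        static int alignment_get_arm(struct pt_regs *regs, __le32 __user *ip,
                                     u32 *inst)
        {
                __le32 instr = 0;
                int fault;

                fault = get_user(instr, ip);
                if (fault)
                        return fault;

                *inst = __le32_to_cpu(instr);
                return 0;
        }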

        Arnd


