[RESEND PATCH v4 8/8] arm64: Allow 64-bit tasks to invoke compat syscalls

Amanieu d'Antras amanieu at gmail.com
Tue May 18 16:51:00 PDT 2021


On Tue, May 18, 2021 at 2:03 PM Arnd Bergmann <arnd at kernel.org> wrote:
> I'm still undecided about this approach. It is an easy way to expose the 32-bit
> ABIs, it mostly copies what x86-64 already does with 32-bit syscalls and
> it doesn't expose a lot of attack surface that isn't already exposed to normal
> 32-bit tasks running compat mode.
>
> On the other hand, exposing the entire aarch32 syscall set seems both
> too broad and not broad enough: Half of the system calls behave the
> exact same way in native and compat mode, so they wouldn't need to
> be exposed like this, a lot of others are trivially emulated in user space
> by calling the native versions. The syscalls that are actually hard to do
> such as ioctl() or the signal handling will work for aarch32 emulation, but
> they are still insufficient to correctly emulate other 32-bit architectures
> that have a slightly different ABI. This means the interface is a fairly good
> fit for Tango, but much less so for FEX.
>
> It's also worth pointing out that this approach has a few things in common
> with Yury's ilp32 tree at https://github.com/norov/linux/tree/ilp32-5.2
> Unlike the x86 x32 mode, that port however does not allow calling compat
> syscalls from normal 64-bit tasks but rather keys the syscall entry point
> off the executable format, which wouldn't work here. It also uses the
> asm-generic system call numbers instead of the arm32 syscall numbers.
>
> I assume you have already considered or tried the alternative approach of
> only adding a minimal set of syscalls that are needed for the emulation.
> Having a way to limit the address space for mmap() and similar
> system calls sounds like a generally useful addition, and having an
> extended variant of ioctl() that lets you pick the target ABI (arm32, x86-32,
> ...) on supported drivers would probably be better for FEX. Can you
> explain the tradeoffs that led you towards duplicating the syscall
> entry points instead?

Tango needs the entire compat ABI to be exposed to support seccomp for
translated AArch32 processes. Here's how this works:

1. When a translated process installs a seccomp filter, Tango injects
a prefix into the seccomp program which effectively does:
    if (arch == AUDIT_ARCH_AARCH64) {
        // 64-bit syscalls used by Tango for internal operations
        if (syscall_in_tango_whitelist(nr))
            return SECCOMP_RET_ALLOW;
    }
    // continue to user-supplied seccomp program

2. When Tango performs a 32-bit syscall on behalf of the translated
process, the seccomp filter will see a syscall with AUDIT_ARCH_ARM and
the compat syscall number. This allows the user-supplied seccomp
filter to behave exactly as if it were running in a native AArch32
process.
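
As a rough sketch of the prefix from step 1 (not Tango's actual code), the
check could be written as classic BPF instructions that are prepended to the
user-supplied program. The allow-list contents, build_tango_prefix(),
run_filter() and REACHED_USER_PROGRAM are all hypothetical names made up for
this example; run_filter() is only a tiny user-space model of the handful of
opcodes used here, so the logic can be checked without installing a filter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Hypothetical allow-list of native arm64 syscalls used internally
 * (63 = read, 64 = write, 98 = futex in the asm-generic numbering). */
static const uint32_t tango_allowed[] = { 63, 64, 98 };
#define N_ALLOWED (sizeof(tango_allowed) / sizeof(tango_allowed[0]))
#define PREFIX_LEN (N_ALLOWED + 5)

/* Sentinel for the model below: execution left the prefix and would
 * continue with the first instruction of the user-supplied program. */
#define REACHED_USER_PROGRAM 0xffffffffu

/* Writes PREFIX_LEN instructions into out and returns that count. */
static size_t build_tango_prefix(struct sock_filter *out)
{
    const size_t n = N_ALLOWED;
    size_t i = 0;

    assert(n <= 250); /* jt/jf jump offsets are only 8 bits wide */

    /* if (arch != AUDIT_ARCH_AARCH64) skip the whole prefix */
    out[i++] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                                offsetof(struct seccomp_data, arch));
    out[i++] = (struct sock_filter)BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K,
                                AUDIT_ARCH_AARCH64, 0, n + 3);
    /* load the syscall number and compare it against each entry */
    out[i++] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                                offsetof(struct seccomp_data, nr));
    for (size_t j = 0; j < n; j++)
        out[i++] = (struct sock_filter)BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K,
                                tango_allowed[j], n - j, 0);
    /* no match: jump over the ALLOW and continue to the user program */
    out[i++] = (struct sock_filter)BPF_STMT(BPF_JMP | BPF_JA, 1);
    out[i++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
                                SECCOMP_RET_ALLOW);
    return i;
}

/* Minimal model of the four opcodes used above; not a full BPF
 * interpreter and not how the kernel evaluates filters. */
static uint32_t run_filter(const struct sock_filter *prog, size_t len,
                           const struct seccomp_data *sd)
{
    uint32_t a = 0;

    for (size_t pc = 0; pc < len; pc++) {
        const struct sock_filter *f = &prog[pc];

        switch (f->code) {
        case BPF_LD | BPF_W | BPF_ABS:
            memcpy(&a, (const char *)sd + f->k, sizeof(a));
            break;
        case BPF_JMP | BPF_JEQ | BPF_K:
            pc += (a == f->k) ? f->jt : f->jf;
            break;
        case BPF_JMP | BPF_JA:
            pc += f->k;
            break;
        case BPF_RET | BPF_K:
            return f->k;
        }
    }
    return REACHED_USER_PROGRAM;
}
```

In this model a native arm64 syscall on the allow-list returns
SECCOMP_RET_ALLOW, while anything else, including every AUDIT_ARCH_ARM
syscall from step 2, falls through to the user-supplied program unchanged.
One caveat of this sketch: the prefix clobbers the accumulator, so it
assumes the user program starts by loading whatever it wants to test.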