[RFC PATCH 05/13] x86/um: nommu: syscall translation by zpoline
Johannes Berg
johannes at sipsolutions.net
Fri Oct 25 02:19:25 PDT 2024
On Thu, 2024-10-24 at 21:09 +0900, Hajime Tazaki wrote:
> This commit adds a mechanism to hook syscalls for unmodified userspace
> programs used under UML in !MMU mode. The mechanism, called zpoline,
> translates syscall/sysenter instructions with `call *%rax`, which can be
> processed by a trampoline code also installed upon an initcall during
> boot. The translation is triggered by elf_arch_finalize_exec(), an arch
> hook introduced by another commit.
>
> All syscalls issued by userspace thus redirected to a speicific function,
typo: "specific"
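For readers who don't know zpoline, the translation described above can be sketched roughly as below. This is only an illustration, not the code from the patch: translate_syscalls() is a hypothetical name, and a raw byte scan like this one can also match 0f 05 inside immediates or data, so a real translator would presumably have to respect instruction boundaries. It relies on syscall (0f 05), sysenter (0f 34) and call *%rax (ff d0) all being two bytes long, so the rewrite can be done in place. Since %rax holds the syscall number at that point, the call lands in the nop slide at address 0 and falls through to the trampoline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Naive illustration only: rewrite every 2-byte syscall (0f 05) or
 * sysenter (0f 34) encoding found in buf into call *%rax (ff d0).
 * Returns the number of rewritten sites.
 *
 * translate_syscalls() is a hypothetical name, not a function from
 * the patch, and it does not decode instructions.
 */
size_t translate_syscalls(uint8_t *buf, size_t len)
{
	size_t i, n = 0;

	assert(buf != NULL || len == 0);

	for (i = 0; i + 1 < len; i++) {
		if (buf[i] == 0x0f &&
		    (buf[i + 1] == 0x05 || buf[i + 1] == 0x34)) {
			buf[i] = 0xff;
			buf[i + 1] = 0xd0;
			n++;
			i++;	/* skip the second byte of the rewritten pair */
		}
	}
	return n;
}
```

Anything written into memory after this pass has run (JITed code, for instance) would not be covered by it.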
> +	if (down_write_killable(&mm->mmap_lock)) {
> +		err = -EINTR;
> +		return err;
?
What happens if the binary JITs some code and you don't find it? I don't
remember from your talk; there you seemed to say this was fine, just
slow, but that was zpoline in a different context (container)?
Perhaps UML could additionally install a seccomp filter or something on
itself while running a userspace program? Hmm.
> +/**
> + * setup trampoline code for syscall hooks
> + *
> + * the trampoline code guides to call hooked function, __kernel_vsyscall
> + * in this case, via nop slides at the memory address zero (thus, zpoline).
> + *
> + * loaded binary by exec(2) is translated to call the function.
> + */
> +static int __init setup_zpoline_trampoline(void)
> +{
> +	int i, ret;
> +	int ptr;
> +
> +	/* zpoline: map area of trampoline code started from addr 0x0 */
> +	__zpoline_start = 0x0;
> +
> +	ret = os_map_memory((void *) 0, -1, 0, 0x1000, 1, 1, 1);
Use (UM_)PAGE_SIZE instead of the hard-coded 0x1000?
> + /**
> + * FIXME: shit red zone area to properly handle the case
"shift"? :)
> + */
> +
> + /**
> + * put code for jumping to __kernel_vsyscall.
> + *
> + * here we embed the following code.
> + *
> + * movabs [$addr],%r11
> + * jmpq *%r11
> + *
> + */
> + ptr = NR_syscalls;
> + /* 49 bb [64-bit addr (8-byte)] movabs [64-bit addr (8-byte)],%r11 */
> + __zpoline_start[ptr++] = 0x49;
> + __zpoline_start[ptr++] = 0xbb;
> + __zpoline_start[ptr++] = ((uint64_t)
> + __kernel_vsyscall >> (8 * 0)) & 0xff;
The "& 0xff" seems pointless when storing into a u8 array?
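To illustrate what I mean: storing a wider value into a u8 element already keeps only the low 8 bits, so a shift is enough. A minimal sketch, assuming the trampoline bytes are uint8_t; emit_jump() is a hypothetical helper, not something from the patch. The explicit cast is only there to document the truncation (and keep -Wconversion quiet), no mask is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Emit "movabs $addr,%r11; jmpq *%r11" into buf and return the number
 * of bytes written (2 + 8 + 3 = 13).
 *
 * emit_jump() is a hypothetical helper for illustration, not part of
 * the patch. buf must have room for at least 13 bytes.
 */
size_t emit_jump(uint8_t *buf, uint64_t addr)
{
	size_t ptr = 0;
	int i;

	assert(buf != NULL);

	buf[ptr++] = 0x49;	/* REX.W + REX.B prefix */
	buf[ptr++] = 0xbb;	/* mov imm64 into r11 */
	for (i = 0; i < 8; i++)
		/* conversion to uint8_t keeps the low 8 bits, no mask needed */
		buf[ptr++] = (uint8_t)(addr >> (8 * i));
	buf[ptr++] = 0x41;	/* REX.B prefix */
	buf[ptr++] = 0xff;	/* jmp r/m64 opcode */
	buf[ptr++] = 0xe3;	/* ModRM: /4, r11 */
	return ptr;
}
```

A loop like this would also replace the eight open-coded stores.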
> + /* permission: XOM (PROT_EXEC only) */
> + ret = os_protect_memory(0, 0x1000, 0, 0, 1);
(UM_)PAGE_SIZE here as well, instead of 0x1000?
johannes