[RFC PATCH 05/13] x86/um: nommu: syscall translation by zpoline

Johannes Berg johannes at sipsolutions.net
Fri Oct 25 02:19:25 PDT 2024


On Thu, 2024-10-24 at 21:09 +0900, Hajime Tazaki wrote:
> This commit adds a mechanism to hook syscalls for unmodified userspace
> programs used under UML in !MMU mode. The mechanism, called zpoline,
> translates syscall/sysenter instructions with `call *%rax`, which can be
> processed by a trampoline code also installed upon an initcall during
> boot. The translation is triggered by elf_arch_finalize_exec(), an arch
> hook introduced by another commit.
> 
> All syscalls issued by userspace thus redirected to a speicific function,

typo: "specific"

> +	if (down_write_killable(&mm->mmap_lock)) {
> +		err = -EINTR;
> +		return err;

? Why not just return -EINTR directly?


What happens if the binary JITs some code and you don't find the syscall
instructions in it? I don't remember from your talk; there you seemed to
say this was fine, just slow, but that was zpoline in a different context
(container)?
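
To make the concern concrete, here is a minimal sketch of a one-shot
rewrite pass. It is not the code from this patch: rewrite_syscalls() is a
hypothetical helper, and it does a naive byte scan where a real rewriter
has to decode instruction boundaries. It only relies on the encodings
syscall (0f 05), sysenter (0f 34) and call *%rax (ff d0), which are all
two bytes long. Anything emitted into memory after the pass has run, such
as JIT output, keeps its original syscall instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustration only (hypothetical helper, not from the patch): replace
 * each 2-byte syscall (0f 05) or sysenter (0f 34) encoding in a buffer
 * with call *%rax (ff d0). A raw byte scan like this can also match
 * inside immediates or data, so a real implementation must disassemble.
 */
static size_t rewrite_syscalls(uint8_t *code, size_t len)
{
	size_t i, patched = 0;

	assert(code != NULL);

	for (i = 0; i + 1 < len; i++) {
		if (code[i] == 0x0f &&
		    (code[i + 1] == 0x05 || code[i + 1] == 0x34)) {
			code[i] = 0xff;
			code[i + 1] = 0xd0;
			patched++;
			i++;	/* skip the second byte of the patched pair */
		}
	}

	return patched;	/* number of instructions rewritten */
}
```

If this runs once at exec time over the loaded text, a buffer filled in
later by a JIT is never passed to it and still traps into the host kernel.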

Perhaps UML could additionally install a seccomp filter or something on
itself while running a userspace program? Hmm.


> +/**
> + * setup trampoline code for syscall hooks
> + *
> + * the trampoline code guides to call hooked function, __kernel_vsyscall
> + * in this case, via nop slides at the memory address zero (thus, zpoline).
> + *
> + * loaded binary by exec(2) is translated to call the function.
> + */
> +static int __init setup_zpoline_trampoline(void)
> +{
> +	int i, ret;
> +	int ptr;
> +
> +	/* zpoline: map area of trampoline code started from addr 0x0 */
> +	__zpoline_start = 0x0;
> +
> +	ret = os_map_memory((void *) 0, -1, 0, 0x1000, 1, 1, 1);

Use (UM_)PAGE_SIZE here instead of the hard-coded 0x1000?

> +	/**
> +	 * FIXME: shit red zone area to properly handle the case

"shift"? :)

> +	 */
> +
> +	/**
> +	 * put code for jumping to __kernel_vsyscall.
> +	 *
> +	 * here we embed the following code.
> +	 *
> +	 * movabs [$addr],%r11
> +	 * jmpq   *%r11
> +	 *
> +	 */
> +	ptr = NR_syscalls;
> +	/* 49 bb [64-bit addr (8-byte)]    movabs [64-bit addr (8-byte)],%r11 */
> +	__zpoline_start[ptr++] = 0x49;
> +	__zpoline_start[ptr++] = 0xbb;
> +	__zpoline_start[ptr++] = ((uint64_t)
> +				  __kernel_vsyscall >> (8 * 0)) & 0xff;

The "& 0xff" seems pointless when storing into a u8 array?
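
A small userspace sketch of what I mean, under some assumptions: the
buffer is a plain uint8_t array rather than the page at address 0,
build_trampoline() and TRAMPOLINE_SIZE are made-up names, and the nop fill
of the first nr_syscalls bytes is my reading of the "nop slides" comment,
not something shown in the quoted hunk. The store into a uint8_t element
already truncates to the low 8 bits, so the loop needs no mask. The
opcodes are 49 bb imm64 for the movabs into %r11 and 41 ff e3 for
jmpq *%r11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRAMPOLINE_SIZE 4096	/* one page; stand-in for (UM_)PAGE_SIZE */

/*
 * Hypothetical helper: fill a nop slide of nr_syscalls bytes, then emit
 *   movabs $target,%r11 ; jmpq *%r11
 * Returns the offset just past the emitted code.
 */
static size_t build_trampoline(uint8_t *page, size_t nr_syscalls,
			       uint64_t target)
{
	size_t ptr = nr_syscalls;
	int i;

	/* 2 + 8 bytes for movabs, 3 bytes for the indirect jump */
	assert(nr_syscalls + 13 <= TRAMPOLINE_SIZE);

	memset(page, 0x90, nr_syscalls);	/* nop slide */

	page[ptr++] = 0x49;			/* REX.W + REX.B */
	page[ptr++] = 0xbb;			/* mov imm64 into r11 */
	for (i = 0; i < 8; i++)
		page[ptr++] = target >> (8 * i);	/* implicit u8 truncation */

	page[ptr++] = 0x41;			/* REX.B */
	page[ptr++] = 0xff;			/* jmp r/m64 */
	page[ptr++] = 0xe3;			/* modrm: r11 */

	return ptr;
}
```

A loop like this would also replace the eight open-coded shift lines.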

> +	/* permission: XOM (PROT_EXEC only) */
> +	ret = os_protect_memory(0, 0x1000, 0, 0, 1);

Same here: (UM_)PAGE_SIZE instead of 0x1000?

johannes


