[PATCH] kernel: introduce prctl(PR_LOG_UACCESS)

David Hildenbrand david at redhat.com
Wed Sep 22 10:46:47 PDT 2021


On 22.09.21 08:18, Peter Collingbourne wrote:
> This patch introduces a kernel feature known as uaccess logging.
> With uaccess logging, the userspace program passes the address and size
> of a so-called uaccess buffer to the kernel via a prctl(). The prctl()
> is a request for the kernel to log any uaccesses made during the next
> syscall to the uaccess buffer. When the next syscall returns, the address
> one past the end of the logged uaccess buffer entries is written to the
> location specified by the third argument to the prctl(). In this way,
> the userspace program may enumerate the uaccesses logged to the uaccess
> buffer to determine which accesses occurred.
> 
> Uaccess logging has several use cases focused around bug detection
> tools:
> 
> 1) Userspace memory safety tools such as ASan, MSan, HWASan and tools
>     making use of the ARM Memory Tagging Extension (MTE) need to monitor
>     all memory accesses in a program so that they can detect memory
>     errors. For accesses made purely in userspace, this is achieved
>     via compiler instrumentation, or for MTE, via direct hardware
>     support. However, accesses made by the kernel on behalf of the
>     user program via syscalls (i.e. uaccesses) are invisible to these
>     tools. With MTE there is some level of error detection possible in
>     the kernel (in synchronous mode, bad accesses generally result in
>     returning -EFAULT from the syscall), but by the time we get back to
>     userspace we've lost the information about the address and size of the
>     failed access, which makes it harder to produce a useful error report.
> 
>     With the current versions of the sanitizers, we address this by
>     interposing the libc syscall stubs with a wrapper that checks the
>     memory based on what we believe the uaccesses will be. However, this
>     creates a maintenance burden: each syscall must be annotated with
>     its uaccesses in order to be recognized by the sanitizer, and these
>     annotations must be continuously updated as the kernel changes. This
>     is especially burdensome for syscalls such as ioctl(2) which have a
>     large surface area of possible uaccesses.
> 
> 2) Verifying the validity of kernel accesses. This can be achieved in
>     conjunction with the userspace memory safety tools mentioned in (1).
>     Even a sanitizer whose syscall wrappers have complete knowledge of
>     the kernel's intended API may vary from the kernel's actual uaccesses
>     due to kernel bugs. A sanitizer with knowledge of the kernel's actual
>     uaccesses may produce more accurate error reports that reveal such
>     bugs.
> 
>     An example of such a bug, which was found by an earlier version of this
>     patch together with a prototype client of the API in HWASan, was fixed
>     by commit d0efb16294d1 ("net: don't unconditionally copy_from_user
>     a struct ifreq for socket ioctls"). Although this bug turned out to
>     be relatively harmless, it was a bug nonetheless and it's always possible
>     that more serious bugs of this sort may be introduced in the future.
> 
> 3) Kernel fuzzing. We may use the list of reported kernel accesses to
>     guide a kernel fuzzing tool such as syzkaller (so that it knows which
>     parts of user memory to fuzz), as an alternative to providing the tool
>     with a list of syscalls and their uaccesses (which again thanks to
>     (2) may not be accurate).
> 
> All signals except SIGKILL and SIGSTOP are masked for the interval
> between the prctl() and the next syscall in order to prevent handlers
> for intervening asynchronous signals from issuing syscalls that may
> cause uaccesses from the wrong syscall to be logged.

Stupid question: can this be exploited from user space to effectively 
disable SIGKILL for a long time ... and do we care?

Like, the application allocates a bunch of memory, issues the prctl() 
and spins in user space. What would happen if the OOM killer selects 
this task as a target and does a do_send_sig_info(SIGKILL, 
SEND_SIG_PRIV, ...) ?

-- 
Thanks,

David / dhildenb



