CPU stalls when handling PAN emulation?

Tue Jan 16 03:04:15 PST 2024

On Fri, 12 Jan 2024 at 01:11, Kees Cook <keescook at chromium.org> wrote:
>
> Hi,
>
> Mark noticed that LKDTM's EXEC_USERSPACE test[1] (which trips the PAN
> emulation, CONFIG_CPU_SW_DOMAIN_PAN=y, on 32-bit arm) appears to be
> creating a situation that leads to a CPU stall. make_task_dead() reports:
>
>         note: ...[NNN] exited with preempt_count 1
>
> which isn't seen with other Oopses (like EXEC_RODATA). (They do also
> both report: "note: ...[NNN] exited with irqs disabled" too, but this
> seems survivable.) I note that the ACCESS_USERSPACE test does _not_
> have this problem.
>
> ACCESS_USERSPACE (survivable) starts with:
>
>         lkdtm: Performing direct entry ACCESS_USERSPACE
>         lkdtm: attempting bad read at 76f44000
>         8<--- cut here ---
>         Unhandled fault: page domain fault (0x01b) at 0x76f44000
>
> EXEC_USERSPACE (leads to CPU stall) starts with:
>
>         lkdtm: Performing direct entry EXEC_USERSPACE
>         lkdtm: attempting ok execution at 8075bf18
>         lkdtm: attempting bad execution at 76f6f000
>         Unhandled prefetch abort: page domain fault (0x01b) at 0x76f6f000
>         8<--- cut here ---
>         Unhandled fault: page domain fault (0x01b) at 0x76f6f000
>
> So they're both getting caught by the Domain stuff, but there looks to
> be a second fault for EXEC_USERSPACE. (more below)
>
> For the CPU stall to appear, there needs to be (at least) a second Oops.
> As an example, if I run EXEC_USERSPACE and then EXEC_RODATA, the latter
> stops sending to the console very quickly, reporting only the very start
> of the Oops:
>
>         lkdtm: Performing direct entry EXEC_USERSPACE
>         lkdtm: attempting ok execution at 8075bf18
>         lkdtm: attempting bad execution at 76f10000
>         Unhandled prefetch abort: page domain fault (0x01b) at 0x76f10000
>         8<--- cut here ---
>         Unhandled fault: page domain fault (0x01b) at 0x76f10000
>         [76f10000] *pgd=44f6e835, *pte=469a455f, *ppte=469a4c7e
>         Internal error: : 1b [#1] SMP ARM
>         Modules linked in:
>         ...
>         Stack: (0xf0959d10 to 0xf095a000)
>         ...
>          copy_from_kernel_nofault from is_valid_bugaddr+0x40/0x84


This appears to be the BUG() handling trying to load the opcode at the
faulting PC. That PC is a user space address here, and the kernel is
not permitted to read from user space either, so the load triggers an
additional domain fault.

Something like the below should work around it by rejecting user space
addresses before the opcode is read.

--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -402,6 +402,9 @@ int is_valid_bugaddr(unsigned long pc)
        u32 insn = __opcode_to_mem_arm(BUG_INSTR_VALUE);
 #endif

+       if (pc < TASK_SIZE)
+               return 0;
+
        if (get_kernel_nofault(bkpt, (void *)pc))
                return 0;
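
For illustration only, here is a rough user space model of what the
added check decides. It is not kernel code: MODEL_TASK_SIZE and
pc_may_hold_bug_insn() are made-up names, and the 0x7f000000 value is an
assumption (what a 2G/2G split would typically give); the real TASK_SIZE
depends on the kernel configuration.

```c
/* Assumption: stand-in for TASK_SIZE; the real value is config dependent. */
#define MODEL_TASK_SIZE 0x7f000000UL

/*
 * Hypothetical helper modelling the guard added to is_valid_bugaddr().
 * Returns 0 when pc is a user space address, meaning "not a BUG()
 * address, do not try to read the opcode". Returns 1 when pc is a
 * kernel address, where the real code would go on to read and compare
 * the opcode.
 */
static int pc_may_hold_bug_insn(unsigned long pc)
{
	if (pc < MODEL_TASK_SIZE)
		return 0;	/* user address: skip the read, no second fault */

	return 1;		/* kernel address: opcode read would follow */
}
```

With the addresses from the log above, 0x76f10000 (the "bad execution"
target) is rejected up front, while 0x8075bf18 (the "ok execution"
target) would still go through the normal opcode comparison.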