[PATCH] ARM: mm: Disregard user space addresses in BUG() address check

Russell King (Oracle) linux at armlinux.org.uk
Fri Jan 19 04:14:25 PST 2024


On Fri, Jan 19, 2024 at 12:52:21PM +0100, Ard Biesheuvel wrote:
> On Thu, 18 Jan 2024 at 21:15, Kees Cook <keescook at chromium.org> wrote:
> >
> > On Wed, Jan 17, 2024 at 04:07:34PM +0100, Ard Biesheuvel wrote:
> > > From: Ard Biesheuvel <ardb at kernel.org>
> > >
> > > is_valid_bugaddr() dereferences the faulting PC to fetch the instruction
> > > that triggered the fault, to decide whether it is a BRK instruction used
> > > to force an exception. This is used by the BUG() infrastructure to keep
> > > the handling logic (which should never execute) separate from the code
> > > that normally runs.
> > >
> > > This dereference may attempt to access user memory if the faulting PC
> > > happens to contain a user address. One way this might happen is when
> > > the kernel is tricked into executing from user space while PAN
> > > protections (Privileged Access Never) are in effect: the instruction
> > > fetch will trigger a prefetch abort, the handling of which involves a
> > > check whether the instruction that caused it is a BRK, requiring a
> > > load from the same address. This load is privileged too, and so it will
> > > trigger another exception, which we fail to recover from.
> > >
> > > Given that BRK instructions tied to BUG() handling can only appear in
> > > kernel code, let's check first that the PC actually points into kernel
> > > memory.
> > >
> > > Cc: Kees Cook <keescook at chromium.org>
> > > Cc: Russell King <rmk+kernel at armlinux.org.uk>
> > > Cc: Mark Brown <broonie at kernel.org>
> > > Cc: Zhen Lei <thunder.leizhen at huawei.com>
> > > Cc: Linus Walleij <linus.walleij at linaro.org>
> > > Link: https://lkml.kernel.org/r/202401111544.18EBB6AA%40keescook
> > > Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
> > > ---
> > >  arch/arm/kernel/traps.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> > > index 3bad79db5d6e..f342bd6b2a5d 100644
> > > --- a/arch/arm/kernel/traps.c
> > > +++ b/arch/arm/kernel/traps.c
> > > @@ -402,6 +402,9 @@ int is_valid_bugaddr(unsigned long pc)
> > >       u32 insn = __opcode_to_mem_arm(BUG_INSTR_VALUE);
> > >  #endif
> > >
> > > +     if (pc < TASK_SIZE)
> > > +             return 0;
> > > +
> > >       if (get_kernel_nofault(bkpt, (void *)pc))
> > >               return 0;
> > >
> >
> > Okay, I've finally had a chance to test this and it doesn't fix it
> > for me. I still see the double domain fault, and I still lose a CPU.
> > I patched in a pr_info() to show that the "pc < TASK_SIZE" was hitting
> > (which it is) so I see:
> >
> >
> > kdtm: Performing direct entry EXEC_USERSPACE
> > lkdtm: attempting ok execution at 80761b4c
> > lkdtm: attempting bad execution at 76ff4000
> > Unhandled prefetch abort: page domain fault (0x01b) at 0x76ff4000
> > Skipping is_valid_bugaddr(76ff4000)
> > Internal error: : 1b [#12] SMP ARM
> > Modules linked in:
> > ...
> > Process cat (pid: 1301, stack limit = 0x3c8ec418)
> > Stack: (0xf0975e58 to 0xf0976000)
> > ...
> > Backtrace:
> >  lkdtm_EXEC_USERSPACE from lkdtm_do_action+0x2c/0x4c
> > ...
> > Exception stack(0xf0975fa8 to 0xf0975ff0)
> > ...
> > 8<--- cut here ---
> > Unhandled fault: page domain fault (0x01b) at 0x76ff3ff0
> > [76ff3ff0] *pgd=44baa835, *pte=00000000, *ppte=00000000
> > Internal error: : 1b [#13] SMP ARM
> > Modules linked in:
> > ...
> > Process cat (pid: 1301, stack limit = 0x3c8ec418)
> > Stack: (0xf0975ce0 to 0xf0976000)
> > ...
> > Backtrace:
> >  copy_from_kernel_nofault from dump_instr+0x1cc/0x208
> >  r7:f0975d27 r6:80ddf3d8 r5:f0975e08 r4:fffffffc
> >  dump_instr from die+0x294/0x2f0
> 
> This is indeed a similar but different issue (as you conclude below).
> dump_instr() attempts to dereference PC to generate the 'Code: xxxx
> xxxx' line with the opcodes leading up to the fault, and this triggers
> a PAN fault for the same reason.

Both of these have faulted in copy_from_kernel_nofault(), so rather than
adding these tests all over the place, wouldn't it be better to
implement copy_from_kernel_nofault_allowed() so that we verify that the
source is actually something we can copy from?

E.g.

bool copy_from_kernel_nofault_allowed(const void *src, size_t size)
{
	unsigned long srcv = (unsigned long)src;

	return src >= TASK_SIZE && src + size - 1 >= TASK_SIZE;
}

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!



More information about the linux-arm-kernel mailing list