[next] arm: Internal error: Oops: 5 PC is at __read_once_word_nocheck

Ard Biesheuvel ardb at kernel.org
Wed Mar 9 07:10:20 PST 2022


On Wed, 9 Mar 2022 at 16:07, Russell King (Oracle)
<linux at armlinux.org.uk> wrote:
>
> On Wed, Mar 09, 2022 at 03:57:32PM +0100, Ard Biesheuvel wrote:
> > On Wed, 9 Mar 2022 at 15:44, Naresh Kamboju <naresh.kamboju at linaro.org> wrote:
> > >
> > > On Wed, 9 Mar 2022 at 19:37, Naresh Kamboju <naresh.kamboju at linaro.org> wrote:
> > > >
> > > > On Wed, 9 Mar 2022 at 16:16, Ard Biesheuvel <ardb at kernel.org> wrote:
> > > > >
> > > > > On Wed, 9 Mar 2022 at 11:37, Russell King (Oracle)
> > > > > <linux at armlinux.org.uk> wrote:
> > > > > >
> > > > > > On Wed, Mar 09, 2022 at 03:18:12PM +0530, Naresh Kamboju wrote:
> > > > > > > > While booting linux next-20220308 on BeagleBoard-X15 and qemu arm, the
> > > > > > > > following kernel crash was reported with a CONFIG_KASAN enabled build [1] & [2].
> > > > > >
> > > > > > The unwinder is currently broken in linux-next. Please try reverting
> > > > > > 532319b9c418 ("ARM: unwind: disregard unwind info before stack frame is
> > > > > > set up")
> > >
> > > I reverted the suggested commit and rebuilt, but boot still failed with the
> > > reported kernel crash [1].
> > >
> > > - Naresh
> > >
> >
> > Thanks Naresh,
> >
> > This looks like it might be related to the issue Russell just sent a fix for:
> > https://lore.kernel.org/linux-arm-kernel/CAMj1kXEqp2UmsyUe1eWErtpMk3dGEFZyyno3nqydC_ML0bwTLw@mail.gmail.com/T/#t
> >
> > Could you please try that?
>
> Well, we unwound until:
>
>  __irq_svc from migrate_disable+0x0/0x70
>
> and then crashed - and the key thing there is that we're at the start
> of migrate_disable() when we took an interrupt.
>
> For some reason, this triggers an access to address 0x10, which faults.
> We then try unwinding again, and successfully unwind all the way back
> to the same point (the line above) which then causes the unwinder to
> again access address 0x10, and the cycle repeats with the stack
> growing bigger and bigger.
>
> I'd suggest also testing without the revert but with my patch.
>

Indeed.

And as I suggested the other day, maybe it wouldn't be so bad to
harden the vsp dereference, like below:

--- a/arch/arm/kernel/unwind.c
+++ b/arch/arm/kernel/unwind.c
@@ -27,6 +27,7 @@
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
+#include <linux/uaccess.h>
 #include <linux/list.h>

 #include <asm/sections.h>
@@ -236,10 +237,11 @@ static int unwind_pop_register(struct unwind_ctrl_block *ctrl,
                if (*vsp >= (unsigned long *)ctrl->sp_high)
                        return -URC_FAILURE;

-       /* Use READ_ONCE_NOCHECK here to avoid this memory access
-        * from being tracked by KASAN.
+       /* Use get_kernel_nofault() here to avoid this memory access
+        * from causing a fatal fault, and from being tracked by KASAN.
         */
-       ctrl->vrs[reg] = READ_ONCE_NOCHECK(*(*vsp));
+       if (get_kernel_nofault(ctrl->vrs[reg], *vsp))
+               return -URC_FAILURE;
        if (reg == 14)
                ctrl->lr_addr = *vsp;
        (*vsp)++;
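
To illustrate the intended behaviour outside the kernel, here is a rough
userspace sketch, not the actual unwinder code. It assumes a hypothetical
read_nofault() helper standing in for get_kernel_nofault(), and a cut-down
struct ctrl_block standing in for struct unwind_ctrl_block; "readable" is
modelled simply as "inside [sp_low, sp_high)". The point is only that a
bogus vsp such as 0x10 makes the pop return a failure code instead of
faulting and re-entering the unwinder.

```c
#include <stdint.h>

#define URC_OK      0
#define URC_FAILURE 9

/* Hypothetical, cut-down stand-in for the kernel's unwind_ctrl_block. */
struct ctrl_block {
	unsigned long vrs[16];        /* virtual register set */
	uintptr_t sp_low;             /* lowest readable stack address */
	uintptr_t sp_high;            /* one past the highest readable address */
	const unsigned long *lr_addr; /* where lr was loaded from */
};

/*
 * Hypothetical stand-in for get_kernel_nofault(): returns 0 and stores the
 * value on success, nonzero if the address is not readable. Here an address
 * counts as readable only if a whole word fits inside [sp_low, sp_high).
 */
int read_nofault(const struct ctrl_block *ctrl, unsigned long *dst,
		 const unsigned long *src)
{
	uintptr_t addr = (uintptr_t)src;

	if (addr < ctrl->sp_low || addr >= ctrl->sp_high ||
	    ctrl->sp_high - addr < sizeof(*src))
		return -1;
	*dst = *src;
	return 0;
}

/* Pop one register from the virtual stack pointer, failing softly. */
int pop_register(struct ctrl_block *ctrl, const unsigned long **vsp,
		 unsigned int reg)
{
	if (read_nofault(ctrl, &ctrl->vrs[reg], *vsp))
		return -URC_FAILURE;  /* give up instead of faulting */
	if (reg == 14)
		ctrl->lr_addr = *vsp;
	(*vsp)++;
	return URC_OK;
}
```

With vsp pointing into the modelled stack, the pop succeeds and advances
vsp; with vsp set to 0x10 it returns -URC_FAILURE and leaves vsp untouched,
so the caller can stop unwinding cleanly.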


