RISC-V uprobe bug (Was: Re: WARNING: CPU: 3 PID: 261 at kernel/bpf/memalloc.c:342)

Nam Cao namcaov at gmail.com
Sun Aug 27 02:39:00 PDT 2023


On Sun, Aug 27, 2023 at 11:04:34AM +0200, Björn Töpel wrote:
> Nam Cao <namcaov at gmail.com> writes:
> 
> > On Sun, Aug 27, 2023 at 10:11:25AM +0200, Björn Töpel wrote:
> >> The default implementation of is_trap_insn() which RISC-V is using calls
> >> is_swbp_insn(), which is doing what your patch does. Your patch does not
> >> address the issue.
> >
> > is_swbp_insn() does this:
> >
> >         #ifdef CONFIG_RISCV_ISA_C
> >                 return (*insn & 0xffff) == UPROBE_SWBP_INSN;
> >         #else
> >                 return *insn == UPROBE_SWBP_INSN;
> >         #endif
> >
> > ...so it doesn't even check for 32-bit ebreak if C extension is on. My patch
> > is not the same.
> 
> Ah, was too quick.
> 
> AFAIU uprobes *always* uses c.ebreak when CONFIG_RISCV_ISA_C is set, and
> ebreak otherwise. That's the reason is_swbp_insn() is implemented like
> that.

That's what I understand too.

> If that's not the case, there's a another bug, that your patches
> addresses.

I think it's a bug regardless. is_trap_insn() is used by uprobes to figure out
if there is an instruction that generates trap exception, not just instructions
that are "SWBP". The reason is because when there is a trap, but uprobes doesn't
see a probe installed here, it needs is_trap_insn() to figure out if the trap
is generated by ebreak from something else, or because the probe is just removed.
In the latter case, uprobes will return back, because probe has already been removed,
so it should be safe to do so. That's why I think the incorrect is_swbp_insn()
would cause a hang, because uprobes incorrectly thinks there is no ebreak there,
so it should be okay to go back, but there actually is.

So, from my understanding, if uprobes encounter a 32-bit ebreak for any reason,
the kernel would hang. I think your patch is a great addition nonetheless, but I
am guessing that it only masks the problem by preventing uprobes from seeing the
32-bit ebreak in the specific test, not really solve it. So, if there is a 32-bit
ebreak in userspace, the bug still causes the kernel to hang.

I am still quite confident of my logic, so I would be very suprised if my fix
doesn't solve the reported hang. Do you mind testing my patch? My potato of a
laptop unfortunately cannot run the test :(

Best regards,
Nam



More information about the linux-riscv mailing list