[PATCH v3 1/1] kernel: kprobes: fix cur_kprobe corruption during re-entrant kprobe_busy_begin() calls

Khaja Hussain Shaik Khaji khaja.khaji at oss.qualcomm.com
Mon Apr 20 00:05:52 PDT 2026


On Mon, Mar 02, 2026 at 01:38:35PM +0000, Mark Rutland wrote:
> That suggests that something is going wrong *within* your entry handler
> that causes IRQs to be unmasked unexpectedly.
>
> Please can we find out *exactly* where IRQs get unmasked for the first
> time?

Thanks for the pointer -- that was the right direction to look.

You are correct. I confirmed that arm64_enter_el1_dbg() does NOT re-enable
IRQs; it only manages lockdep and context-tracking state. The IRQ unmask
originates entirely within our kretprobe entry_handler itself.

The exact call chain is:

pre_handler_kretprobe()
  entry_dwc3_gadget_pullup()          <- kretprobe entry_handler
    dwc3_msm_notify_event()
      _raw_spin_unlock_irq()          <- first IRQ unmask (spin_unlock_irq)

dwc3_msm_notify_event() is called from within the entry_handler while a
spinlock acquired with spin_lock_irq() is held. spin_lock_irq() disables
IRQs on lock, and spin_unlock_irq() (_raw_spin_unlock_irq) re-enables them
unconditionally on unlock, regardless of their state before the lock was
taken. That unlock is the first point at which IRQs become unmasked.
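
To illustrate why the unlock variant matters, here is a minimal userspace
model (not kernel code). A single flag stands in for the CPU's IRQ mask, and
all model_* and handler_* names are hypothetical. It assumes the handler is
entered with IRQs masked, as was the case in our trace.

```c
#include <stdbool.h>

/* Userspace model only: one flag stands in for the CPU's IRQ mask. */
static bool irqs_enabled = true;

/* Models spin_lock_irq(): masks IRQs without recording the prior state. */
static void model_lock_irq(void)
{
	irqs_enabled = false;
}

/* Models spin_unlock_irq(): unmasks IRQs unconditionally. */
static void model_unlock_irq(void)
{
	irqs_enabled = true;
}

/* Models spin_lock_irqsave(): records the prior state, then masks IRQs. */
static bool model_lock_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Models spin_unlock_irqrestore(): restores the recorded state. */
static void model_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;
}

/* Handler entered with IRQs masked, using the _irq lock/unlock pair. */
static bool handler_with_irq_pair(void)
{
	irqs_enabled = false;		/* state on handler entry */
	model_lock_irq();
	model_unlock_irq();
	return irqs_enabled;		/* IRQ state seen by the rest of the handler */
}

/* Same handler, using the _irqsave/_irqrestore pair instead. */
static bool handler_with_irqsave_pair(void)
{
	bool flags;

	irqs_enabled = false;		/* state on handler entry */
	flags = model_lock_irqsave();
	model_unlock_irqrestore(flags);
	return irqs_enabled;		/* IRQ state seen by the rest of the handler */
}
```

In this model the _irq pair leaves IRQs unmasked for the remainder of the
handler, while the _irqsave pair leaves them masked as they were on entry.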

From that point, a hardware IRQ fires, softirq processing runs, and
kprobe_flush_task() -> kprobe_busy_begin()/end() is invoked while the
kretprobe entry_handler is still on the stack -- triggering the cur_kprobe
corruption described in the patch.
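
The sequence can be sketched with a small userspace model (again, not kernel
code). It assumes, based on my reading of kernel/kprobes.c, that
kprobe_busy_begin() points the per-CPU current kprobe at a dummy kprobe_busy
and that kprobe_busy_end() resets it to NULL without restoring the previous
value. The model_* names and the plain global standing in for the per-CPU
variable are hypothetical.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's struct kprobe. */
struct kprobe {
	const char *name;
};

static struct kprobe kprobe_busy = { "kprobe_busy" };

/* Stands in for the per-CPU current_kprobe pointer. */
static struct kprobe *current_kprobe;

/* Assumed behaviour: overwrite the current kprobe with the dummy. */
static void model_kprobe_busy_begin(void)
{
	current_kprobe = &kprobe_busy;
}

/* Assumed behaviour: reset to NULL, not to the previous value. */
static void model_kprobe_busy_end(void)
{
	current_kprobe = NULL;
}

/* Models the IRQ path that ends up in kprobe_flush_task(). */
static void model_irq_flush_task(void)
{
	model_kprobe_busy_begin();
	/* ... per-task kretprobe instance cleanup would happen here ... */
	model_kprobe_busy_end();
}

/*
 * Models one kretprobe hit. Returns 1 if current_kprobe still points at
 * the probe when the entry handler finishes, 0 if it was clobbered.
 */
static int model_entry_handler_hit(struct kprobe *p, int irq_fires)
{
	current_kprobe = p;		/* set up by the probe hit path */

	if (irq_fires)			/* only possible once IRQs are unmasked */
		model_irq_flush_task();

	return current_kprobe == p;
}
```

With irq_fires == 0 the function returns 1; with irq_fires != 0 it returns 0,
because the nested begin/end pair leaves current_kprobe as NULL while the
outer handler is still running.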

Regarding documentation: the kprobes documentation in
Documentation/trace/kprobes.rst (section "Kretprobe entry-handler") does
not mention any restriction on enabling IRQs within an entry_handler. The
only constraint documented is:

  "Probe handlers are run with preemption disabled or interrupt disabled,
   which depends on the architecture and optimization state."

This is stated for kprobe/kretprobe handlers in general, but there is no
explicit warning that an entry_handler must not re-enable IRQs on arm64.
Given that entry_handlers are user-supplied callbacks, a note here would
help future users avoid this class of bug.

As for the fix itself: we plan to carry this as a downstream patch for our
platform. We are not planning to push it upstream at this time.

Thanks again for the detailed review.

Khaja


