[PATCH V3 7/7] arm64/hw_breakpoint: Enable FEAT_Debugv8p9

Rob Herring robh at kernel.org
Fri Apr 10 12:55:55 PDT 2026


On Thu, Apr 9, 2026 at 5:52 AM Mark Rutland <mark.rutland at arm.com> wrote:
>
> On Thu, Apr 02, 2026 at 12:46:56PM -0500, Rob Herring wrote:
> > On Thu, Apr 2, 2026 at 5:37 AM Mark Rutland <mark.rutland at arm.com> wrote:
> > >
> > > On Tue, Mar 31, 2026 at 05:58:00PM -0500, Rob Herring wrote:
> > > > On Mon, Dec 16, 2024 at 10:58:29AM +0000, Mark Rutland wrote:
> > > > > On Mon, Dec 16, 2024 at 09:38:31AM +0530, Anshuman Khandual wrote:
> > > That said, the use of 'bp_per_reg' looks suspect given their
> > > arch_install_hw_breakpoint() and arch_uninstall_hw_breakpoint() modify
> > > that non-atomically.
> >
> > You don't believe the comment saying counter->ctx->lock is held?
>
> Sorry, my concern here was that hw_breakpoint_handler() (which cannot
> hold the lock) consumes bp_per_reg[], and could race with the non-atomic
> modification in arch_install_hw_breakpoint() or
> arch_uninstall_hw_breakpoint().
>
> I've sent a more elaborate mail to x86 folk, with that and another issue
> caused by taking a breakpoint under arch_uninstall_hw_breakpoint():
>
>   https://lore.kernel.org/lkml/adZWmPW8S9Y2pwkv@J2N7QTR9R3.cambridge.arm.com/
>
> I think we have a similar latent issue where we can take an breakpoint
> or watchpoint ad infinitum, described in more detail at the end of this
> mail.
>
> [...]
>
> > > > > | What prevents a race with an exception handler? e.g.
> > > > > |
> > > > > | * Does the structure of the code prevent that somehow?
> > > >
> > > > If you can't set a breakpoint/watchpoint in NOKPROBE_SYMBOL() annotated
> > > > code, you can't race.
> > >
> > > As above, I agree (with caveats), but I couldn't spot where this is
> > > enforced.
> > >
> > > > However, there's no such annotation for data. It looks like the kernel
> > > > policy is "don't do that" or disable all breakpoints/watchpoints.
> > >
> > > If we have to transiently disable watchpoints/breakpoints when
> > > manipulating the relevant HW registers, that sounds fine to me.
> >
> > For wp/bp_on_reg, the ordering is 'data access, h/w accesses'. I think
> > we just need a barrier to enforce that ordering so the data access
> > (and then watchpoint) don't trigger in the middle of the h/w accesses.
>
> I assume that by 'h/w accesses' you mean MSRs to the system registers
> controlling breakpoints/watchpoints. Ordering-wise, I don't believe
> memory barriers are necessary here (explain in more detail below).
> However, I also think there's a latent issue here that might bite us
> with the new banking, described at the end of this mail.
>
> Both breakpoint and watchpoint exceptions are synchronous, meanning that
> they can only be taken from the specific instruction that triggered
> them.  However, updates to the watchpoint control registers *do* need a
> context synchronization event before they're guarnateed to take effect.
>
> For a sequence:
>
>     // Initially:
>     // - MDSCR, MDCR, DAIF.D permit debug exceptions at CurrentEL
>     // - No watchpoints enabled
>
>     0x000:  LDR <val>, [<addr>]
>     0x004:  MSR DBGWVR<n>_EL1, <addr>
>     0x008:  MSR DBGWCR<n>_EL1, <configure_and_enable>
>     0x010:  LDR <val>, [<addr>]
>     0x014:  ISB
>     0x018:  LDR <val>, [<addr>]
>
> ... we know:
>
>     (a) The LDR at 0x000 *cannot* trigger the watchpoint.

Why not? Can't the LDR complete after the MSR? Is ordering ensured
between those? Surely the watchpoint triggers on completion of the
load and that wouldn't stall the PE from doing the MSR(s)?

>     (b) The LDR at 0x010 *might* trigger the matchpoint.
>     (c) The LDR at 0x018 *must* trigger the watchpoint.
>
> For C code, we can enforce this order with barrier(), e.g.
>
>         val = *addr;
>         barrier();
>         write_sysreg(addr, DBGWVR<n>_EL1);
>         write_sysreg(configure_and_enable, DBGWCR<n>_EL1);
>         isb();
>
> ... where the compiler cannot re-order the memory access (or
> write_sysreg(), or isb()) across the barrier(), and as isb() has a
> memory clobber, the same is true for isb().
>
> Likewise, for the inverse sequence:
>
>     // Initially:
>     // - MDSCR, MDCR, DAIF.D permit debug exceptions at CurrentEL
>     // - Watchpoint configured and enabled for <addr>
>
>     0x100:  LDR <val>, [<addr>]
>     0x104:  MSR DBGWCR<n>_EL1, <disable>
>     0x108:  LDR <val>, [<addr>]
>     0x110:  ISB
>     0x114:  LDR <val>, [<addr>]
>
> ... we know:
>
>     (a) The LDR at 0x100 *must* trigger the watchpoint.
>     (b) The LDR at 0x108 *might* trigger the watchpoint.
>     (c) The LDR at 0x114 *cannot* trigger the watchpoint.
>
> > Any guidance on the flavor of dsb here? (And is there any guarantee
> > that the access is visible to the watchpoint h/w after a dsb
> > completes?)
>
> Hopefully the above was sufficient?
>
> As mentioned above, I think we have a latent issue where we can take a
> breakpoint or watchpoint under arch_uninstall_hw_breakpoint(), where we
> have:
>
>     arch_uninstall_hw_breakpoint(bp) {
>         ...
>         hw_breakpoint_control(bp, HW_BREAKPOINT_UNINSTALL) {
>             ...
>             hw_breakpoint_slot_setup(slots, max_slots, bp, HW_BREAKPOINT_UNINSTALL) {
>                  ...
>                  *slot = NULL;
>                  ...
>             }
>             ...
>             write_wb_reg(ctrl_reg, i, 0) {
>                 ...
>                 write_sysreg(0, ...);
>                 isb();
>                 ...
>             }
>         }
>     }
>
> The HW breakpoint/watchpoint associated with 'bp' could be triggered
> between setting '*slot' to NULL and the ISB. If that happens, then
> do_breakpoint() won't find 'bp', and will return *without* disabling the
> HW breakpoint or attempting to step.
>
> If that first exception was taken *before* the MSR in write_sysreg(),
> then nothing has changed, and the breakpoint/watchpoint will then be
> taken again ad infinitum.
>
> If that first exception was taken *after* the MSR in write_sysreg(), the
> context synchronization provided by exception entry/return will prevent
> it from being taken again.
>
> Building v6.19 and testing (with pseudo-NMI enabled):
>
> | #  grep write_wb_reg /proc/kallsyms
> | ffff80008004b980 t write_wb_reg
> | # ./perf-6.19 stat -a -C 1 -e 'mem:0xffff80008004b980/4:xk' true

I'll give it a try with v4, but that should be prevented with my
changes to exclude kprobe regions.

Rob



More information about the linux-arm-kernel mailing list