[PATCH] arm64: hw_breakpoint: Handle inexact watchpoint addresses
Pavel Labath
labath at google.com
Fri Sep 16 03:07:39 PDT 2016
Hello Will,
thank you for the review. I will be a bit busy in the next few weeks,
so I don't think I'll be able to submit a revised version of the patch
that quickly. Until then, here I my responses.
On 16 September 2016 at 09:58, Will Deacon <will.deacon at arm.com> wrote:
> Hi Pavel,
>
> On Mon, Sep 12, 2016 at 03:07:24PM +0100, Pavel Labath wrote:
>> Arm64 hardware does not always report a watchpoint hit address that
>> matches one of the watchpoints set. It can also report an address
>> "near" the watchpoint if a single instruction access both watched and
>> unwatched addresses. There is no straight-forward way, short of
>> disassembling the offending instruction, to map that address back to
>> the watchpoint.
>>
>> Previously, when the hardware reported a watchpoint hit on an address
>> that did not match our watchpoint (this happens in case of instructions
>> which access large chunks of memory such as "stp") the process would
>> enter a loop where we would be continually resuming it (because we did
>> not recognise that watchpoint hit) and it would keep hitting the
>> watchpoint again and again. The tracing process would never get
>> notified of the watchpoint hit.
>>
>> This commit fixes the problem by looking at the watchpoints near the
>> address reported by the hardware. If the address does not exactly match
>> one of the watchpoints we have set, it attributes the hit to the
>> nearest watchpoint we have. This heuristic is a bit dodgy, but I don't
>> think we can do much more, given the hardware limitations.
>> I include a kernel selftest which triggers this code.
>
> Thanks for the patch. Comments inline.
>
>> Signed-off-by: Pavel Labath <labath at google.com>
>> ---
>> arch/arm64/kernel/hw_breakpoint.c | 104 +++++++---
>> tools/testing/selftests/breakpoints/Makefile | 5 +-
>> .../selftests/breakpoints/breakpoint_test-arm.c | 217 +++++++++++++++++++++
>> 3 files changed, 298 insertions(+), 28 deletions(-)
>> create mode 100644 tools/testing/selftests/breakpoints/breakpoint_test-arm.c
>
> Could you split the test out into a separate patch please?
Sure thing.
>
>> diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
>> index 26a6bf7..d0ebfe6 100644
>> --- a/arch/arm64/kernel/hw_breakpoint.c
>> +++ b/arch/arm64/kernel/hw_breakpoint.c
>> @@ -661,50 +661,80 @@ unlock:
>> }
>> NOKPROBE_SYMBOL(breakpoint_handler);
>>
>> +/*
>> + * Arm64 hardware does not always report a watchpoint hit address that matches
>> + * one of the watchpoints set. It can also report an address "near" the
>> + * watchpoint if a single instruction access both watched and unwatched
>> + * addresses. There is no straight-forward way, short of disassembling the
>> + * offending instruction, to map that address back to the watchpoint. This
>> + * function computes the distance of the memory access from the watchpoint as a
>> + * heuristic for the likelyhood that a given access triggered the watchpoint.
>> + *
>> + * See Section D2.10.5 "Determining the memory location that caused a Watchpoint
>> + * exception" of ARMv8 Architecture Reference Manual for details.
>> + *
>> + * The function returns the distance of the address from the bytes watched by
>> + * the watchpoint. In case of an exact match, it returns 0.
>> + */
>> +static u64 get_distance_from_watchpoint(unsigned long addr, int i,
>> + struct arch_hw_breakpoint *info)
>> +{
>> + u64 val, alignment_mask, wp_low, wp_high;
>> + u32 ctrl_reg;
>> + int first_bit;
>> + struct arch_hw_breakpoint_ctrl ctrl;
>> +
>> + /* AArch32 watchpoints are either 4 or 8 bytes aligned. */
>> + if (is_compat_task()) {
>> + if (info->ctrl.len == ARM_BREAKPOINT_LEN_8)
>> + alignment_mask = 0x7;
>> + else
>> + alignment_mask = 0x3;
>> + } else {
>> + alignment_mask = 0x7;
>> + }
>
> We have identical logic elsewhere in this file, so please stick this into
> a get_watchpoint_alignment_mask function.
I presume you mean the code in arch_validate_hwbkpt_settings. I'll put
that up as a separate patch as well, as it does not seem to be a
completely trivial cut'n'paste job.
>
>> +
>> + val = read_wb_reg(AARCH64_DBG_REG_WVR, i) & ~alignment_mask;
>> +
>> + ctrl_reg = read_wb_reg(AARCH64_DBG_REG_WCR, i);
>> + decode_ctrl_reg(ctrl_reg, &ctrl);
>> + first_bit = ffs(ctrl.len);
>> + if (first_bit == 0)
>> + return -1;
>> + wp_low = val + first_bit - 1;
>> + wp_high = val + fls(ctrl.len) - 1;
>> + if (addr < wp_low)
>> + return wp_low - addr;
>> + else if (addr > wp_high)
>> + return addr - wp_high;
>
> Why do you need to read the control register? We already have the length
> in the arch_hw_breakpoint, and that should be sufficient for the check, no?
I did this because the original code (see below) relied on reading the
register instead of struct arch_hw_breakpoint. However, if you believe
that is ok, I can do it that way. It will certainly make this code
less complicated.
>
>> static int watchpoint_handler(unsigned long addr, unsigned int esr,
>> struct pt_regs *regs)
>> {
>> - int i, step = 0, *kernel_step, access;
>> - u32 ctrl_reg;
>> - u64 val, alignment_mask;
>> + int i, step = 0, *kernel_step, access, closest_match;
>> + u64 min_dist = -1, dist;
>> struct perf_event *wp, **slots;
>> struct debug_info *debug_info;
>> struct arch_hw_breakpoint *info;
>> - struct arch_hw_breakpoint_ctrl ctrl;
>>
>> slots = this_cpu_ptr(wp_on_reg);
>> debug_info = ¤t->thread.debug;
>>
>> + /*
>> + * Find all watchpoints that match the reported address. If no exact
>> + * match is found. Attribute the hit to the closest watchpoint.
>> + */
>> for (i = 0; i < core_num_wrps; ++i) {
>> rcu_read_lock();
>>
>> wp = slots[i];
>> -
>> if (wp == NULL)
>> goto unlock;
>>
>> - info = counter_arch_bp(wp);
>> - /* AArch32 watchpoints are either 4 or 8 bytes aligned. */
>> - if (is_compat_task()) {
>> - if (info->ctrl.len == ARM_BREAKPOINT_LEN_8)
>> - alignment_mask = 0x7;
>> - else
>> - alignment_mask = 0x3;
>> - } else {
>> - alignment_mask = 0x7;
>> - }
>> -
>> - /* Check if the watchpoint value matches. */
>> - val = read_wb_reg(AARCH64_DBG_REG_WVR, i);
>> - if (val != (addr & ~alignment_mask))
>> - goto unlock;
>> -
>> - /* Possible match, check the byte address select to confirm. */
>> - ctrl_reg = read_wb_reg(AARCH64_DBG_REG_WCR, i);
>> - decode_ctrl_reg(ctrl_reg, &ctrl);
>> - if (!((1 << (addr & alignment_mask)) & ctrl.len))
>> - goto unlock;
Original code reading the control register.
>> -
>> /*
>> * Check that the access type matches.
>> * 0 => load, otherwise => store
>> @@ -714,6 +744,17 @@ static int watchpoint_handler(unsigned long addr, unsigned int esr,
>> if (!(access & hw_breakpoint_type(wp)))
>> goto unlock;
>>
>> + info = counter_arch_bp(wp);
>> +
>> + dist = get_distance_from_watchpoint(addr, i, info);
>> + if (dist < min_dist) {
>> + min_dist = dist;
>> + closest_match = i;
>> + }
>> + /* Is this an exact match? */
>> + if (dist != 0)
>> + goto unlock;
>> +
>> info->trigger = addr;
>> perf_bp_event(wp, regs);
>>
>> @@ -724,6 +765,15 @@ static int watchpoint_handler(unsigned long addr, unsigned int esr,
>> unlock:
>> rcu_read_unlock();
>> }
>> + if (min_dist > 0 && min_dist != -1) {
>> + /* No exact match found. */
>> + rcu_read_lock();
>> + wp = slots[closest_match];
>> + info = counter_arch_bp(wp);
>> + info->trigger = addr;
>> + perf_bp_event(wp, regs);
>> + rcu_read_unlock();
>
> Hmm, are you sure it's safe to drop and retake the RCU lock here? If the
> event is destroyed before we retake the RCU lock, won't htis go horribly
> wrong? It's probably best to hold the lock outside the loop :(
I am definitely not sure of that. :) It's not even clear to me what
data does this lock protect. Doing the locking outside the loop would
certainly be safer though.
regards,
pavel
More information about the linux-arm-kernel
mailing list