[RFC PATCH v1] riscv: support for hardware breakpoints/watchpoints

Sat Nov 5 02:10:34 PDT 2022

On Sat, Nov 5, 2022 at 3:07 AM Sergey Matyukevich <geomatsi at gmail.com> wrote:
>
> Hi Andrew,
>
> > > RISC-V Debug specification includes Sdtrig ISA extension. This extension
> > > describes Trigger Module. Triggers can cause a breakpoint exception,
> > > entry into Debug Mode, or a trace action without having to execute a
> > > special instruction. For native debugging triggers can be used to
> > > implement hardware breakpoints and watchpoints.
>
> ... [snip]
>
> > > Despite missing userspace debug, initial implementation can be tested
> > > on QEMU using kernel breakpoints, e.g. see samples/hw_breakpoint and
> > > register_wide_hw_breakpoint. Hardware breakpoints work on upstream QEMU.
> >
> > We should also be able to enable the use of HW breakpoints (and
> > watchpoints, modulo the issue mentioned below) in kdb, right?
>
> Interesting. So far I didn't think about using hw breakpoints in kgdb.
> I took a quick look at riscv and arm64 kgdb code. It looks like there
> is nothing wrong in adding arch-specific implementation of the function
> 'kgdb_arch_set_breakpoint' that will use hw breakpoints if possible.
> Besides it looks like in this case it makes sense to handle KGDB earlier
> than hw breakpoints in do_trap_break.
>
> > > However this is not the case for watchpoints since there is no way to
> > > figure out which watchpoint is triggered. IIUC there are two possible
> > > options for doing this: using 'hit' bit in tdata1 or reading faulting
> > > virtual address from STVAL. QEMU implements neither of them. Current
> > > implementation opts for STVAL. So the following experimental QEMU patch
> > > is required to make watchpoints work:
> > >
> > > :  diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > > :  index 278d163803..8858be7411 100644
> > > :  --- a/target/riscv/cpu_helper.c
> > > :  +++ b/target/riscv/cpu_helper.c
> > > :  @@ -1639,6 +1639,10 @@ void riscv_cpu_do_interrupt(CPUState *cs)
> > > :           case RISCV_EXCP_VIRT_INSTRUCTION_FAULT:
> > > :               tval = env->bins;
> > > :               break;
> > > :  +        case RISCV_EXCP_BREAKPOINT:
> > > :  +            tval = env->badaddr;
> > > :  +            env->badaddr = 0x0;
> > > :  +            break;
> > > :           default:
> > > :               break;
> > > :           }
> > > :  diff --git a/target/riscv/debug.c b/target/riscv/debug.c
> > > :  index 26ea764407..b4d1d566ab 100644
> > > :  --- a/target/riscv/debug.c
> > > :  +++ b/target/riscv/debug.c
> > > :  @@ -560,6 +560,7 @@ void riscv_cpu_debug_excp_handler(CPUState *cs)
> > > :
> > > :       if (cs->watchpoint_hit) {
> > > :           if (cs->watchpoint_hit->flags & BP_CPU) {
> > > :  +            env->badaddr = cs->watchpoint_hit->hitaddr;
> > > :               cs->watchpoint_hit = NULL;
> > > :               do_trigger_action(env, DBG_ACTION_BP);
> > > :           }
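For illustration, the STVAL approach described above could be sketched in user space roughly as follows. This is only a sketch under assumptions: 'struct wp_slot', 'WP_SLOTS' and 'wp_find_by_stval' are hypothetical names, and the actual patch tracks 'struct perf_event' pointers per trigger rather than plain address ranges.

```c
#include <stddef.h>
#include <stdint.h>

#define WP_SLOTS 4  /* hypothetical number of trigger slots */

/* Hypothetical per-slot record; the real code keeps struct perf_event pointers. */
struct wp_slot {
    int in_use;
    uint64_t addr;  /* start of the watched range */
    uint64_t len;   /* length of the watched range in bytes */
};

/*
 * Return the index of the first in-use slot whose range contains the
 * faulting address reported in STVAL, or -1 if no slot matches.
 */
static int wp_find_by_stval(const struct wp_slot *slots, size_t n, uint64_t stval)
{
    for (size_t i = 0; i < n; i++) {
        if (!slots[i].in_use)
            continue;
        /* The subtraction form avoids overflow in addr + len. */
        if (stval >= slots[i].addr && stval - slots[i].addr < slots[i].len)
            return (int)i;
    }
    return -1;
}
```

A lookup like this only works if the hardware (or QEMU, with the experimental change above) actually reports the faulting address in STVAL on a breakpoint exception.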
>
> ... [snip]
>
> > > +int arch_install_hw_breakpoint(struct perf_event *bp)
> > > +{
> > > +       struct arch_hw_breakpoint *info = counter_arch_bp(bp);
> > > +       struct sbi_dbtr_data_msg *xmit;
> > > +       struct sbi_dbtr_id_msg *recv;
> > > +       struct perf_event **slot;
> > > +       struct sbiret ret;
> > > +       int err = 0;
> > > +
> > > +       xmit = kzalloc(SBI_MSG_SZ_ALIGN(sizeof(*xmit)), GFP_ATOMIC);
> > > +       if (!xmit) {
> > > +               err = -ENOMEM;
> > > +               goto out;
> > > +       }
> > > +
> > > +       recv = kzalloc(SBI_MSG_SZ_ALIGN(sizeof(*recv)), GFP_ATOMIC);
> > > +       if (!recv) {
> > > +               err = -ENOMEM;
> > > +               goto out;
> > > +       }
> >
> > Do these really need to be dynamically allocated?
>
> According to the SBI extension proposal, the base address of this memory
> chunk must be 16-byte aligned. To simplify things, a buffer whose size is
> a power of two (and >= 16 bytes) is allocated. In this case the alignment
> of the kmalloc buffer is guaranteed to be at least its size. IIUC more
> effort is needed to guarantee such alignment for a buffer on the stack.
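To make the size and address arithmetic concrete, here is a small user-space sketch. The definition of SBI_MSG_SZ_ALIGN is not shown in this thread, so 'msg_size_align' below is only a guess at its intent (round up to a power of two, at least 16 bytes). The encode/decode helpers mirror the '__pa(xmit) >> 4' in the patch, and all helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SBI_MSG_MIN_ALIGN 16u  /* 16-byte alignment, as stated in the thread */

/* Round a message size up to a power of two that is at least 16 bytes. */
static size_t msg_size_align(size_t size)
{
    size_t n = SBI_MSG_MIN_ALIGN;

    while (n < size)
        n <<= 1;
    return n;
}

/* A 16-byte-aligned address has its low four bits clear. */
static int msg_addr_is_aligned(uint64_t pa)
{
    return (pa & (SBI_MSG_MIN_ALIGN - 1)) == 0;
}

/* Encode a physical address the way the patch does: drop the low four bits. */
static uint64_t msg_addr_encode(uint64_t pa)
{
    return pa >> 4;
}

/* Recover the address; this is lossless only for 16-byte-aligned input. */
static uint64_t msg_addr_decode(uint64_t enc)
{
    return enc << 4;
}
```

The round trip through encode/decode is why the alignment matters: an unaligned address would silently lose its low bits.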

The stack is not appropriate for this. Please use per-CPU global
data for this purpose, which should be 16-byte aligned as well.

Alternatively, you may allocate a per-CPU 4K page at boot time, so
that each CPU uses its own 4K page for xmit (upper half) and
recv (lower half).
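A rough user-space sketch of the per-CPU, 16-byte-aligned variant is below. It assumes a fixed CPU count and hypothetical struct layouts, since the real 'struct sbi_dbtr_data_msg' and 'struct sbi_dbtr_id_msg' are not shown in this thread. In the kernel this would presumably be a DEFINE_PER_CPU variable with an __aligned(16) attribute instead of a plain array.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* hypothetical CPU count for the sketch */
#define SBI_MSG_ALIGN  16  /* alignment required by the SBI proposal, per the thread */

/* Hypothetical layouts standing in for the real message structs. */
struct dbtr_data_msg { unsigned long tdata1, tdata2, tdata3; };
struct dbtr_id_msg { unsigned long idx; };

/* One statically allocated buffer pair per CPU, each member 16-byte aligned. */
struct dbtr_cpu_buf {
    alignas(SBI_MSG_ALIGN) struct dbtr_data_msg xmit;
    alignas(SBI_MSG_ALIGN) struct dbtr_id_msg recv;
};

static_assert(alignof(struct dbtr_cpu_buf) >= SBI_MSG_ALIGN,
              "per-CPU buffer must be 16-byte aligned");

static struct dbtr_cpu_buf dbtr_bufs[NR_CPUS_SKETCH];

/* Return the buffer pair for a CPU, or NULL if the index is out of range. */
static struct dbtr_cpu_buf *dbtr_buf_for(unsigned int cpu)
{
    return cpu < NR_CPUS_SKETCH ? &dbtr_bufs[cpu] : NULL;
}

/* Check the 16-byte alignment that the shifted-address encoding relies on. */
static int is_sbi_aligned(const void *p)
{
    return ((uintptr_t)p % SBI_MSG_ALIGN) == 0;
}
```

With static buffers there is no allocation in the install path at all, so the -ENOMEM handling and the kfree() calls would go away.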

Regards,
Anup

>
> > > +
> > > +       xmit->tdata1 = info->trig_data1.value;
> > > +       xmit->tdata2 = info->trig_data2;
> > > +       xmit->tdata3 = info->trig_data3;
> > > +
> > > +       ret = sbi_ecall(SBI_EXT_DBTR, SBI_EXT_DBTR_TRIGGER_INSTALL,
> > > +                       1, __pa(xmit) >> 4, __pa(recv) >> 4,
> > > +                       0, 0, 0);
> > > +       if (ret.error) {
> > > +               pr_warn("%s: failed to install trigger\n", __func__);
> > > +               err = -EIO;
> > > +               goto out;
> > > +       }
> > > +
> > > +       if (recv->idx >= dbtr_total_num) {
> > > +               pr_warn("%s: invalid trigger index %lu\n", __func__, recv->idx);
> > > +               err = -EINVAL;
> > > +               goto out;
> > > +       }
> > > +
> > > +       slot = this_cpu_ptr(&bp_per_reg[recv->idx]);
> > > +       if (*slot) {
> > > +               pr_warn("%s: slot %lu is in use\n", __func__, recv->idx);
> > > +               err = -EBUSY;
> > > +               goto out;
> > > +       }
> > > +
> > > +       *slot = bp;
> > > +
> > > +out:
> > > +       kfree(xmit);
> > > +       kfree(recv);
> > > +
> > > +       return err;
> > > +}
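One detail worth noting in the function above: if the first kzalloc() fails, the jump to 'out' calls kfree() on 'recv' before it has been assigned. A common way to keep a single exit label safe is to initialise both pointers to NULL. The user-space sketch below shows only that pattern, with calloc()/free() standing in for kzalloc()/kfree(). The struct layouts and the 'fail_*' parameters are hypothetical and exist only to exercise the error paths.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the message structs, which the thread does not show. */
struct data_msg { unsigned long tdata1, tdata2, tdata3; };
struct id_msg { unsigned long idx; };

/*
 * fail_xmit / fail_recv simulate an allocation failure so that both
 * error paths can be exercised from user space.
 */
static int install_sketch(int fail_xmit, int fail_recv)
{
    struct data_msg *xmit = NULL;  /* NULL so the shared exit path is safe */
    struct id_msg *recv = NULL;
    int err = 0;

    xmit = fail_xmit ? NULL : calloc(1, sizeof(*xmit));
    if (!xmit) {
        err = -ENOMEM;
        goto out;
    }

    recv = fail_recv ? NULL : calloc(1, sizeof(*recv));
    if (!recv) {
        err = -ENOMEM;
        goto out;
    }

    /* The SBI call and slot bookkeeping would go here. */

out:
    free(xmit);  /* free(NULL) is a no-op, as is kfree(NULL) in the kernel */
    free(recv);
    return err;
}
```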
>
> ... [snip]
>
> > > +static int __init arch_hw_breakpoint_init(void)
> > > +{
> > > +       union riscv_dbtr_tdata1 tdata1;
> > > +       struct sbiret ret;
> > > +
> > > +       if (sbi_probe_extension(SBI_EXT_DBTR) <= 0) {
> > > +               pr_info("%s: SBI_EXT_DBTR is not supported\n", __func__);
> > > +               return 0;
> > > +       }
> > > +
> > > +       ret = sbi_ecall(SBI_EXT_DBTR, SBI_EXT_DBTR_NUM_TRIGGERS,
> > > +                       0, 0, 0, 0, 0, 0);
> > > +       if (ret.error) {
> > > +               pr_warn("%s: failed to detect triggers\n", __func__);
> > > +               return 0;
> > > +       }
> > > +
> > > +       pr_info("%s: total number of triggers: %lu\n", __func__, ret.value);
> > > +
> > > +       tdata1.value = 0;
> > > +       tdata1.type = RISCV_DBTR_TRIG_MCONTROL6;
> > > +
> > > +       ret = sbi_ecall(SBI_EXT_DBTR, SBI_EXT_DBTR_NUM_TRIGGERS,
> > > +                       tdata1.value, 0, 0, 0, 0, 0);
> > > +       if (ret.error) {
> > > +               pr_warn("%s: failed to detect triggers\n", __func__);
> > > +               dbtr_total_num = 0;
> > > +               return 0;
> > > +       }
> >
> > nit: This is basically identical to hw_breakpoint_slots() -- just call
> > it here, or perhaps pull the DBTR_NUM_TRIGGERS ECALL into its own
> > function to reduce the duplication, e.g. 'dbtr_num_triggers(unsigned
> > long type)'?
>
> Good point. More similar requests will be added, e.g. for MCONTROL and
> possibly other trigger types. So I will add a separate
> 'dbtr_num_triggers' function.
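As a rough idea of what such a helper might look like, here is a user-space sketch with a mocked SBI call. Everything prefixed 'SKETCH_' or 'mock_' is hypothetical, including the extension and function IDs. The only spec-derived assumptions are that the tdata1 type field occupies the top four bits (bits 63:60 on RV64) and that mcontrol6 is type 6. Passing 0 mirrors the first call in the quoted init function, which asks for the total.

```c
#include <stdint.h>

/* Hypothetical IDs; the real values come from the SBI extension proposal. */
#define SKETCH_EXT_DBTR           0x1
#define SKETCH_DBTR_NUM_TRIGGERS  0x0
#define SKETCH_TRIG_MCONTROL6     6   /* Sdtrig trigger type 6 */
#define SKETCH_TDATA1_TYPE_SHIFT  60  /* type field in bits 63:60 on RV64 */

struct sketch_sbiret { long error; long value; };

/* Mock firmware: 4 triggers in total, 2 of them mcontrol6, none of other types. */
static struct sketch_sbiret mock_sbi_ecall(int ext, int fid, uint64_t arg0)
{
    struct sketch_sbiret ret = { 0, 0 };

    if (ext != SKETCH_EXT_DBTR || fid != SKETCH_DBTR_NUM_TRIGGERS) {
        ret.error = -1;
        return ret;
    }
    if (arg0 == 0)
        ret.value = 4;
    else if ((arg0 >> SKETCH_TDATA1_TYPE_SHIFT) == SKETCH_TRIG_MCONTROL6)
        ret.value = 2;
    return ret;
}

/*
 * Return the number of triggers of the given type, or 0 on error.
 * A type of 0 requests the total, as in the first call of the init function.
 */
static long dbtr_num_triggers(uint64_t type)
{
    uint64_t tdata1 = type << SKETCH_TDATA1_TYPE_SHIFT;
    struct sketch_sbiret ret;

    ret = mock_sbi_ecall(SKETCH_EXT_DBTR, SKETCH_DBTR_NUM_TRIGGERS, tdata1);
    return ret.error ? 0 : ret.value;
}
```

With a helper of this shape, both arch_hw_breakpoint_init() and hw_breakpoint_slots() could call it with the trigger type they care about instead of repeating the ECALL and error check.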
>
> > > +
> > > +       pr_info("%s: total number of type %d triggers: %lu\n",
> > > +               __func__, tdata1.type, ret.value);
> > > +
> > > +       dbtr_total_num = ret.value;
> > > +
> > > +       return 0;
> > > +}
>
> Thanks!
> Sergey