Supporting KVM_GUESTDBG_BLOCKIRQ or something similar on ARM64

Tue Oct 29 06:57:53 PDT 2024

On Tue, Oct 29, 2024 at 11:00:24AM +0100, Ard Biesheuvel wrote:
> On Tue, 29 Oct 2024 at 10:53, Marc Zyngier <maz at kernel.org> wrote:
> >
> > On Tue, 29 Oct 2024 08:52:41 +0000,
> > Ard Biesheuvel <ardb at kernel.org> wrote:
> > >
> > > On Mon, 28 Oct 2024 at 12:23, Marc Zyngier <maz at kernel.org> wrote:
> > > >
> > > > > Let's start a discussion about what needs to be done to support this on
> > > > > arm64.
> > > >
> > > > A good start would be to define the semantics of such a flag:
> > > >
> > > > - what should it affect? the vcpu you are single-stepping? all vcpu?
> > > >
> > > > - should userspace know that interrupts are pending?
> > > >
> > > > - should this result in any effect on the guest's view of time?
> > > >
> > > > - what of interactions on the rest of the system (such as devices)?
> > > >
> > >
> > > Sorry to give a handwavy answer here, but approaching this from a
> > > usability PoV (like what Puranjay is doing), it is really about
> > > adhering to the principle of least surprise for the user.
> > >
> > > So in that sense, it is not really about blocking IRQs at all, as long
> > > as we step over them rather than into them. How that is achieved is
> > > not that relevant from the user PoV, and maybe KVM_GUESTDBG_BLOCKIRQ
> > > is not the right solution for ARM at all.
> >
> > I definitely sympathise with the goal, but there is no simple way to
> > let interrupts through while stepping (which is what your "step over"
> > implies):
> >
> > - the hypervisor (in general) doesn't interact with the guest delivery
> >   and handling of interrupts -- this is either very opaque (list
> >   registers) or completely invisible (direct injection)
> >
> > - replacing the step with a breakpoint after the stepped instruction
> >   requires us to decode the guest instructions to handle branching
> >   effects
> >
> 
> Yeah, and we still want to take non-IRQ/FIQ exceptions, so this does
> not seem feasible to me.
> 
> > One possible mechanism would be to:
> >
> > - while stepping, add breakpoints to the interrupt vectors for the EL
> >   we are stepping (3 breakpoints for any of the 4 possible exception
> >   groups),
> >
> > - when any interrupt breakpoint hits, clear all 3, place a breakpoint
> >   on the instruction that was about to be single-stepped (pointed to
> >   by SPSR)
> >
> > - run to completion, until the breakpoint hits
> >
> > - disable the breakpoint, reinstall the previous 3 interrupt
> >   breakpoints
> >
> > - single-step, rinse, repeat
> >
> > But then I'm asking myself the question: why is this KVM's job? It
> > seems to me that this is what an external debugger would do when
> > interacting with HW on bare metal.
> >
> > So can we implement this as part of the debugger's state machine?
> >
> 
> Which debugger is that? The GDB stub in QEMU?
> 
> Setting a one-shot breakpoint on the address in SPSR when taking an
> IRQ exception seems like a reasonable approach to me.

That doesn't work; an IRQ could be taken in the middle of a common
helper that's also used in IRQ context, so you'd take the breakpoint
within the IRQ. You could try to match a bunch of things like the SP and
so on, but that boils down to a bunch of heuristics rather than something
that's guaranteed to work...
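
To make the failure concrete, here is a toy model of the one-shot
breakpoint idea. It is only a sketch: the addresses, the helper and
handler layout, and the in_irq flag are all made up for illustration,
and nothing here corresponds to a real KVM or GDB interface. The
"interrupted PC" stands for the exception return address (held in
ELR_ELx on arm64).

```python
# Toy model only: all addresses and the in_irq flag are hypothetical.
# PCs of a helper that is called from both thread and IRQ context.
HELPER = [0x100, 0x104, 0x108]
# The IRQ handler runs its own entry code, calls the helper, then returns.
IRQ_HANDLER = [0x200] + HELPER + [0x204]
# The thread being stepped also calls the helper.
THREAD = [0x10] + HELPER + [0x14]


def trace_after_irq(interrupted_pc, thread_code):
    """Execution order once an IRQ preempts the thread at interrupted_pc:
    the handler runs to completion first, then the thread resumes."""
    handler = [(pc, True) for pc in IRQ_HANDLER]
    idx = thread_code.index(interrupted_pc)
    resumed = [(pc, False) for pc in thread_code[idx:]]
    return handler + resumed


def run_until_breakpoint(trace, breakpoint_pc):
    """Return (pc, in_irq) for the first executed PC that matches the
    one-shot breakpoint, or None if it never hits."""
    for pc, in_irq in trace:
        if pc == breakpoint_pc:
            return pc, in_irq
    return None


# IRQ taken in the middle of the shared helper: the breakpoint placed on
# the interrupted PC fires while still inside the IRQ handler.
bad = run_until_breakpoint(trace_after_irq(0x104, THREAD), 0x104)
# IRQ taken in code the handler never executes: the breakpoint fires
# only after returning to the thread, which is the intended behaviour.
good = run_until_breakpoint(trace_after_irq(0x10, THREAD), 0x10)

print(hex(bad[0]), "in_irq =", bad[1])
print(hex(good[0]), "in_irq =", good[1])
```

In the first case the debugger would resume single-stepping inside the
handler, which is exactly what it was trying to avoid. Telling the two
hits apart needs extra state (such as the SP), which is where the
heuristics come in.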

More generally, the IRQ can preempt the running thread anyway, so:

* The user cannot use this to trace a kernel thread reliably, since that
  can be switched out behind their back.

* The user cannot use this to trace a CPU regardless of the running
  thread, since they lose anything that happens under an IRQ.

Mark.