[PATCH 4/5] KVM: RISC-V: reset VCPU state when becoming runnable

Wed Apr 30 06:02:31 PDT 2025

On Wed, Apr 30, 2025 at 5:15 PM Radim Krčmář <rkrcmar at ventanamicro.com> wrote:
>
> 2025-04-30T15:47:13+05:30, Anup Patel <anup at brainfault.org>:
> > On Wed, Apr 30, 2025 at 1:59 PM Radim Krčmář <rkrcmar at ventanamicro.com> wrote:
> >> 2025-04-30T10:56:35+05:30, Anup Patel <anup at brainfault.org>:
> >> > On Wed, Apr 30, 2025 at 9:52 AM Anup Patel <anup at brainfault.org> wrote:
> >> >> On Tue, Apr 29, 2025 at 9:51 PM Radim Krčmář <rkrcmar at ventanamicro.com> wrote:
> >> >> > The point of this patch is to reset the boot VCPU, so we reset the VCPU
> >> >> > that is made runnable by the KVM_SET_MP_STATE IOCTL.
> >> >>
> >> >> Like I said before, we don't need to do this. The initiating VCPU
> >> >> can be resetted just before exiting to user space for system reset
> >> >> event exit.
> >>
> >> You assume initiating VCPU == boot VCPU.
> >>
> >> We should prevent KVM_SET_MP_STATE IOCTL for all non-initiating VCPUs if
> >> we decide to accept the assumption.
> >
> > There is no such assumption.
>
> You probably haven't intended it:
>
>   1) VCPU 0 is "chilling" in userspace.
>   2) VCPU 1 initiates SBI reset.
>   3) VCPU 1 makes a reset request to VCPU 0.
>   4) VCPU 1 returns to userspace.
>   5) Userspace knows it should reset the VM.
>   6) VCPU 0 still hasn't entered KVM.
>   7) Userspace sets the initial state of VCPU 0 and enters KVM.
>   8) VCPU 0 is reset in KVM, because of the pending request.
>   9) The initial boot state from userspace is lost.
>
> >> I'd rather choose a different design, though.
> >>
> >> How about a new userspace interface for IOCTL reset?
> >> (Can be capability toggle for KVM_SET_MP_STATE or a straight new IOCTL.)
> >>
> >> That wouldn't "fix" current userspaces, but would significantly improve
> >> the sanity of the KVM interface.
> >
> > I believe the current implementation needs a few improvements
> > that's all. We certainly don't need to introduce any new IOCTL.
>
> I do too.  The whole patch could have been a single line:
>
> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> index d3d957a9e5c4..b3e6ad87e1cd 100644
> --- a/arch/riscv/kvm/vcpu.c
> +++ b/arch/riscv/kvm/vcpu.c
> @@ -511,6 +511,7 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
>
>         switch (mp_state->mp_state) {
>         case KVM_MP_STATE_RUNNABLE:
> +               kvm_riscv_reset_vcpu(vcpu);
>                 WRITE_ONCE(vcpu->arch.mp_state, *mp_state);
>                 break;
>         case KVM_MP_STATE_STOPPED:
>
> It is the backward compatibility and trying to fix current userspaces
> that's making it ugly.  I already gave up on the latter, so we can have
> a decently clean solution with the former.
>
> > Also, keep in mind that so far we have avoided any RISC-V
> > specific KVM IOCTLs and we should try to keep it that way
> > as long as we can.
>
> We can re-use KVM_SET_MP_STATE and add a KVM capability.
> Userspace will opt-in to reset the VCPU through the existing IOCTL.
>
> This design will also allow userspace to trigger a VCPU reset without
> tearing down the whole VM.

Okay, lets go ahead with a KVM capability which user space can opt-in
for KVM_SET_MP_STATE ioctl().

Keep in mind that at runtime Guest can still do CPU hotplug using SBI
HSM start/stop and do system suspend using SBI SUSP so we should
continue to have VCPU reset requests for both these SBI extensions.

Regards,
Anup