[PATCH v5 08/13] KVM: arm64: Implement PSCI SYSTEM_SUSPEND

Reiji Watanabe reijiw at google.com
Fri Apr 22 00:02:12 PDT 2022


On Sat, Apr 9, 2022 at 11:46 AM Oliver Upton <oupton at google.com> wrote:
>
> ARM DEN0022D.b 5.19 "SYSTEM_SUSPEND" describes a PSCI call that allows
> software to request that a system be placed in the deepest possible
> low-power state. Effectively, software can use this to suspend itself to
> RAM.
>
> Unfortunately, there really is no good way to implement a system-wide
> PSCI call in KVM. Any precondition checks done in the kernel will need
> to be repeated by userspace since there is no good way to protect a
> critical section that spans an exit to userspace. SYSTEM_RESET and
> SYSTEM_OFF are equally plagued by this issue, although no users have
> seemingly cared for the relatively long time these calls have been
> supported.
>
> The solution is to just make the whole implementation userspace's
> problem. Introduce a new system event, KVM_SYSTEM_EVENT_SUSPEND, that
> indicates to userspace a calling vCPU has invoked PSCI SYSTEM_SUSPEND.
> Additionally, add a CAP to get buy-in from userspace for this new exit
> type.
>
> Only advertise the SYSTEM_SUSPEND PSCI call if userspace has opted in.
> If a vCPU calls SYSTEM_SUSPEND, punt straight to userspace. Provide
> explicit documentation of userspace's responsibilites for the exit and
> point to the PSCI specification to describe the actual PSCI call.
>
> Signed-off-by: Oliver Upton <oupton at google.com>
> ---
>  Documentation/virt/kvm/api.rst    | 39 +++++++++++++++++++++++++++++++
>  arch/arm64/include/asm/kvm_host.h |  3 ++-
>  arch/arm64/kvm/arm.c              | 12 +++++++++-
>  arch/arm64/kvm/psci.c             | 25 ++++++++++++++++++++
>  include/uapi/linux/kvm.h          |  2 ++
>  5 files changed, 79 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index d104e34ad703..24e2fac2fea7 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6015,6 +6015,7 @@ should put the acknowledged interrupt vector into the 'epr' field.
>    #define KVM_SYSTEM_EVENT_RESET          2
>    #define KVM_SYSTEM_EVENT_CRASH          3
>    #define KVM_SYSTEM_EVENT_WAKEUP         4
> +  #define KVM_SYSTEM_EVENT_SUSPENDED      5

Nit: This should be KVM_SYSTEM_EVENT_SUSPEND based on the code.
(a few more parts in the doc use KVM_SYSTEM_EVENT_SUSPENDED)

Otherwise,
Reviewed-by: Reiji Watanabe <reijiw at google.com>

Thanks,
Reiji


>                         __u32 type;
>                         __u64 flags;
>                 } system_event;
> @@ -6042,6 +6043,34 @@ Valid values for 'type' are:
>   - KVM_SYSTEM_EVENT_WAKEUP -- the exiting vCPU is in a suspended state and
>     KVM has recognized a wakeup event. Userspace may honor this event by
>     marking the exiting vCPU as runnable, or deny it and call KVM_RUN again.
> + - KVM_SYSTEM_EVENT_SUSPENDED -- the guest has requested a suspension of
> +   the VM.
> +
> +For arm/arm64:
> +^^^^^^^^^^^^^^
> +
> +   KVM_SYSTEM_EVENT_SUSPENDED exits are enabled with the
> +   KVM_CAP_ARM_SYSTEM_SUSPEND VM capability. If a guest invokes the PSCI
> +   SYSTEM_SUSPEND function, KVM will exit to userspace with this event
> +   type.
> +
> +   It is the sole responsibility of userspace to implement the PSCI
> +   SYSTEM_SUSPEND call according to ARM DEN0022D.b 5.19 "SYSTEM_SUSPEND".
> +   KVM does not change the vCPU's state before exiting to userspace, so
> +   the call parameters are left in-place in the vCPU registers.
> +
> +   Userspace is _required_ to take action for such an exit. It must
> +   either:
> +
> +    - Honor the guest request to suspend the VM. Userspace can request
> +      in-kernel emulation of suspension by setting the calling vCPU's
> +      state to KVM_MP_STATE_SUSPENDED. Userspace must configure the vCPU's
> +      state according to the parameters passed to the PSCI function when
> +      the calling vCPU is resumed. See ARM DEN0022D.b 5.19.1 "Intended use"
> +      for details on the function parameters.
> +
> +    - Deny the guest request to suspend the VM. See ARM DEN0022D.b 5.19.2
> +      "Caller responsibilities" for possible return values.
>
>  Valid flags are:
>
> @@ -7756,6 +7785,16 @@ At this time, KVM_PMU_CAP_DISABLE is the only capability.  Setting
>  this capability will disable PMU virtualization for that VM.  Usermode
>  should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
>
> +8.36 KVM_CAP_ARM_SYSTEM_SUSPEND
> +-------------------------------
> +
> +:Capability: KVM_CAP_ARM_SYSTEM_SUSPEND
> +:Architectures: arm64
> +:Type: vm
> +
> +When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of
> +type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request.
> +
>  9. Known KVM API problems
>  =========================
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 46027b9b80ca..9243115c9d7b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -137,7 +137,8 @@ struct kvm_arch {
>          */
>  #define KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED             3
>  #define KVM_ARCH_FLAG_EL1_32BIT                                4
> -
> +       /* PSCI SYSTEM_SUSPEND enabled for the guest */
> +#define KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED           5
>         unsigned long flags;
>
>         /*
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index e9641b86d375..1714aa55db9c 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -97,6 +97,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>                 }
>                 mutex_unlock(&kvm->lock);
>                 break;
> +       case KVM_CAP_ARM_SYSTEM_SUSPEND:
> +               r = 0;
> +               set_bit(KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED, &kvm->arch.flags);
> +               break;
>         default:
>                 r = -EINVAL;
>                 break;
> @@ -210,6 +214,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>         case KVM_CAP_SET_GUEST_DEBUG:
>         case KVM_CAP_VCPU_ATTRIBUTES:
>         case KVM_CAP_PTP_KVM:
> +       case KVM_CAP_ARM_SYSTEM_SUSPEND:
>                 r = 1;
>                 break;
>         case KVM_CAP_SET_GUEST_DEBUG2:
> @@ -447,8 +452,13 @@ bool kvm_arm_vcpu_stopped(struct kvm_vcpu *vcpu)
>  static void kvm_arm_vcpu_suspend(struct kvm_vcpu *vcpu)
>  {
>         vcpu->arch.mp_state.mp_state = KVM_MP_STATE_SUSPENDED;
> +
> +       /*
> +        * Since this is only called from the intended vCPU, the target vCPU is
> +        * guaranteed to not be running. As such there is no need to kick the
> +        * target to handle the request.
> +        */
>         kvm_make_request(KVM_REQ_SUSPEND, vcpu);
> -       kvm_vcpu_kick(vcpu);
>  }
>
>  static bool kvm_arm_vcpu_suspended(struct kvm_vcpu *vcpu)
> diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
> index 362d2a898b83..58b5e2c2ff6a 100644
> --- a/arch/arm64/kvm/psci.c
> +++ b/arch/arm64/kvm/psci.c
> @@ -191,6 +191,11 @@ static void kvm_psci_system_reset2(struct kvm_vcpu *vcpu)
>                                  KVM_SYSTEM_EVENT_RESET_FLAG_PSCI_RESET2);
>  }
>
> +static void kvm_psci_system_suspend(struct kvm_vcpu *vcpu)
> +{
> +       kvm_vcpu_set_system_event_exit(vcpu, KVM_SYSTEM_EVENT_SUSPEND, 0);
> +}
> +
>  static void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
>  {
>         int i;
> @@ -296,6 +301,7 @@ static int kvm_psci_1_x_call(struct kvm_vcpu *vcpu, u32 minor)
>  {
>         unsigned long val = PSCI_RET_NOT_SUPPORTED;
>         u32 psci_fn = smccc_get_function(vcpu);
> +       struct kvm *kvm = vcpu->kvm;
>         u32 arg;
>         int ret = 1;
>
> @@ -327,6 +333,11 @@ static int kvm_psci_1_x_call(struct kvm_vcpu *vcpu, u32 minor)
>                 case ARM_SMCCC_VERSION_FUNC_ID:
>                         val = 0;
>                         break;
> +               case PSCI_1_0_FN_SYSTEM_SUSPEND:
> +               case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> +                       if (test_bit(KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED, &kvm->arch.flags))
> +                               val = 0;
> +                       break;
>                 case PSCI_1_1_FN_SYSTEM_RESET2:
>                 case PSCI_1_1_FN64_SYSTEM_RESET2:
>                         if (minor >= 1)
> @@ -334,6 +345,20 @@ static int kvm_psci_1_x_call(struct kvm_vcpu *vcpu, u32 minor)
>                         break;
>                 }
>                 break;
> +       case PSCI_1_0_FN_SYSTEM_SUSPEND:
> +               kvm_psci_narrow_to_32bit(vcpu);
> +               fallthrough;
> +       case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> +               /*
> +                * Return directly to userspace without changing the vCPU's
> +                * registers. Userspace depends on reading the SMCCC parameters
> +                * to implement SYSTEM_SUSPEND.
> +                */
> +               if (test_bit(KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED, &kvm->arch.flags)) {
> +                       kvm_psci_system_suspend(vcpu);
> +                       return 0;
> +               }
> +               break;
>         case PSCI_1_1_FN_SYSTEM_RESET2:
>                 kvm_psci_narrow_to_32bit(vcpu);
>                 fallthrough;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 64e5f9d83a7a..752e4a5c3ce6 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -445,6 +445,7 @@ struct kvm_run {
>  #define KVM_SYSTEM_EVENT_RESET          2
>  #define KVM_SYSTEM_EVENT_CRASH          3
>  #define KVM_SYSTEM_EVENT_WAKEUP         4
> +#define KVM_SYSTEM_EVENT_SUSPEND        5
>                         __u32 type;
>                         __u64 flags;
>                 } system_event;
> @@ -1146,6 +1147,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_S390_MEM_OP_EXTENSION 211
>  #define KVM_CAP_PMU_CAPABILITY 212
>  #define KVM_CAP_DISABLE_QUIRKS2 213
> +#define KVM_CAP_ARM_SYSTEM_SUSPEND 214
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>
> --
> 2.35.1.1178.g4f1659d476-goog
>



More information about the kvm-riscv mailing list