[PATCH v3 08/18] arm/arm64: KVM: Add PSCI version selection API

Marc Zyngier marc.zyngier at arm.com
Mon Feb 5 01:30:45 PST 2018


On 04/02/18 12:38, Christoffer Dall wrote:
> Hi Marc,
> 
> [ I know we're discussing the overall approach in parallel, but here are
>   some comments on the specifics of this patch, should it end up being
>   used in some capacity ]
> 
> On Thu, Feb 01, 2018 at 11:46:47AM +0000, Marc Zyngier wrote:
>> Although we've implemented PSCI 1.0 and 1.1, nothing can select them
>> Since all the new PSCI versions are backward compatible, we decide to
>> default to the latest version of the PSCI implementation. This is no
>> different from doing a firmware upgrade on KVM.
>>
>> But in order to give a chance to hypothetical badly implemented guests
>> that would have a fit by discovering something other than PSCI 0.2,
>> let's provide a new API that allows userspace to pick one particular
>> version of the API.
>>
>> This is implemented as a new class of "firmware" registers, where
>> we expose the PSCI version. This allows the PSCI version to be
>> save/restored as part of a guest migration, and also set to
>> any supported version if the guest requires it.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier at arm.com>
>> ---
>>  Documentation/virtual/kvm/api.txt      |  3 +-
>>  Documentation/virtual/kvm/arm/psci.txt | 30 +++++++++++++++
>>  arch/arm/include/asm/kvm_host.h        |  3 ++
>>  arch/arm/include/uapi/asm/kvm.h        |  6 +++
>>  arch/arm/kvm/guest.c                   | 13 +++++++
>>  arch/arm64/include/asm/kvm_host.h      |  3 ++
>>  arch/arm64/include/uapi/asm/kvm.h      |  6 +++
>>  arch/arm64/kvm/guest.c                 | 14 ++++++-
>>  include/kvm/arm_psci.h                 |  9 +++++
>>  virt/kvm/arm/psci.c                    | 68 +++++++++++++++++++++++++++++++++-
>>  10 files changed, 151 insertions(+), 4 deletions(-)
>>  create mode 100644 Documentation/virtual/kvm/arm/psci.txt
>>
>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>> index 57d3ee9e4bde..334905202141 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -2493,7 +2493,8 @@ Possible features:
>>    and execute guest code when KVM_RUN is called.
>>   - KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
>>    Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
>> - - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
>> + - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 (or a future revision
>> +          backward compatible with v0.2) for the CPU.
>>    Depends on KVM_CAP_ARM_PSCI_0_2.
>>   - KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
>>    Depends on KVM_CAP_ARM_PMU_V3.
> 
> Can we add this to api.txt as well ?:
> 
> --------8><----------
> 
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index fc3ae951bc07..c88aa04bcbe8 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -1959,6 +1959,8 @@ arm64 CCSIDR registers are demultiplexed by CSSELR value:
>  arm64 system registers have the following id bit patterns:
>    0x6030 0000 0013 <op0:2> <op1:3> <crn:4> <crm:4> <op2:3>
>  
> +ARM/arm64 firmware pseudo-registers have the following bit pattern:
> +  0x4030 0000 0014 <regno:16>
>  
>  MIPS registers are mapped using the lower 32 bits.  The upper 16 of that is
>  the register group type:
> 
> --------8><----------

Ah, I never realised we actually documented this. Neat!

> 
>> diff --git a/Documentation/virtual/kvm/arm/psci.txt b/Documentation/virtual/kvm/arm/psci.txt
>> new file mode 100644
>> index 000000000000..aafdab887b04
>> --- /dev/null
>> +++ b/Documentation/virtual/kvm/arm/psci.txt
>> @@ -0,0 +1,30 @@
>> +KVM implements the PSCI (Power State Coordination Interface)
>> +specification in order to provide services such as CPU on/off, reset
>> +and power-off to the guest.
>> +
>> +The PSCI specification is regularly updated to provide new features,
>> +and KVM implements these updates if they make sense from a virtualization
>> +point of view.
>> +
>> +This means that a guest booted on two different versions of KVM can
>> +observe two different "firmware" revisions. This could cause issues if
>> +a given guest is tied to a particular PSCI revision (unlikely), or if
>> +a migration causes a different PSCI version to be exposed out of the
>> +blue to an unsuspecting guest.
>> +
>> +In order to remedy this situation, KVM exposes a set of "firmware
>> +pseudo-registers" that can be manipulated using the GET/SET_ONE_REG
>> +interface. These registers can be saved/restored by userspace, and set
>> +to a convenient value if required.
>> +
>> +The following register is defined:
>> +
>> +* KVM_REG_ARM_PSCI_VERSION:
>> +
>> +  - Only valid if the vcpu has the KVM_ARM_VCPU_PSCI_0_2 feature set
>> +    (and thus has already been initialized)
>> +  - Returns the current PSCI version on GET_ONE_REG (defaulting to the
>> +    highest PSCI version implemented by KVM and compatible with v0.2)
>> +  - Allows any PSCI version implemented by KVM and compatible with
>> +    v0.2 to be set with SET_ONE_REG
>> +  - Affects the whole VM (even if the register view is per-vcpu)
>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>> index acbf9ec7b396..e9d57060d88c 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -75,6 +75,9 @@ struct kvm_arch {
>>   /* Interrupt controller */
>>   struct vgic_dist vgic;
>>   int max_vcpus;
>> +
>> + /* Mandated version of PSCI */
>> + u32 psci_version;
>>  };
>>  
>>  #define KVM_NR_MEM_OBJS     40
>> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
>> index 6edd177bb1c7..47dfc99f5cd0 100644
>> --- a/arch/arm/include/uapi/asm/kvm.h
>> +++ b/arch/arm/include/uapi/asm/kvm.h
>> @@ -186,6 +186,12 @@ struct kvm_arch_memory_slot {
>>  #define KVM_REG_ARM_VFP_FPINST 0x1009
>>  #define KVM_REG_ARM_VFP_FPINST2 0x100A
>>  
>> +/* KVM-as-firmware specific pseudo-registers */
>> +#define KVM_REG_ARM_FW (0x0014 << KVM_REG_ARM_COPROC_SHIFT)
>> +#define KVM_REG_ARM_FW_REG(r) (KVM_REG_ARM | KVM_REG_SIZE_U64 | \
>> + KVM_REG_ARM_FW | ((r) & 0xffff))
>> +#define KVM_REG_ARM_PSCI_VERSION KVM_REG_ARM_FW_REG(0)
>> +
>>  /* Device Control API: ARM VGIC */
>>  #define KVM_DEV_ARM_VGIC_GRP_ADDR 0
>>  #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
>> diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
>> index 1e0784ebbfd6..a18f33edc471 100644
>> --- a/arch/arm/kvm/guest.c
>> +++ b/arch/arm/kvm/guest.c
>> @@ -22,6 +22,7 @@
>>  #include <linux/module.h>
>>  #include <linux/vmalloc.h>
>>  #include <linux/fs.h>
>> +#include <kvm/arm_psci.h>
>>  #include <asm/cputype.h>
>>  #include <linux/uaccess.h>
>>  #include <asm/kvm.h>
>> @@ -176,6 +177,7 @@ static unsigned long num_core_regs(void)
>>  unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
>>  {
>>   return num_core_regs() + kvm_arm_num_coproc_regs(vcpu)
>> + + kvm_arm_get_fw_num_regs(vcpu)
>>   + NUM_TIMER_REGS;
>>  }
>>  
>> @@ -196,6 +198,11 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>>   uindices++;
>>   }
>>  
>> + ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
>> + if (ret)
>> + return ret;
>> + uindices += kvm_arm_get_fw_num_regs(vcpu);
>> +
>>   ret = copy_timer_indices(vcpu, uindices);
>>   if (ret)
>>   return ret;
>> @@ -214,6 +221,9 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>>   if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
>>   return get_core_reg(vcpu, reg);
>>  
>> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
>> + return kvm_arm_get_fw_reg(vcpu, reg);
>> +
>>   if (is_timer_reg(reg->id))
>>   return get_timer_reg(vcpu, reg);
>>  
>> @@ -230,6 +240,9 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>>   if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
>>   return set_core_reg(vcpu, reg);
>>  
>> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
>> + return kvm_arm_set_fw_reg(vcpu, reg);
>> +
>>   if (is_timer_reg(reg->id))
>>   return set_timer_reg(vcpu, reg);
>>  
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 4485ae8e98de..10af386642c6 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -73,6 +73,9 @@ struct kvm_arch {
>>  
>>   /* Interrupt controller */
>>   struct vgic_dist vgic;
>> +
>> + /* Mandated version of PSCI */
>> + u32 psci_version;
>>  };
>>  
>>  #define KVM_NR_MEM_OBJS     40
>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
>> index 9abbf3044654..04b3256f8e6d 100644
>> --- a/arch/arm64/include/uapi/asm/kvm.h
>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>> @@ -206,6 +206,12 @@ struct kvm_arch_memory_slot {
>>  #define KVM_REG_ARM_TIMER_CNT ARM64_SYS_REG(3, 3, 14, 3, 2)
>>  #define KVM_REG_ARM_TIMER_CVAL ARM64_SYS_REG(3, 3, 14, 0, 2)
>>  
>> +/* KVM-as-firmware specific pseudo-registers */
>> +#define KVM_REG_ARM_FW (0x0014 << KVM_REG_ARM_COPROC_SHIFT)
>> +#define KVM_REG_ARM_FW_REG(r) (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
>> + KVM_REG_ARM_FW | ((r) & 0xffff))
>> +#define KVM_REG_ARM_PSCI_VERSION KVM_REG_ARM_FW_REG(0)
>> +
>>  /* Device Control API: ARM VGIC */
>>  #define KVM_DEV_ARM_VGIC_GRP_ADDR 0
>>  #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> index 5c7f657dd207..811f04c5760e 100644
>> --- a/arch/arm64/kvm/guest.c
>> +++ b/arch/arm64/kvm/guest.c
>> @@ -25,6 +25,7 @@
>>  #include <linux/module.h>
>>  #include <linux/vmalloc.h>
>>  #include <linux/fs.h>
>> +#include <kvm/arm_psci.h>
>>  #include <asm/cputype.h>
>>  #include <linux/uaccess.h>
>>  #include <asm/kvm.h>
>> @@ -205,7 +206,7 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>>  unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
>>  {
>>   return num_core_regs() + kvm_arm_num_sys_reg_descs(vcpu)
>> -                + NUM_TIMER_REGS;
>> + + kvm_arm_get_fw_num_regs(vcpu) + NUM_TIMER_REGS;
>>  }
>>  
>>  /**
>> @@ -225,6 +226,11 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>>   uindices++;
>>   }
>>  
>> + ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
>> + if (ret)
>> + return ret;
>> + uindices += kvm_arm_get_fw_num_regs(vcpu);
>> +
>>   ret = copy_timer_indices(vcpu, uindices);
>>   if (ret)
>>   return ret;
>> @@ -243,6 +249,9 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>>   if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
>>   return get_core_reg(vcpu, reg);
>>  
>> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
>> + return kvm_arm_get_fw_reg(vcpu, reg);
>> +
>>   if (is_timer_reg(reg->id))
>>   return get_timer_reg(vcpu, reg);
>>  
>> @@ -259,6 +268,9 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>>   if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
>>   return set_core_reg(vcpu, reg);
>>  
>> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
>> + return kvm_arm_set_fw_reg(vcpu, reg);
>> +
>>   if (is_timer_reg(reg->id))
>>   return set_timer_reg(vcpu, reg);
>>  
>> diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
>> index 5446435457c2..4ee098c39e01 100644
>> --- a/include/kvm/arm_psci.h
>> +++ b/include/kvm/arm_psci.h
>> @@ -24,7 +24,16 @@
>>  #define KVM_ARM_PSCI_0_2 PSCI_VERSION(0, 2)
>>  #define KVM_ARM_PSCI_1_0 PSCI_VERSION(1, 0)
>>  
>> +#define KVM_ARM_PSCI_LATEST KVM_ARM_PSCI_1_0
>> +
>>  int kvm_psci_version(struct kvm_vcpu *vcpu);
>>  int kvm_psci_call(struct kvm_vcpu *vcpu);
>>  
>> +struct kvm_one_reg;
>> +
>> +int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu);
>> +int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
>> +int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
>> +int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
>> +
>>  #endif /* __KVM_ARM_PSCI_H__ */
>> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
>> index 291874cff85e..5c8366b71639 100644
>> --- a/virt/kvm/arm/psci.c
>> +++ b/virt/kvm/arm/psci.c
>> @@ -17,6 +17,7 @@
>>  
>>  #include <linux/preempt.h>
>>  #include <linux/kvm_host.h>
>> +#include <linux/uaccess.h>
>>  #include <linux/wait.h>
>>  
>>  #include <asm/cputype.h>
>> @@ -233,8 +234,19 @@ static void kvm_psci_system_reset(struct kvm_vcpu *vcpu)
>>  
>>  int kvm_psci_version(struct kvm_vcpu *vcpu)
>>  {
>> - if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
>> - return KVM_ARM_PSCI_0_2;
>> + /*
>> + * Our PSCI implementation stays the same across versions from
>> + * v0.2 onward, only adding the few mandatory functions (such
>> + * as FEATURES with 1.0) that are required by newer
>> + * revisions. It is thus safe to return the latest, unless
>> + * userspace has instructed us otherwise.
>> + */
>> + if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
>> + if (vcpu->kvm->arch.psci_version)
>> + return vcpu->kvm->arch.psci_version;
>> +
>> + return KVM_ARM_PSCI_LATEST;
>> + }
>>  
>>   return KVM_ARM_PSCI_0_1;
>>  }
>> @@ -406,3 +418,55 @@ int kvm_psci_call(struct kvm_vcpu *vcpu)
>>   return -EINVAL;
>>   };
>>  }
>> +
>> +int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
>> +{
>> + return 1; /* PSCI version */
>> +}
>> +
>> +int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>> +{
>> + if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices))
>> + return -EFAULT;
>> +
>> + return 0;
>> +}
>> +
>> +int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>> +{
>> + if (reg->id == KVM_REG_ARM_PSCI_VERSION) {
>> + void __user *uaddr = (void __user *)(long)reg->addr;
>> + u64 val;
>> +
>> + val = kvm_psci_version(vcpu);
>> + if (val == KVM_ARM_PSCI_0_1)
>> + return -EINVAL;
> 
> Doesn't this potentially break userspace that doesn't set
> KVM_ARM_VCPU_PSCI_0_2 which does KVM_GET_REG_LIST followed by
> KVM_GET_ONE_REG ?

Ouch. Yes, well spotted.

> What is the rationale for not simply allowing PSCI v0.1 (terribly
> outdated as it may be) be exposed via this firmware reg?

The rational was that PSCI_GET_VERSION didn't exist in 0.1, hence there
was no reason of exposing a version number if you didn't want at least
0.2. But as you made it obvious above, this is broken. That register
needs to be either exposed all the time, or only be exposed if
KVM_ARM_VCPU_PSCI_0_2 is set. Given that the latter feels pretty
fragile, I'll choose the former.

> 
>> + if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
>> + return -EFAULT;
>> +
>> + return 0;
>> + }
>> +
>> + return -EINVAL;
>> +}
>> +
>> +int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>> +{
>> + if (reg->id == KVM_REG_ARM_PSCI_VERSION &&
>> +    test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
>> + void __user *uaddr = (void __user *)(long)reg->addr;
>> + u64 val;
>> +
>> + if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
>> + return -EFAULT;
>> +
>> + switch (val) {
>> + case KVM_ARM_PSCI_0_2:
>> + case KVM_ARM_PSCI_1_0:
>> + vcpu->kvm->arch.psci_version = val;
>> + return 0;
>> + }
> 
> Then here we could change the check so that setting KVM_ARM_PSCI_0_1
> when also having set KVM_ARM_VCPU_PSCI_0_2 returns an error, and
> similarly trying to set KVM_ARM_PSCI_0_2+ without having set the feature
> bit, returns an error.

Yes, that's a good suggestion.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...



More information about the linux-arm-kernel mailing list