[RFC PATCH v5 02/38] KVM: arm64: Add lock/unlock memslot user API

Reiji Watanabe reijiw at google.com
Mon Feb 14 21:59:09 PST 2022


Hi Alex,

On Wed, Nov 17, 2021 at 7:37 AM Alexandru Elisei
<alexandru.elisei at arm.com> wrote:
>
> Stage 2 faults triggered by the profiling buffer attempting to write to
> memory are reported by the SPE hardware by asserting a buffer management
> event interrupt. Interrupts are by their nature asynchronous, which means
> that the guest might have changed its stage 1 translation tables since the
> attempted write. SPE reports the guest virtual address that caused the data
> abort, not the IPA, which means that KVM would have to walk the guest's
> stage 1 tables to find the IPA. Using the AT instruction to walk the
> guest's tables in hardware is not an option because it doesn't report the
> IPA in the case of a stage 2 fault on a stage 1 table walk.
>
> Avoid both issues by pre-mapping the guest memory at stage 2. This is being
> done by adding a capability that allows the user to pin the memory backing
> a memslot. The same capability can be used to unlock a memslot, which
> unpins the pages associated with the memslot, but doesn't unmap the IPA
> range from stage 2; in this case, the addresses will be unmapped from stage
> 2 via the MMU notifiers when the process' address space changes.
>
> For now, the capability doesn't actually do anything other than checking
> that the usage is correct; the memory operations will be added in future
> patches.
>
> Signed-off-by: Alexandru Elisei <alexandru.elisei at arm.com>
> ---
>  Documentation/virt/kvm/api.rst   | 57 ++++++++++++++++++++++++++
>  arch/arm64/include/asm/kvm_mmu.h |  3 ++
>  arch/arm64/kvm/arm.c             | 42 ++++++++++++++++++--
>  arch/arm64/kvm/mmu.c             | 68 ++++++++++++++++++++++++++++++++
>  include/uapi/linux/kvm.h         |  8 ++++
>  5 files changed, 174 insertions(+), 4 deletions(-)
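
(Not part of the patch, just to make sure I follow the interface: I'd expect
a VMM to probe for the new capability on the VM fd before using it, roughly
like the sketch below. The helper name and vm_fd are placeholders of mine,
and I'm assuming a <linux/kvm.h> with this series applied.)

#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Sketch: non-zero if the VM supports locking/unlocking memslots. */
static int lock_memslot_supported(int vm_fd)
{
        /* KVM_CHECK_EXTENSION returns 0 when the capability is absent. */
        return ioctl(vm_fd, KVM_CHECK_EXTENSION,
                     KVM_CAP_ARM_LOCK_USER_MEMORY_REGION) > 0;
}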
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index aeeb071c7688..16aa59eae3d9 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6925,6 +6925,63 @@ indicated by the fd to the VM this is called on.
>  This is intended to support intra-host migration of VMs between userspace VMMs,
>  upgrading the VMM process without interrupting the guest.
>
> +7.30 KVM_CAP_ARM_LOCK_USER_MEMORY_REGION
> +----------------------------------------
> +
> +:Architectures: arm64
> +:Target: VM
> +:Parameters: flags is one of KVM_ARM_LOCK_USER_MEMORY_REGION_FLAGS_LOCK or
> +                     KVM_ARM_LOCK_USER_MEMORY_REGION_FLAGS_UNLOCK
> +             args[0] is the slot number
> +             args[1] specifies the permissions when the memslot is locked or if
> +                     all memslots should be unlocked
> +
> +The presence of this capability indicates that KVM supports locking the memory
> +associated with the memslot, and unlocking a previously locked memslot.
> +
> +The 'flags' parameter is defined as follows:
> +
> +7.30.1 KVM_ARM_LOCK_USER_MEMORY_REGION_FLAGS_LOCK
> +-------------------------------------------------
> +
> +:Capability: 'flags' parameter to KVM_CAP_ARM_LOCK_USER_MEMORY_REGION
> +:Architectures: arm64
> +:Target: VM
> +:Parameters: args[0] contains the memory slot number
> +             args[1] contains the permissions for the locked memory:
> +                     KVM_ARM_LOCK_MEMORY_READ (mandatory) to map it with
> +                     read permissions and KVM_ARM_LOCK_MEMORY_WRITE
> +                     (optional) with write permissions

Nit: Those flag names don't match the ones in the code.
(Their names in the code are KVM_ARM_LOCK_MEM_READ/KVM_ARM_LOCK_MEM_WRITE)

What is the reason that the KVM_ARM_LOCK_MEMORY_{READ,WRITE} flags need to
be specified even though the memslot already has similar flags?

> +:Returns: 0 on success; negative error code on failure
> +
> +Enabling this capability causes the memory described by the memslot to be
> +pinned in the process address space and the corresponding stage 2 IPA range
> +mapped at stage 2. The permissions specified in args[1] apply to both
> +mappings. The memory pinned with this capability counts towards the max
> +locked memory limit for the current process.
> +
> +The capability should be enabled when no VCPUs are in the kernel executing an
> +ioctl (and in particular, KVM_RUN); otherwise the ioctl will block until all
> +VCPUs have returned. The virtual memory range described by the memslot must be
> +mapped in the userspace process without any gaps. It is considered an error if
> +write permissions are specified for a memslot which logs dirty pages.
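
To check that I understand the flags, locking a slot with read and write
permissions from userspace would then look roughly like the sketch below
(my own hypothetical helper, using the definitions added later in this patch
and the usual <sys/ioctl.h>/<linux/kvm.h> includes):

static int lock_memslot(int vm_fd, __u64 slot)
{
        struct kvm_enable_cap cap = {
                .cap = KVM_CAP_ARM_LOCK_USER_MEMORY_REGION,
                .flags = KVM_ARM_LOCK_USER_MEMORY_REGION_FLAGS_LOCK,
                .args[0] = slot,
                /* READ is mandatory, WRITE is optional per the text above. */
                .args[1] = KVM_ARM_LOCK_MEM_READ | KVM_ARM_LOCK_MEM_WRITE,
        };

        /* Should be issued while no VCPU is in KVM_RUN, per the text above. */
        return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}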
> +
> +7.30.2 KVM_ARM_LOCK_USER_MEMORY_REGION_FLAGS_UNLOCK
> +---------------------------------------------------
> +
> +:Capability: 'flags' parameter to KVM_CAP_ARM_LOCK_USER_MEMORY_REGION
> +:Architectures: arm64
> +:Target: VM
> +:Parameters: args[0] contains the memory slot number
> +             args[1] optionally contains the flag KVM_ARM_UNLOCK_MEM_ALL,
> +                     which unlocks all previously locked memslots.
> +:Returns: 0 on success; negative error code on failure
> +
> +Enabling this capability causes the memory pinned when locking the memslot
> +specified in args[0] to be unpinned, or, optionally, all memslots to be
> +unlocked. The IPA range is not unmapped from stage 2.
> +>>>>>>> 56641eee289e (KVM: arm64: Add lock/unlock memslot user API)

Nit: An unnecessary line.

If a memslot with read/write permissions is locked read-only and then
unlocked, can userspace expect the stage 2 mapping for the memslot to be
updated to read/write?
Can userspace delete a memslot that is locked (without unlocking it)?
If so, can userspace expect the corresponding range to be implicitly
unlocked?
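
To make the first scenario above concrete, I mean something like this
(hypothetical snippet, reusing the definitions from this patch; the memslot
backing 'slot' was created with read/write permissions):

static void lock_then_unlock(int vm_fd, __u64 slot)
{
        struct kvm_enable_cap cap = {
                .cap = KVM_CAP_ARM_LOCK_USER_MEMORY_REGION,
                .flags = KVM_ARM_LOCK_USER_MEMORY_REGION_FLAGS_LOCK,
                .args[0] = slot,
                .args[1] = KVM_ARM_LOCK_MEM_READ,    /* locked read-only */
        };

        ioctl(vm_fd, KVM_ENABLE_CAP, &cap);

        /* ... the guest runs for a while ... */

        cap.flags = KVM_ARM_LOCK_USER_MEMORY_REGION_FLAGS_UNLOCK;
        cap.args[1] = 0;    /* or KVM_ARM_UNLOCK_MEM_ALL to unlock all memslots */
        ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}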

Thanks,
Reiji

> +
>  8. Other capabilities.
>  ======================
>
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 02d378887743..2c50734f048d 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -216,6 +216,9 @@ static inline void __invalidate_icache_guest_page(void *va, size_t size)
>  void kvm_set_way_flush(struct kvm_vcpu *vcpu);
>  void kvm_toggle_cache(struct kvm_vcpu *vcpu, bool was_enabled);
>
> +int kvm_mmu_lock_memslot(struct kvm *kvm, u64 slot, u64 flags);
> +int kvm_mmu_unlock_memslot(struct kvm *kvm, u64 slot, u64 flags);
> +
>  static inline unsigned int kvm_get_vmid_bits(void)
>  {
>         int reg = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index e9b4ad7b5c82..d49905d18cee 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -78,16 +78,43 @@ int kvm_arch_check_processor_compat(void *opaque)
>         return 0;
>  }
>
> +static int kvm_arm_lock_memslot_supported(void)
> +{
> +       return 0;
> +}
> +
> +static int kvm_lock_user_memory_region_ioctl(struct kvm *kvm,
> +                                            struct kvm_enable_cap *cap)
> +{
> +       u64 slot, action_flags;
> +       u32 action;
> +
> +       if (cap->args[2] || cap->args[3])
> +               return -EINVAL;
> +
> +       slot = cap->args[0];
> +       action = cap->flags;
> +       action_flags = cap->args[1];
> +
> +       switch (action) {
> +       case KVM_ARM_LOCK_USER_MEMORY_REGION_FLAGS_LOCK:
> +               return kvm_mmu_lock_memslot(kvm, slot, action_flags);
> +       case KVM_ARM_LOCK_USER_MEMORY_REGION_FLAGS_UNLOCK:
> +               return kvm_mmu_unlock_memslot(kvm, slot, action_flags);
> +       default:
> +               return -EINVAL;
> +       }
> +}
> +
>  int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>                             struct kvm_enable_cap *cap)
>  {
>         int r;
>
> -       if (cap->flags)
> -               return -EINVAL;
> -
>         switch (cap->cap) {
>         case KVM_CAP_ARM_NISV_TO_USER:
> +               if (cap->flags)
> +                       return -EINVAL;
>                 r = 0;
>                 kvm->arch.return_nisv_io_abort_to_user = true;
>                 break;
> @@ -101,6 +128,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>                 }
>                 mutex_unlock(&kvm->lock);
>                 break;
> +       case KVM_CAP_ARM_LOCK_USER_MEMORY_REGION:
> +               if (!kvm_arm_lock_memslot_supported())
> +                       return -EINVAL;
> +               r = kvm_lock_user_memory_region_ioctl(kvm, cap);
> +               break;
>         default:
>                 r = -EINVAL;
>                 break;
> @@ -168,7 +200,6 @@ vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
>         return VM_FAULT_SIGBUS;
>  }
>
> -
>  /**
>   * kvm_arch_destroy_vm - destroy the VM data structure
>   * @kvm:       pointer to the KVM struct
> @@ -276,6 +307,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>         case KVM_CAP_ARM_PTRAUTH_GENERIC:
>                 r = system_has_full_ptr_auth();
>                 break;
> +       case KVM_CAP_ARM_LOCK_USER_MEMORY_REGION:
> +               r = kvm_arm_lock_memslot_supported();
> +               break;
>         default:
>                 r = 0;
>         }
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 326cdfec74a1..f65bcbc9ae69 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1296,6 +1296,74 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>         return ret;
>  }
>
> +int kvm_mmu_lock_memslot(struct kvm *kvm, u64 slot, u64 flags)
> +{
> +       struct kvm_memory_slot *memslot;
> +       int ret;
> +
> +       if (slot >= KVM_MEM_SLOTS_NUM)
> +               return -EINVAL;
> +
> +       if (!(flags & KVM_ARM_LOCK_MEM_READ))
> +               return -EINVAL;
> +
> +       mutex_lock(&kvm->lock);
> +       if (!kvm_lock_all_vcpus(kvm)) {
> +               ret = -EBUSY;
> +               goto out_unlock_kvm;
> +       }
> +       mutex_lock(&kvm->slots_lock);
> +
> +       memslot = id_to_memslot(kvm_memslots(kvm), slot);
> +       if (!memslot) {
> +               ret = -EINVAL;
> +               goto out_unlock_slots;
> +       }
> +       if ((flags & KVM_ARM_LOCK_MEM_WRITE) &&
> +           ((memslot->flags & KVM_MEM_READONLY) || memslot->dirty_bitmap)) {
> +               ret = -EPERM;
> +               goto out_unlock_slots;
> +       }
> +
> +       ret = -EINVAL;
> +
> +out_unlock_slots:
> +       mutex_unlock(&kvm->slots_lock);
> +       kvm_unlock_all_vcpus(kvm);
> +out_unlock_kvm:
> +       mutex_unlock(&kvm->lock);
> +       return ret;
> +}
> +
> +int kvm_mmu_unlock_memslot(struct kvm *kvm, u64 slot, u64 flags)
> +{
> +       bool unlock_all = flags & KVM_ARM_UNLOCK_MEM_ALL;
> +       struct kvm_memory_slot *memslot;
> +       int ret;
> +
> +       if (!unlock_all && slot >= KVM_MEM_SLOTS_NUM)
> +               return -EINVAL;
> +
> +       mutex_lock(&kvm->slots_lock);
> +
> +       if (unlock_all) {
> +               ret = -EINVAL;
> +               goto out_unlock_slots;
> +       }
> +
> +       memslot = id_to_memslot(kvm_memslots(kvm), slot);
> +       if (!memslot) {
> +               ret = -EINVAL;
> +               goto out_unlock_slots;
> +       }
> +
> +       ret = -EINVAL;
> +
> +out_unlock_slots:
> +       mutex_unlock(&kvm->slots_lock);
> +       return ret;
> +}
> +
>  bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
>  {
>         if (!kvm->arch.mmu.pgt)
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 1daa45268de2..70c969967557 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1131,6 +1131,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>  #define KVM_CAP_ARM_MTE 205
>  #define KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM 206
> +#define KVM_CAP_ARM_LOCK_USER_MEMORY_REGION 207
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>
> @@ -1483,6 +1484,13 @@ struct kvm_s390_ucas_mapping {
>  #define KVM_PPC_SVM_OFF                  _IO(KVMIO,  0xb3)
>  #define KVM_ARM_MTE_COPY_TAGS    _IOR(KVMIO,  0xb4, struct kvm_arm_copy_mte_tags)
>
> +/* Used by KVM_CAP_ARM_LOCK_USER_MEMORY_REGION */
> +#define KVM_ARM_LOCK_USER_MEMORY_REGION_FLAGS_LOCK     (1 << 0)
> +#define   KVM_ARM_LOCK_MEM_READ                                (1 << 0)
> +#define   KVM_ARM_LOCK_MEM_WRITE                       (1 << 1)
> +#define KVM_ARM_LOCK_USER_MEMORY_REGION_FLAGS_UNLOCK   (1 << 1)
> +#define   KVM_ARM_UNLOCK_MEM_ALL                       (1 << 0)
> +
>  /* ioctl for vm fd */
>  #define KVM_CREATE_DEVICE        _IOWR(KVMIO,  0xe0, struct kvm_create_device)
>
> --
> 2.33.1
>
> _______________________________________________
> kvmarm mailing list
> kvmarm at lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


