[PATCH v7 4/5] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags
Catalin Marinas
catalin.marinas at arm.com
Wed Jun 18 09:34:16 PDT 2025
On Wed, Jun 18, 2025 at 06:55:40AM +0000, ankita at nvidia.com wrote:
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index a71b77df7c96..6a3955e07b5e 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1660,6 +1660,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>
> is_vma_cacheable = kvm_vma_is_cacheable(vma);
>
> + /* Reject COW VM_PFNMAP */
> + if ((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags))
> + return -EINVAL;
It may help to add a comment here explaining why this needs to be
rejected. I forgot the details but tracked it down to an email from
David a few months ago:
https://lore.kernel.org/all/a2d95399-62ad-46d3-9e48-6fa90fd2c2f3@redhat.com/
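For readers without the kernel tree at hand, the check above can be modelled in user space. This is only a sketch: the flag values are assumed to mirror include/linux/mm.h, is_cow_mapping() is reproduced as I understand the kernel helper to be defined, and rejects_cow_pfnmap() is a hypothetical name that exists only for this illustration.

```c
#include <stdbool.h>

/*
 * Assumed flag values, copied here so the sketch builds outside the
 * kernel. The authoritative definitions live in include/linux/mm.h.
 */
#define VM_SHARED	0x00000008UL
#define VM_MAYWRITE	0x00000020UL
#define VM_PFNMAP	0x00000400UL

typedef unsigned long vm_flags_t;

/*
 * A mapping is COW when it is private (VM_SHARED clear) but may become
 * writable (VM_MAYWRITE set): a write fault would then replace the
 * mapped page with a private copy.
 */
static inline bool is_cow_mapping(vm_flags_t flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/*
 * Hypothetical helper mirroring the condition added by the patch:
 * true when the fault handler would return -EINVAL for this VMA.
 */
static inline bool rejects_cow_pfnmap(vm_flags_t flags)
{
	return (flags & VM_PFNMAP) && is_cow_mapping(flags);
}
```

Under these assumptions, a private writable PFNMAP mapping (VM_PFNMAP | VM_MAYWRITE) is rejected, while the same mapping with VM_SHARED set, or a COW mapping without VM_PFNMAP, is not.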
> +
> /* Don't use the VMA after the unlock -- it may have vanished */
> vma = NULL;
>
> @@ -1684,9 +1688,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> return -EFAULT;
>
> if (!kvm_can_use_cmo_pfn(pfn)) {
> - if (is_vma_cacheable)
> - return -EINVAL;
> -
> /*
> * If the page was identified as device early by looking at
> * the VMA flags, vma_pagesize is already representing the
> @@ -1696,8 +1697,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> *
> * In both cases, we don't let transparent_hugepage_adjust()
> * change things at the last minute.
> + *
> + * Do not set device as the device memory is cacheable. Note
> + * that such mapping is safe as the KVM S2 will have the same
> + * Normal memory type as the VMA has in the S1.
> */
> - disable_cmo = true;
> + if (!is_vma_cacheable)
> + disable_cmo = true;
I'm tempted to stick to the 'device' variable name, or something like
s2_noncacheable. As I commented, it's not just about disabling CMOs.
> } else if (logging_active && !write_fault) {
> /*
> * Only actually map the page as writable if this was a write
> @@ -1784,6 +1790,19 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> prot |= KVM_PGTABLE_PROT_X;
> }
>
> + /*
> + * When FWB is unsupported KVM needs to do cache flushes
> + * (via dcache_clean_inval_poc()) of the underlying memory. This is
> + * only possible if the memory is already mapped into the kernel map.
> + *
> + * Outright reject as the cacheable device memory is not present in
> + * the kernel map and not suitable for cache management.
> + */
> + if (is_vma_cacheable && !kvm_arch_supports_cacheable_pfnmap()) {
> + ret = -EINVAL;
> + goto out_unlock;
> + }
I'm missing the full context around this hunk but, judging by
indentation, does it also reject any cacheable vma even if it is not
PFNMAP on pre-FWB hardware?
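To make the question concrete, here is a small user-space model of the condition. It is a sketch under stated assumptions, not the kernel code: struct fault_model and both helper functions are hypothetical, and whether is_vma_cacheable is actually true for ordinary RAM-backed VMAs depends on kvm_vma_is_cacheable(), which is not visible in this hunk.

```c
#include <stdbool.h>

/* Hypothetical model of the inputs to the check in the hunk above. */
struct fault_model {
	bool vma_cacheable;	/* stands in for is_vma_cacheable */
	bool vma_pfnmap;	/* stands in for vma->vm_flags & VM_PFNMAP */
	bool arch_supports;	/* stands in for kvm_arch_supports_cacheable_pfnmap() */
};

/* The condition as posted: VM_PFNMAP is not consulted at all. */
static inline bool rejects_as_posted(const struct fault_model *f)
{
	return f->vma_cacheable && !f->arch_supports;
}

/*
 * One possible narrowing (an assumption, not something the patch
 * does): only reject cacheable VMAs that are also VM_PFNMAP.
 */
static inline bool rejects_pfnmap_only(const struct fault_model *f)
{
	return f->vma_pfnmap && f->vma_cacheable && !f->arch_supports;
}
```

In this model a cacheable, non-PFNMAP VMA on hardware without the required support is rejected by the first predicate but not by the second, which is the case the question is about.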
--
Catalin