[PATCH v2 0/1] KVM: arm64: Map GPU memory with no struct pages
Donald Dutile
ddutile at redhat.com
Tue Nov 26 09:10:24 PST 2024
My email client says this patch: [PATCH v2 1/1] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags
is part of the thread for this titled PATCH. Is it?
The description has similarities to the description above, but with some additions and some omissions.
So, could you clean these two up into (a) a series, or (b) single, separate PATCHes?
Thanks.
- Don
On 11/18/24 8:19 AM, ankita at nvidia.com wrote:
> From: Ankit Agrawal <ankita at nvidia.com>
>
> Grace based platforms such as the Grace Hopper/Blackwell Superchips
> have CPU accessible, cache coherent GPU memory. The current KVM code
> prevents such memory from being mapped as Normal cacheable; this
> patch aims to enable that use case.
>
> Today KVM forces the memory to either NORMAL or DEVICE_nGnRE
> based on pfn_is_map_memory() and ignores the per-VMA flags that
> indicate the memory attributes. This means there is no way for
> a VM to get cacheable IO memory (like from a CXL or pre-CXL device).
> In both cases the memory will be forced to DEVICE_nGnRE and the
> VM's memory attributes will be ignored.
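
For readers following the thread: the behaviour described above is
roughly the following, a simplified sketch of user_mem_abort() in
arch/arm64/kvm/mmu.c; the exact code varies across kernel versions.

    /*
     * Sketch of the current logic: the stage 2 memory type is derived
     * solely from pfn_is_map_memory(); the VMA's pgprot is never
     * consulted.
     */
    static bool kvm_is_device_pfn(unsigned long pfn)
    {
            return !pfn_is_map_memory(pfn);
    }

    /* ... later, in user_mem_abort() ... */
    if (kvm_is_device_pfn(pfn))
            device = true;
    /* ... */
    if (device)
            prot |= KVM_PGTABLE_PROT_DEVICE; /* forced to DEVICE_nGnRE */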
>
> pfn_is_map_memory() is thus restrictive, allowing only memory
> that has been added to the kernel to be marked as cacheable.
> In most cases the code needs to know if there is a struct page, or
> if the memory is in the kernel map, and pfn_valid() is an appropriate
> API for this. Extend the umbrella with pfn_valid() so that memory
> with no struct pages can be considered for a cacheable stage 2
> mapping. A !pfn_valid() pfn implies that the memory is unsafe to map
> as cacheable.
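
For review purposes, the VMA-driven part could look something like the
below. The helper name is hypothetical; the MT index extraction follows
arm64's PTE_ATTRINDX encoding.

    /*
     * Hypothetical helper: does the VMA's pgprot ask for Normal
     * (cacheable) memory? Only then should a cacheable stage 2
     * mapping be considered.
     */
    static bool kvm_vma_is_cacheable(struct vm_area_struct *vma)
    {
            switch (FIELD_GET(PTE_ATTRINDX_MASK,
                              pgprot_val(vma->vm_page_prot))) {
            case MT_NORMAL:
            case MT_NORMAL_TAGGED:
                    return true;
            default:
                    return false;
            }
    }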
>
> Also take care of the following two cases that are unsafe to map
> as cacheable (roughly sketched below the list):
> 1. The VMA pgprot may have VM_IO set along with MT_NORMAL or
>    MT_NORMAL_TAGGED. Although unexpected and wrong, the presence of
>    such a configuration cannot be ruled out.
> 2. Configurations where VM_MTE_ALLOWED is not set and KVM_CAP_ARM_MTE
>    is enabled. Otherwise a malicious guest can enable MTE at stage 1
>    without the hypervisor being able to tell. This could cause
>    external aborts.
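
Concretely, the two rejections above might be structured like so. This
is a sketch, not the literal patch; "cacheable" stands for the result
of the VMA check sketched earlier.

    /*
     * Case 1: VM_IO combined with a Normal memory type is a bogus
     * configuration; do not honour the cacheable request.
     */
    if ((vma->vm_flags & VM_IO) && kvm_vma_is_cacheable(vma))
            cacheable = false;

    /*
     * Case 2: without VM_MTE_ALLOWED, a guest with KVM_CAP_ARM_MTE
     * enabled could turn on MTE at stage 1 behind the hypervisor's
     * back, risking external aborts on tag accesses.
     */
    if (kvm_has_mte(kvm) && !(vma->vm_flags & VM_MTE_ALLOWED))
            cacheable = false;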
>
> GPU memory such as that on Grace Hopper systems is interchangeable
> with DDR memory and retains its properties. Executable faults should
> thus be allowed on memory determined to be Normal cacheable.
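
In other words, the existing blanket refusal of exec faults on
non-kernel-map pfns would no longer apply once the memory has been
established as Normal cacheable; sketched:

    /*
     * Today exec faults on "device" pfns are refused outright:
     *         if (exec_fault && device)
     *                 return -ENOEXEC;
     * Sketch of the change: memory established as Normal cacheable
     * is not treated as device here, so exec faults on it proceed.
     */
    if (exec_fault && device && !cacheable)
            return -ENOEXEC;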
>
> Note that when FWB is not enabled, the kernel expects to trivially do
> cache management by flushing the memory: a kvm_pte is linearly
> converted to a phys_addr and then to a KVA, see
> kvm_flush_dcache_to_poc(). This is only possible for struct page
> backed memory. Do not allow non-struct page memory to be cacheable
> without FWB.
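
That guard might look roughly as follows; stage2_has_fwb() is the
existing FWB predicate in arch/arm64/kvm/hyp/pgtable.c, and the exact
placement is illustrative.

    /*
     * Without FWB, KVM must perform the cache maintenance itself by
     * converting the PTE to a phys_addr and then to a linear-map KVA
     * (kvm_flush_dcache_to_poc()). That only works when a struct page
     * exists, so refuse cacheable mappings of !pfn_valid() memory.
     */
    if (!stage2_has_fwb(pgt) && !pfn_valid(pfn))
            return -EINVAL; /* no KVA to flush for this memory */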
>
> The changes are heavily influenced by the insightful discussions between
> Catalin Marinas and Jason Gunthorpe [1] on v1. Many thanks for their
> valuable suggestions.
>
> Applied over next-20241117 and tested on the Grace Hopper and
> Grace Blackwell platforms by booting up a VM and running several CUDA
> workloads. This has not been tested on MTE-enabled hardware. If
> someone can give it a try, that would be very helpful.
>
> v1 -> v2
> 1. Removed kvm_is_device_pfn() as the determiner of device type
>    memory; use pfn_valid() instead.
> 2. Added handling for MTE.
> 3. Minor cleanup.
>
> Link: https://lore.kernel.org/lkml/20230907181459.18145-2-ankita@nvidia.com [1]
>
> Ankit Agrawal (1):
> KVM: arm64: Allow cacheable stage 2 mapping using VMA flags
>
> arch/arm64/include/asm/kvm_pgtable.h | 8 +++
> arch/arm64/kvm/hyp/pgtable.c | 2 +-
> arch/arm64/kvm/mmu.c | 101 +++++++++++++++++++++------
> 3 files changed, 87 insertions(+), 24 deletions(-)
>