[PATCH v11 10/16] KVM: guest_memfd: Add flag to remove from direct map
Nikita Kalyazin
kalyazin at amazon.com
Fri Apr 10 08:30:36 PDT 2026
On 23/03/2026 21:15, Ackerley Tng wrote:
> "Kalyazin, Nikita" <kalyazin at amazon.co.uk> writes:
>
>>
>> [...snip...]
>>
>> static vm_fault_t kvm_gmem_fault_user_mapping(struct vm_fault *vmf)
>> {
>> struct inode *inode = file_inode(vmf->vma->vm_file);
>> struct folio *folio;
>> vm_fault_t ret = VM_FAULT_LOCKED;
>> + int err;
>>
>> if (((loff_t)vmf->pgoff << PAGE_SHIFT) >= i_size_read(inode))
>> return VM_FAULT_SIGBUS;
>> @@ -418,6 +454,14 @@ static vm_fault_t kvm_gmem_fault_user_mapping(struct vm_fault *vmf)
>> folio_mark_uptodate(folio);
>> }
>>
>> + if (kvm_gmem_no_direct_map(folio_inode(folio))) {
>> + err = kvm_gmem_folio_zap_direct_map(folio);
>> + if (err) {
>> + ret = vmf_error(err);
>> + goto out_folio;
>> + }
>> + }
>> +
>> vmf->page = folio_file_page(folio, vmf->pgoff);
>>
>
> Sashiko pointed out that kvm_gmem_populate() might try and write to
> direct-map-removed folios, but I think that's handled because populate
> will first try and GUP folios, which is already blocked for
> direct-map-removed folios.
As far as I can see, it is a valid issue, as populate only GUPs the
source pages, not gmem. I think this is similar to what was discussed
for TDX at some point, where the decision was to exclude TDX support
[1]. I followed the same path and excluded SEV-SNP in patch 8 [2]. I
kept your and David's "Reviewed-by:" for that patch, but please let me
know if this makes you change your minds.
[1] https://lore.kernel.org/kvm/aWpcDrGVLrZOqdcg@google.com
[2] https://lore.kernel.org/kvm/20260410151746.61150-9-kalyazin@amazon.com
>
>> out_folio:
>> @@ -528,6 +572,9 @@ static void kvm_gmem_free_folio(struct folio *folio)
>> kvm_pfn_t pfn = page_to_pfn(page);
>> int order = folio_order(folio);
>>
>> + if (kvm_gmem_folio_no_direct_map(folio))
>> + kvm_gmem_folio_restore_direct_map(folio);
>> +
>> kvm_arch_gmem_invalidate(pfn, pfn + (1ul << order));
>> }
>>
>
> Sashiko says to invalidate then restore direct map, I think in this case
> it doesn't matter since if the folio needed invalidation, it must be
> private, and the host shouldn't be writing to the private pages anyway.
>
> One benefit of retaining this order (restore, invalidate) is that it
> opens the invalidate hook to possibly do something regarding memory
> contents?
>
> Or perhaps we should just take the suggestion (invalidate, restore) and
> align that invalidate should not touch memory contents.
>
>> @@ -591,6 +638,9 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
>> /* Unmovable mappings are supposed to be marked unevictable as well. */
>> WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
>>
>> + if (flags & GUEST_MEMFD_FLAG_NO_DIRECT_MAP)
>> + mapping_set_no_direct_map(inode->i_mapping);
>> +
>> GMEM_I(inode)->flags = flags;
>>
>> file = alloc_file_pseudo(inode, kvm_gmem_mnt, name, O_RDWR, &kvm_gmem_fops);
>> @@ -803,13 +853,22 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
>> }
>>
>> r = kvm_gmem_prepare_folio(kvm, slot, gfn, folio);
>> + if (r)
>> + goto out_unlock;
>>
>> + if (kvm_gmem_no_direct_map(folio_inode(folio))) {
>> + r = kvm_gmem_folio_zap_direct_map(folio);
>> + if (r)
>> + goto out_unlock;
>> + }
>> +
>>
>> [...snip...]
>>
>
> Preparing a folio used to involve zeroing, but that has since been
> refactored out, so I believe zapping can come before preparing.
>
> Similar to the above point on invalidation: perhaps we should take the
> suggestion to zap then prepare
>
> + And align that preparation should not touch memory contents
> + Avoid needing to undo the preparation on zapping failure (.free_folio
> is not called on folio_put(), it is only called on folio removal from
> the filemap).
I reordered both, thanks.
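
To illustrate why the order matters, here is a minimal userspace model
of the two reordered paths (zap then prepare in the get_pfn path,
invalidate then restore in the free path). It is only a sketch and not
the code from the respin: struct fake_folio, struct trace, the zap_err
knob and the model_*() functions are all made up for illustration, and
the stubs do nothing except record the call order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only, not kernel code. The stubs stand in for the
 * helpers named in the patch and just record the order of calls.
 */
#define MAX_STEPS 8

enum step { STEP_ZAP, STEP_PREPARE, STEP_INVALIDATE, STEP_RESTORE };

struct trace {
	enum step steps[MAX_STEPS];
	size_t n;
};

struct fake_folio {
	bool no_direct_map;	/* models the NO_DIRECT_MAP flag on the inode */
	bool zapped;		/* models the folio being out of the direct map */
	bool prepared;		/* models arch preparation having run */
	int zap_err;		/* hypothetical knob: error to inject on zap */
};

static void record(struct trace *t, enum step s)
{
	assert(t->n < MAX_STEPS);
	t->steps[t->n++] = s;
}

/* Stand-in for kvm_gmem_folio_zap_direct_map(); may fail. */
static int zap_direct_map(struct fake_folio *f, struct trace *t)
{
	record(t, STEP_ZAP);
	if (f->zap_err)
		return f->zap_err;
	f->zapped = true;
	return 0;
}

/* Stand-in for kvm_gmem_prepare_folio(); assumed not to touch contents. */
static int prepare_folio(struct fake_folio *f, struct trace *t)
{
	record(t, STEP_PREPARE);
	f->prepared = true;
	return 0;
}

/* get_pfn path, zap first: a zap failure leaves no preparation to undo. */
static int model_get_pfn(struct fake_folio *f, struct trace *t)
{
	int r;

	if (f->no_direct_map) {
		r = zap_direct_map(f, t);
		if (r)
			return r;
	}
	return prepare_folio(f, t);
}

/* free path, invalidate first, then put the folio back in the direct map. */
static void model_free_folio(struct fake_folio *f, struct trace *t)
{
	record(t, STEP_INVALIDATE);
	if (f->zapped) {
		record(t, STEP_RESTORE);
		f->zapped = false;
	}
}
```

With zap_err set to a non-zero value, model_get_pfn() returns that error
after recording only STEP_ZAP, so the folio is never marked prepared and
there is nothing to roll back.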