[RFC PATCH] KVM: Ignore MMU notifiers for guest_memfd-only memslots

Wed Jun 17 06:41:41 PDT 2026

On 6/17/26 15:23, Alexandru Elisei wrote:
> Hi David,
> 
> On Mon, Jun 15, 2026 at 09:07:50PM +0200, David Hildenbrand wrote:
>> On 6/15/26 17:52, Alexandru Elisei wrote:
>>> For guest_memfd-only memslots (kvm_memslot_is_gmem_only() is true), the
>>> memory provider for the virtual machine is the guest_memfd file, not the
>>> userspace mapping. Faults are resolved using the guest_memfd page cache,
>>> and the permissions for the secondary MMU mapping depends exclusively on
>>> the memslot (i.e, if the memslot is read-only). How userspace happens to
>>> have the memory mmaped at fault time, or even if the memory is mapped at
>>> all into userspace, is not taken into consideration.
>>>
>>> guest_memfd memory is not evictable, is not movable and there's no backing
>>> storage. Once memory is allocated for an offset in guest_memfd file, the
>>> offset will not change, and that memory is not freed unless userspace
>>> explicitly punches a hole in the file. As a result, memory reclaim, page
>>> migration, page aging and dirty page tracking for the userspace mapping
>>> serve little purpose.
>>
>> I don't think any of that is relevant for the patch at hand?
>>
>> The thing is: invalidation (truncation, later migration, for any other reason)
>> is driven through guest_memfd notifications, not through unrelated page tables.
>>
>> If we don't lookup pages for the KVM MMU through the page table, then there is
>> also no need for MMU notifiers. It's all guest_memfd only.
>>
>> Or am I missing something?
> 
> My thinking was that, because guest_memfd is not evictable, there is no need to
> do page ageing, which would require that secondary MMU mappings be made old.

Not really.

The KVM MMU did not obtain the folios through the page tables, but directly
through guest_memfd. Any aging would, therefore, have to be done through
guest_memfd.

Which we don't support and don't want to support :)

That we happen to have a matching user space range that maps the guest_memfd is
just coincidence from a KVM MMU point of view.

> 
> The invalidate callbacks are also used when userspace memory is marked read-only
> for dirty state tracking. I was trying to explaing that, since there is no
> backing for the guest_memfd file, host doesn't need to keep track of dirty state
> for the memory, and ignoring the invalidate callbacks is correct for all cases.
> 
> I can drop the paragraph entirely, if you think that would make the commit
> message clearer.

I think the real motivation is:

"Mappings in the secondary MMU were established by obtaining folios from
guest_memfd directly, not by looking the folios up through the page tables
through GUP. Consequently, there is no relationship between the page tables and
the secondary MMU: MMU notifiers do not apply."

-- 
Cheers,

David