[PATCH v10 02/15] set_memory: add folio_{zap,restore}_direct_map helpers

Nikita Kalyazin kalyazin at amazon.com
Fri Mar 6 06:48:57 PST 2026



On 06/03/2026 14:17, David Hildenbrand (Arm) wrote:
> On 3/6/26 13:48, Nikita Kalyazin wrote:
>>
>>
>> On 05/03/2026 17:34, David Hildenbrand (Arm) wrote:
>>> On 1/26/26 17:47, Kalyazin, Nikita wrote:
>>>> From: Nikita Kalyazin <kalyazin at amazon.com>
>>>>
>>>> These allow guest_memfd to remove its memory from the direct map.
>>>> Only implement them for architectures that have direct map.
>>>> In folio_zap_direct_map(), flush TLB on architectures where
>>>> set_direct_map_valid_noflush() does not flush it internally.
>>>
>>> "Let's provide folio_{zap,restore}_direct_map helpers as preparation for
>>> supporting removal of the direct map for guest_memfd folios. ...
>>
>> Will update, thanks.
>>
>>>
>>>>
>>>> The new helpers need to be accessible to KVM on architectures that
>>>> support guest_memfd (x86 and arm64).  Since arm64 does not support
>>>> building KVM as a module, only export them on x86.
>>>>
>>>> Direct map removal gives guest_memfd the same protection that
>>>> memfd_secret does, such as hardening against Spectre-like attacks
>>>> through in-kernel gadgets.
>>>
>>> Would it be possible to convert mm/secretmem.c as well?
>>>
>>> There, we use
>>>
>>>           set_direct_map_invalid_noflush(folio_page(folio, 0));
>>>
>>> and
>>>
>>>           set_direct_map_default_noflush(folio_page(folio, 0));
>>>
>>> Which is a bit different to below code. At least looking at the x86
>>> variants, I wonder why we don't simply use
>>> set_direct_map_valid_noflush().
>>>
>>>
>>> If so, can you add a patch to do the conversion, pleeeeassse ? :)
>>
>> Absolutely!
>>
>>>
>>>>
>>>> Reviewed-by: Ackerley Tng <ackerleytng at google.com>
>>>> Signed-off-by: Nikita Kalyazin <kalyazin at amazon.com>
>>>> ---
>>>>    arch/arm64/include/asm/set_memory.h     |  2 ++
>>>>    arch/arm64/mm/pageattr.c                | 12 ++++++++++++
>>>>    arch/loongarch/include/asm/set_memory.h |  2 ++
>>>>    arch/loongarch/mm/pageattr.c            | 12 ++++++++++++
>>>>    arch/riscv/include/asm/set_memory.h     |  2 ++
>>>>    arch/riscv/mm/pageattr.c                | 12 ++++++++++++
>>>>    arch/s390/include/asm/set_memory.h      |  2 ++
>>>>    arch/s390/mm/pageattr.c                 | 12 ++++++++++++
>>>>    arch/x86/include/asm/set_memory.h       |  2 ++
>>>>    arch/x86/mm/pat/set_memory.c            | 20 ++++++++++++++++++++
>>>>    include/linux/set_memory.h              | 10 ++++++++++
>>>>    11 files changed, 88 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/include/asm/set_memory.h b/arch/arm64/
>>>> include/asm/set_memory.h
>>>> index c71a2a6812c4..49fd54f3c265 100644
>>>> --- a/arch/arm64/include/asm/set_memory.h
>>>> +++ b/arch/arm64/include/asm/set_memory.h
>>>> @@ -15,6 +15,8 @@ int set_direct_map_invalid_noflush(const void *addr);
>>>>    int set_direct_map_default_noflush(const void *addr);
>>>>    int set_direct_map_valid_noflush(const void *addr, unsigned long
>>>> numpages,
>>>>                                  bool valid);
>>>> +int folio_zap_direct_map(struct folio *folio);
>>>> +int folio_restore_direct_map(struct folio *folio);
>>>>    bool kernel_page_present(struct page *page);
>>>>
>>>>    int set_memory_encrypted(unsigned long addr, int numpages);
>>>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>>>> index e2bdc3c1f992..0b88b0344499 100644
>>>> --- a/arch/arm64/mm/pageattr.c
>>>> +++ b/arch/arm64/mm/pageattr.c
>>>> @@ -356,6 +356,18 @@ int set_direct_map_valid_noflush(const void
>>>> *addr, unsigned long numpages,
>>>>         return set_memory_valid((unsigned long)addr, numpages, valid);
>>>>    }
>>>>
>>>> +int folio_zap_direct_map(struct folio *folio)
>>>> +{
>>>> +     return set_direct_map_valid_noflush(folio_address(folio),
>>>> +                                         folio_nr_pages(folio), false);
>>>> +}
>>>> +
>>>> +int folio_restore_direct_map(struct folio *folio)
>>>> +{
>>>> +     return set_direct_map_valid_noflush(folio_address(folio),
>>>> +                                         folio_nr_pages(folio), true);
>>>> +}
>>>
>>> Is there a good reason why we cannot have two generic inline functions
>>> that simply call set_direct_map_valid_noflush() ?
>>>
>>> Is it because of some flushing behavior? (which we could figure out)
>>
>> Yes, on x86 we need an explicit flush.  Other architectures deal with it
>> internally.
> 
> So, we call a _noflush function and it performs a ... flush. What.

Yeah, unfortunately that's the status quo, as pointed out by Aneesh [1].

[1] https://lore.kernel.org/kvm/yq5ajz07czvz.fsf@kernel.org/

> 
> Take a look at secretmem_fault(), where we do an unconditional
> flush_tlb_kernel_range().
> 
> Do we end up double-flushing in that case?

Yes, it looks like it.  I'll remove the explicit flush and rely on 
folio_zap_direct_map() instead.
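
For reference, the change in secretmem_fault() would then boil down to 
something like this (untested sketch; the surrounding code and label are 
paraphrased from memory, not copied from mm/secretmem.c):

```c
	/*
	 * Untested sketch. folio_zap_direct_map() flushes the TLB on
	 * architectures where set_direct_map_valid_noflush() does not,
	 * so the explicit flush_tlb_kernel_range() becomes redundant.
	 */
	err = folio_zap_direct_map(folio);
	if (err)
		goto err_put_folio;	/* label name paraphrased */
	/*
	 * Previously:
	 *	set_direct_map_invalid_noflush(folio_page(folio, 0));
	 *	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
	 */
```

This also takes care of the double flush you pointed out.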

> 
>> Do you propose a bespoke implementation for x86 and a
>> "generic" one for others?
> 
> We have to find a way to have a single set of functions for all archs
> that support directmap removal.

I believe Dave meant to address that with 
folio_{zap,restore}_direct_map() [2].

[2] https://lore.kernel.org/kvm/9409531b-589b-4a54-b122-06a3cf0846f3@intel.com/
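
For context, the x86 variant in this patch differs from the arm64 one 
quoted above mainly in the flush.  Roughly this shape (a sketch of the 
idea, not the actual patch hunk):

```c
/*
 * Sketch only. On x86, set_direct_map_valid_noflush() really does no
 * flush, so the helper has to flush the TLB itself after zapping.
 */
int folio_zap_direct_map(struct folio *folio)
{
	unsigned long addr = (unsigned long)folio_address(folio);
	int ret = set_direct_map_valid_noflush(folio_address(folio),
					       folio_nr_pages(folio), false);

	if (!ret)
		flush_tlb_kernel_range(addr, addr + folio_size(folio));
	return ret;
}
```

That asymmetry is exactly why the helpers are per-arch rather than two 
generic inlines.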

> 
> One option might be to have some indication from the architecture that
> no flush_tlb_kernel_range() is required.
> 
> Could be a config option or some simple helper function.

I'd like to hear what the arch maintainers think, because I don't have a 
strong opinion on this myself.
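
For the sake of discussion, the config-option variant could look roughly 
like this (all names made up for illustration; the config symbol does 
not exist):

```c
/*
 * Hypothetical sketch in include/linux/set_memory.h. An architecture
 * whose set_direct_map_valid_noflush() already flushes internally
 * would select the (made-up) CONFIG_ARCH_DIRECT_MAP_SELF_FLUSHES.
 */
static inline int folio_zap_direct_map(struct folio *folio)
{
	unsigned long addr = (unsigned long)folio_address(folio);
	int ret = set_direct_map_valid_noflush(folio_address(folio),
					       folio_nr_pages(folio), false);

	if (!IS_ENABLED(CONFIG_ARCH_DIRECT_MAP_SELF_FLUSHES) && !ret)
		flush_tlb_kernel_range(addr, addr + folio_size(folio));
	return ret;
}
```

That would let a single generic inline replace the per-arch copies, at 
the cost of one more arch Kconfig knob.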

> 
> --
> Cheers,
> 
> David
