[PATCH v1 1/2] arm64/mm: Move PTE_PROT_NONE and PMD_PRESENT_INVALID

Ryan Roberts ryan.roberts at arm.com
Thu Apr 25 03:29:07 PDT 2024


On 25/04/2024 10:16, David Hildenbrand wrote:
> On 24.04.24 13:10, Ryan Roberts wrote:
>> Previously PTE_PROT_NONE was occupying bit 58, one of the bits reserved
>> for SW use when the PTE is valid. This is a waste of those precious SW
>> bits since PTE_PROT_NONE can only ever be set when valid is clear.
>> Instead let's overlay it on what would be a HW bit if valid was set.
>>
>> We need to be careful about which HW bit to choose since some of them
>> must be preserved; when pte_present() is true (as it is for a
>> PTE_PROT_NONE pte), it is legitimate for the core to call various
>> accessors, e.g. pte_dirty(), pte_write() etc. There are also some
>> accessors that are private to the arch which must continue to be
>> honoured, e.g. pte_user(), pte_user_exec() etc.
>>
>> So we choose to overlay PTE_UXN; This effectively means that whenever a
>> pte has PTE_PROT_NONE set, it will always report pte_user_exec() ==
>> false, which is obviously always correct.
>>
>> As a result of this change, we must shuffle the layout of the
>> arch-specific swap pte so that PTE_PROT_NONE is always zero and not
>> overlapping with any other field. As a result of this, there is no way
>> to keep the `type` field contiguous without conflicting with
>> PMD_PRESENT_INVALID (bit 59), which must also be 0 for a swap pte. So
>> let's move PMD_PRESENT_INVALID to bit 60.
> 
> A note that some archs split/re-combine type and/or offset, to make use of every
> bit possible :) But that's mostly relevant for 32bit.
> 
> (and as long as PFNs can still fit into the swp offset for migration entries etc.)

Yeah, I considered splitting the type or offset field to avoid moving
PMD_PRESENT_INVALID, but thought it was better to avoid the extra mask and shift.

> 
>>
>> In the end, this frees up bit 58 for future use as a proper SW bit (e.g.
>> soft-dirty or uffd-wp).
> 
> I was briefly confused about how you would use these bits as SW bits for swap
> PTEs (which you can't as they overlay the type). See below regarding bit 3.
> 
> I would have said here "proper SW bit for present PTEs".

Yes; I'll clarify in the next version.

> 
>>
>> Signed-off-by: Ryan Roberts <ryan.roberts at arm.com>
>> ---
>>   arch/arm64/include/asm/pgtable-prot.h |  4 ++--
>>   arch/arm64/include/asm/pgtable.h      | 16 +++++++++-------
>>   2 files changed, 11 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable-prot.h
>> b/arch/arm64/include/asm/pgtable-prot.h
>> index dd9ee67d1d87..ef952d69fd04 100644
>> --- a/arch/arm64/include/asm/pgtable-prot.h
>> +++ b/arch/arm64/include/asm/pgtable-prot.h
>> @@ -18,14 +18,14 @@
>>   #define PTE_DIRTY        (_AT(pteval_t, 1) << 55)
>>   #define PTE_SPECIAL        (_AT(pteval_t, 1) << 56)
>>   #define PTE_DEVMAP        (_AT(pteval_t, 1) << 57)
>> -#define PTE_PROT_NONE        (_AT(pteval_t, 1) << 58) /* only when !PTE_VALID */
>> +#define PTE_PROT_NONE        (PTE_UXN)         /* Reuse PTE_UXN; only when
>> !PTE_VALID */
>>     /*
>>    * This bit indicates that the entry is present i.e. pmd_page()
>>    * still points to a valid huge page in memory even if the pmd
>>    * has been invalidated.
>>    */
>> -#define PMD_PRESENT_INVALID    (_AT(pteval_t, 1) << 59) /* only when
>> !PMD_SECT_VALID */
>> +#define PMD_PRESENT_INVALID    (_AT(pteval_t, 1) << 60) /* only when
>> !PMD_SECT_VALID */
>>     #define _PROT_DEFAULT        (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
>>   #define _PROT_SECT_DEFAULT    (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index afdd56d26ad7..23aabff4fa6f 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -1248,20 +1248,22 @@ static inline pmd_t pmdp_establish(struct
>> vm_area_struct *vma,
>>    * Encode and decode a swap entry:
>>    *    bits 0-1:    present (must be zero)
>>    *    bits 2:        remember PG_anon_exclusive
>> - *    bits 3-7:    swap type
>> - *    bits 8-57:    swap offset
>> - *    bit  58:    PTE_PROT_NONE (must be zero)
> 
> Reading this patch alone: what happened to bit 3? Please mention that that it
> will be used as a swap pte metadata bit (uffd-wp).

Will do. It's all a bit arbitrary though. I could have put offset in 3-52, and
then 53 would have been spare for uffd-wp. I'm not sure there is any advantage
to either option.

> 
>> + *    bits 4-53:    swap offset
> 
> So we'll still have 50bit for the offset, good. We could even use 61-63 if ever
> required to store bigger PFNs.

yep, or more sw bits.

> 
> LGTM
> 
> Reviewed-by: David Hildenbrand <david at redhat.com>

Thanks!





More information about the linux-arm-kernel mailing list