[PATCH v8 36/43] arm64: mm: Add support for folding PUDs at runtime

Ryan Roberts ryan.roberts at arm.com
Wed Oct 2 02:08:26 PDT 2024


On 01/10/2024 07:23, Ard Biesheuvel wrote:
> On Mon, 30 Sept 2024 at 17:12, Ryan Roberts <ryan.roberts at arm.com> wrote:
>>
>> On 30/09/2024 15:53, Ard Biesheuvel wrote:
>>> On Mon, 30 Sept 2024 at 16:36, Ryan Roberts <ryan.roberts at arm.com> wrote:
>>>>
>>>> Hi Ard,
>>>>
>>>> On 14/02/2024 12:29, Ard Biesheuvel wrote:
>>>>> From: Ard Biesheuvel <ardb at kernel.org>
>>>>>
>>>>> In order to support LPA2 on 16k pages in a way that permits non-LPA2
>>>>> systems to run the same kernel image, we have to be able to fall back to
>>>>> at most 48 bits of virtual addressing.
>>>>>
>>>>> Falling back to 48 bits would result in a level 0 with only 2 entries,
>>>>> which is suboptimal in terms of TLB utilization. So instead, let's fall
>>>>> back to 47 bits in that case. This means we need to be able to fold PUDs
>>>>> dynamically, similar to how we fold P4Ds for 48 bit virtual addressing
>>>>> on LPA2 with 4k pages.
>>>>>
>>>>> Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
>>>>
>>>> [...]
>>>>
>>>>>
>>>>> +#define pud_index(addr)              (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
>>>>> +
>>>>> +static inline pud_t *p4d_to_folded_pud(p4d_t *p4dp, unsigned long addr)
>>>>> +{
>>>>> +     return (pud_t *)PTR_ALIGN_DOWN(p4dp, PAGE_SIZE) + pud_index(addr);
>>>>> +}
>>>>> +
>>>>
>>>> I wonder if you could explain what this function (and its equivalents at other
>>>> levels) is doing? Why isn't it just returning p4dp cast to a (pud_t *)?
>>>>
>>>
>>> Because the p4dp index is derived from a different part of the VA, so
>>> it points into the right page but at the wrong entry.
>>
>> OK, yeah I think that's sunk in. TBH, the folding stuff melts my brain. Thanks
>> for the quick response. So the code is definitely correct, and it's needed
>> because the PxD_SHIFT and PTRS_PER_PxD are "wrong" for the real geometry.
>>
> 
> Indeed. I never considered putting this folding behavior in
> p?d_index() though - perhaps that is a better place for it to begin
> with.

Yes perhaps. For now, I've left your code as is for the compile-time variant,
and for boot-time variant, p4d_to_folded_pud() just returns the cast pointer.

> 
>>>
>>>> I'm working on a prototype for boot-time page size selection. For this, I'm
>>>> compile-time enabling all levels, then run-time folding the ones I don't need,
>>>> based on the selected page size and VA size.
>>>>
>>>
>>> Nice!
>>
>> Certainly in principle. Hoping to get an RFC out during October.
>>
> 
> While you're at it, the Android folks will probably give you a medal
> if you can manage 16k pages in user space, with 4k for the kernel and
> for compat tasks.

Ha - that's the ambition I started with. I have a design that I believe solved
all the issues except one; how to present procfs information about a process to
a different process with a larger page size. I felt there were likely going to
be other ABI confusion edge case like that lurking.

Eventually (with Rutland's help :) ) I conceded it was just a huge amount of
work. And after talking with the Android guys, decided to park it. Perhaps
something for the future, if there are other valid use cases. Boot time page
size selection has value on its own, but is orthogonal to per-process page size.

> 
> But seriously, I'd be happy to compare notes about this - one thing I
> have been meaning to do is to reduce the number of configurations we
> support, by always using 52 bits in the kernel, and allowing some kind
> of runtime folding for user space to reduce the depth where it
> matters.

OK that's interesting. I can see some cross-over there. I think I'm on the home
straight for an RFC. So how about I get that posted then we can have a chat?

> 
>>>
>>>> I'm trying to reuse your run-time folding code, but I have a case where this
>>>> function is broken as written. Replacing with "return (pud_t *)p4dp;" resolves
>>>> the problem; If VA_BITS=48 and pagesize=64K, the pgd has 64 entries. p4dp is
>>>> pointing to the correct entry in the pgd already, but this code aligns back to
>>>> the start of the page, then adds pud_index(), which is wrong because
>>>> PTRS_PER_PUD != PTRS_PER_PGDIR. (In my case, these 2 macros are actually
>>>> boot-time selected values rather than compile-time constants).
>>>>
>>>> I think your code is probably correct and working around PTRS_PER_PXD being
>>>> compile-time constants for the non-folded case, but I can't quite convince myself.
>>>>
>>>
>>> Is your p4dp pointing to the correct descriptor because you changed
>>> the runtime behavior of p4d_index() perhaps?
>>
>> Yes; in my prototype, P4D_SHIFT and PTRS_PER_P4D are set at runtime and end up
>> the same as they would have been if pgtable-nop4d.h was included for
>> compile-time folding.
>>
> 
> Yeah that makes sense. Good luck! :-)




More information about the linux-arm-kernel mailing list