[PATCH v2 00/19] arm64: Enable LPA2 support for 4k and 16k pages

Ard Biesheuvel ardb at kernel.org
Fri Nov 25 01:35:48 PST 2022


On Fri, 25 Nov 2022 at 10:23, Ryan Roberts <ryan.roberts at arm.com> wrote:
>
> On 24/11/2022 17:14, Ard Biesheuvel wrote:
> > On Thu, 24 Nov 2022 at 15:39, Ryan Roberts <ryan.roberts at arm.com> wrote:
> >>
> >> Hi Ard,
> >>
> >> Thanks for including me on this. I'll plan to do a review over the next week or
> >> so, but in the meantime, I have a couple of general questions/comments:
> >>
> >> On 24/11/2022 12:39, Ard Biesheuvel wrote:
> >>> Enable support for LPA2 when running with 4k or 16k pages. In the former
> >>> case, this requires 5 level paging with a runtime fallback to 4 on
> >>> non-LPA2 hardware. For consistency, the same approach is adopted for 16k
> >>> pages, where we fall back to 3 level paging (47 bit virtual addressing)
> >>> on non-LPA2 configurations.
> >>
> >> It seems odd to me that on a non-LPA2 system, a kernel compiled for 16KB
> >> pages and 48 VA bits gives you 48 VA bits, but a kernel compiled for 16KB
> >> pages and 52 VA bits gives you only 47 VA bits. Wouldn't that pose a
> >> potential user space compat issue?
> >>
> >
> > Well, given that Android happily runs with 39-bit VAs to avoid 4 level
> > paging at all cost, I don't think that is a universal concern.
>
> Well, presumably the Android kernel is always explicitly compiled for 39 VA bits,
> so that's what user space is used to? I was really just making the point that in
> the (admittedly exotic and unlikely) case where you have a 16KB kernel
> previously compiled for 48 VA bits, and you "upgrade" it to 52 VA bits now that
> the option is available, on HW without LPA2 this will actually be observed as a
> "downgrade" to 47 bits. If you previously wanted to limit to 3 levels of lookup
> with 16KB, you would already have been compiling for 47 VA bits.
>

I am not debating that. I'm just saying that, without any hardware in
existence, it is difficult to predict which of these concerns is going
to dominate, and so I opted for the least messy and most symmetrical
approach.
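
For context, the rough 16k granule arithmetic behind that choice: each
table level resolves 11 bits on top of the 14-bit page offset, so

    3 levels:                     14 + 3 * 11 = 47 VA bits
    4 levels, 48-bit VA:          the added level 0 resolves a single
                                  VA bit, i.e. a 2-entry table
    4 levels, 52-bit VA (LPA2):   level 0 resolves 52 - 47 = 5 bits,
                                  i.e. a 32-entry / 256-byte table

So the realistic choice is between a 47-bit/3-level and a 52-bit/4-level
configuration, with the 48-bit/4-level variant sitting somewhat awkwardly
in between.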

> >
> > The benefit of this approach is that you can decide at runtime whether
> > you want to take the performance hit of 4 (or 5) level paging to get
> > access to the extended VA space.
> >
> >>> (Falling back to 48 bits would involve
> >>> finding a workaround for the fact that we cannot construct a level 0
> >>> table covering 52 bits of VA space that is both aligned to its size in
> >>> memory and has the top 2 entries (the ones representing the 48-bit
> >>> region) starting at a 64-byte aligned address, which is what the
> >>> architecture requires for TTBR address values.
> >>
> >> I'm not sure I've understood this. The level 0 table would need 32 entries for
> >> 52 VA bits so the table size is 256 bytes, naturally aligned to 256 bytes. 64 is
> >> a factor of 256 so surely the top 2 entries are guaranteed to also meet the
> >> constraint for the fallback path too?
> >>
> >
> > The top 2 entries are 16 bytes combined, and end on a 256 byte aligned
> > boundary so I don't see how they can start on a 64 byte aligned
> > boundary at the same time.
>
> I'm still not following; why would the 2 entry/16 byte table *end* on a 256 byte
> boundary? I guess I should go and read your patch before making assumptions, but
> my assumption from your description here was that you were optimistically
> allocating a 32 entry/256 byte table for the 52 VA bit case, then needing to
> reuse that table for the 2 entry/16 byte case if HW turns out not to support
> LPA2. In which case, surely the 2 entry table would be overlaid at the start
> (low address) of the allocated 32 entry table, and therefore its alignment is
> 256 bytes, which meets the HW's 64 byte alignment requirement?
>

No, it's at the end, that is the point. I am specifically referring to
TTBR1 upper region page tables here.

Please refer to the existing offset_ttbr1 asm macro, which implements
this today for 64k pages + LVA. In the 64k case, however, the condensed
table covers 6 bits of translation, so it is naturally aligned to the
TTBR minimum alignment.
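
To spell out the 16k arithmetic:

    level 0 table for 52-bit VA:  32 entries x 8 bytes = 256 bytes,
                                  naturally aligned to its size
    48-bit TTBR1 upper region:    covered by the top 2 entries only,
                                  i.e. entries 30 and 31, at byte
                                  offset 30 * 8 = 240 within the table

240 is not a multiple of 64, so a TTBR1 value pointing at those two
entries cannot meet the architectural 64-byte minimum alignment. In the
64k + LVA case, by contrast, the condensed table is 64 entries (512
bytes) starting at byte offset 960 * 8 = 7680, which is 512-byte
aligned, so the macro only has to add a fixed offset to TTBR1.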

> >
> > My RFC had a workaround for this, but it is a bit nasty because we
> > need to copy those two entries at the right time and keep them in
> > sync.
> >
> >>> Also, using an additional level of
> >>> paging to translate a single VA bit is wasteful in terms of TLB
> >>> efficiency)
> >>>
> >>> This means support for falling back to 3 levels of paging at runtime
> >>> when configured for 4 is also needed.
> >>>
> >>> Another thing worth noting is that the repurposed physical address bits
> >>> in the page table descriptors were not RES0 before, and so there is now
> >>> a big global switch (called TCR.DS) which controls how all page table
> >>> descriptors are interpreted. This requires some extra care in the PTE
> >>> conversion helpers, and additional handling in the boot code to ensure
> >>> that we set TCR.DS safely if supported (and not overridden)
> >>>
> >>> Note that this series is mostly orthogonal to work by Anshuman done last
> >>> year: this series assumes that 52-bit physical addressing is never
> >>> needed to map the kernel image itself, and therefore that we never need
> >>> ID map range extension to cover the kernel with a 5th level when running
> >>> with 4.
> >>
> >> This limitation will certainly make it trickier to test the LPA2 stage2
> >> implementation that I have done. I've got scripts that construct host systems
> >> with all the RAM above 48 bits so that the output addresses in the stage2 page
> >> tables are guaranteed to contain OAs > 48 bits. I think the workaround here
> >> would be to place the RAM so that it straddles the 48 bit boundary, with just
> >> enough RAM below the boundary to hold the kernel image, and place the kernel
> >> image there. Then the VM's memory will still use the RAM above the threshold.
> >> Or is there a simpler approach?
> >>
> >
> > No, that sounds reasonable. I'm using QEMU which happily lets you put
> > the start of DRAM at any address you can imagine (if you recompile it)
>
> I'm running on FVP, which will let me do this with runtime parameters. Anyway,
> I'll update my tests to cope with this constraint and run this patch set
> through, and I'll let you know if it spots anything.

Excellent, thanks.

> >
> > Another approach could be to simply stick a memblock_reserve()
> > somewhere that covers all 48-bit addressable memory, but you will need
> > some of both in any case.
> >
> >>> And given that the LPA2 architectural feature covers both the
> >>> virtual and physical range extensions, where enabling the latter is
> >>> required to enable the former, we can simplify things further by only
> >>> enabling them as a pair. (I.e., 52-bit physical addressing cannot be
> >>> enabled for 48-bit VA space or smaller)
> >>>
> >>> [...]
> >>
> >> Thanks,
> >> Ryan
> >>
> >>
>
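
PS: regarding the 'extra care in the PTE conversion helpers' mentioned
in the cover letter: with TCR.DS set, OA bits [49:48] live in
descriptor bits [49:48] (previously RES0), and OA bits [51:50] move
into descriptor bits [9:8], which used to hold the shareability field.
A rough sketch of the kind of shuffling this implies (illustrative
only, with made-up helper names, not the actual code in the series):

    #include <linux/bits.h>
    #include <linux/types.h>

    /* Illustrative sketch, not the helpers from the series */
    static inline u64 phys_to_pte_lpa2(phys_addr_t pa)
    {
            /* PA[49:12] map straight through to descriptor bits [49:12] */
            u64 pte = pa & GENMASK_ULL(49, 12);

            /* PA[51:50] are encoded in bits [9:8], the old SH field */
            pte |= (pa >> 42) & GENMASK_ULL(9, 8);

            return pte;
    }

    static inline phys_addr_t pte_to_phys_lpa2(u64 pte)
    {
            return (pte & GENMASK_ULL(49, 12)) |
                   ((pte & GENMASK_ULL(9, 8)) << 42);
    }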


