[PATCH v2 12/19] arm64: mm: Add definitions to support 5 levels of paging

Ryan Roberts ryan.roberts at arm.com
Mon Nov 28 10:20:59 PST 2022


On 28/11/2022 18:00, Marc Zyngier wrote:
> On 2022-11-28 16:22, Ard Biesheuvel wrote:
>> On Mon, 28 Nov 2022 at 17:17, Ryan Roberts <ryan.roberts at arm.com> wrote:
>>>
>>> On 24/11/2022 12:39, Ard Biesheuvel wrote:
>>> > Add the required types and descriptor accessors to support 5 levels of
>>> > paging in the common code. This is one of the prerequisites for
>>> > supporting 52-bit virtual addressing with 4k pages.
>>> >
>>> > Note that this does not cover the code that handles kernel mappings or
>>> > the fixmap.
>>> >
>>> > Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
>>> > ---
>>> >  arch/arm64/include/asm/pgalloc.h       | 41 +++++++++++
>>> >  arch/arm64/include/asm/pgtable-hwdef.h | 22 +++++-
>>> >  arch/arm64/include/asm/pgtable-types.h |  6 ++
>>> >  arch/arm64/include/asm/pgtable.h       | 75 +++++++++++++++++++-
>>> >  arch/arm64/mm/mmu.c                    | 31 +++++++-
>>> >  arch/arm64/mm/pgd.c                    | 15 +++-
>>> >  6 files changed, 181 insertions(+), 9 deletions(-)
>>> >
>>> > diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
>>> > index 237224484d0f..cae8c648f462 100644
>>> > --- a/arch/arm64/include/asm/pgalloc.h
>>> > +++ b/arch/arm64/include/asm/pgalloc.h
>>> > @@ -60,6 +60,47 @@ static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot)
>>> >  }
>>> >  #endif       /* CONFIG_PGTABLE_LEVELS > 3 */
>>> >
>>> > +#if CONFIG_PGTABLE_LEVELS > 4
>>> > +
>>> > +static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t p4dp, pgdval_t prot)
>>> > +{
>>> > +     if (pgtable_l5_enabled())
>>> > +             set_pgd(pgdp, __pgd(__phys_to_pgd_val(p4dp) | prot));
>>> > +}
>>> > +
>>> > +static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
>>> > +{
>>> > +     pgdval_t pgdval = PGD_TYPE_TABLE;
>>> > +
>>> > +     pgdval |= (mm == &init_mm) ? PGD_TABLE_UXN : PGD_TABLE_PXN;
>>> > +     __pgd_populate(pgdp, __pa(p4dp), pgdval);
>>> > +}
>>> > +
>>> > +static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
>>> > +{
>>> > +     gfp_t gfp = GFP_PGTABLE_USER;
>>> > +
>>> > +     if (mm == &init_mm)
>>> > +             gfp = GFP_PGTABLE_KERNEL;
>>> > +     return (p4d_t *)get_zeroed_page(gfp);
>>> > +}
>>> > +
>>> > +static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
>>> > +{
>>> > +     if (!pgtable_l5_enabled())
>>> > +             return;
>>> > +     BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
>>> > +     free_page((unsigned long)p4d);
>>> > +}
>>> > +
>>> > +#define __p4d_free_tlb(tlb, p4d, addr)  p4d_free((tlb)->mm, p4d)
>>> > +#else
>>> > +static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t p4dp, pgdval_t prot)
>>> > +{
>>> > +     BUILD_BUG();
>>> > +}
>>> > +#endif       /* CONFIG_PGTABLE_LEVELS > 4 */
>>> > +
>>> >  extern pgd_t *pgd_alloc(struct mm_struct *mm);
>>> >  extern void pgd_free(struct mm_struct *mm, pgd_t *pgdp);
>>> >
>>> > diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
>>> > index b91fe4781b06..b364b02e696b 100644
>>> > --- a/arch/arm64/include/asm/pgtable-hwdef.h
>>> > +++ b/arch/arm64/include/asm/pgtable-hwdef.h
>>> > @@ -26,10 +26,10 @@
>>> >  #define ARM64_HW_PGTABLE_LEVELS(va_bits) (((va_bits) - 4) / (PAGE_SHIFT - 3))
>>> >
>>> >  /*
>>> > - * Size mapped by an entry at level n ( 0 <= n <= 3)
>>> > + * Size mapped by an entry at level n ( -1 <= n <= 3)
>>> >   * We map (PAGE_SHIFT - 3) at all translation levels and PAGE_SHIFT bits
>>> >   * in the final page. The maximum number of translation levels supported by
>>> > - * the architecture is 4. Hence, starting at level n, we have further
>>> > + * the architecture is 5. Hence, starting at level n, we have further
>>> >   * ((4 - n) - 1) levels of translation excluding the offset within the page.
>>> >   * So, the total number of bits mapped by an entry at level n is :
>>> >   *
>>>
>>> Is it necessary to represent the levels as (-1 - 3) in the kernel or are you
>>> open to switching to (0 - 4)?
>>>
>>> There are a couple of other places where translation level is used, which I
>>> found and fixed up for the KVM LPA2 support work. It got a bit messy to
>>> represent the levels using the architectural range (-1 - 3) so I ended up
>>> representing them as (0 - 4). The main issue was that KVM represents level as
>>> unsigned so that change would have looked quite big.
>>>
>>> Most of this is confined to KVM and the only place it really crosses over with
>>> the kernel is at __tlbi_level(). Which makes me think you might be missing some
>>> required changes (I didn't notice these in your other patches):
>>>
>>> Looking at the TLB management stuff, I think there are some places you will need
>>> to fix up to correctly handle the extra level in the kernel (e.g.
>>> tlb_get_level(), flush_tlb_range()).
>>>
>>> There are some new encodings for level in the FSC field in the ESR. You might
>>> need to update the fault_info array in fault.c to represent these and correctly
>>> handle user space faults for the new level?
>>>
>>
>> Hi Ryan,
>>
>> Thanks for pointing this out. Once I have educated myself a bit more
>> about all of this, I should be able to answer your questions :-)
>>
>> I did not do any user space testing in anger on this series, on the
>> assumption that we already support 52-bit VAs, but I completely missed
>> the fact that the additional level of paging requires additional
>> attention.
>>
>> As for the level indexing: I have a slight preference for sticking
>> with the architectural range, but I don't deeply care either way.
> 
> I'd really like to stick to the architectural representation, as
> there is an ingrained knowledge of the relation between a base
> granule size, a level, and a block mapping size.
> 
> The nice thing about level '-1' is that it preserves this behaviour,
> and doesn't force everyone to adjust. It also makes it extremely
> easy to compare the code and the spec.
> 
> So let's please stick to the [-1;3] range. It will save everyone
> a lot of trouble.

Fair point. It will mean a bigger patch, but I'll rework my stuff to make it all
work with [-1;3] before I post it.
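
For illustration, here is a minimal sketch of the arithmetic that the
architectural [-1;3] numbering keeps intact. It is not taken from the patch:
the helper names level_shift(), hw_pgtable_levels() and start_level() are
hypothetical, and the page shift is passed as a parameter (in the kernel it is
the PAGE_SHIFT constant) so that different granules can be compared. The
formulas follow the comment in pgtable-hwdef.h quoted above: each table level
resolves (PAGE_SHIFT - 3) bits and the final page resolves PAGE_SHIFT bits.

```c
#include <assert.h>

/*
 * Illustrative helpers only; these names do not exist in the kernel.
 * 'page_shift' stands in for PAGE_SHIFT (12 for 4k, 14 for 16k, 16 for 64k).
 */

/* Number of address bits mapped by one entry at architectural level n. */
static inline int level_shift(int page_shift, int level)
{
	/* 'level' must be a signed type so that -1 is representable. */
	assert(level >= -1 && level <= 3);

	/* (4 - n - 1) further table levels plus the in-page offset. */
	return (page_shift - 3) * (4 - level) + 3;
}

/* Number of translation levels needed to cover 'va_bits' of VA space. */
static inline int hw_pgtable_levels(int page_shift, int va_bits)
{
	return (va_bits - 4) / (page_shift - 3);
}

/* Root level in architectural numbering; the leaf level is always 3. */
static inline int start_level(int page_shift, int va_bits)
{
	return 4 - hw_pgtable_levels(page_shift, va_bits);
}
```

With 4k pages, start_level(12, 48) is 0 and start_level(12, 52) is -1, while
level_shift(12, n) for levels 0 to 3 is unchanged between the two
configurations. Under a (0 - 4) numbering, every existing level number would
shift by one instead.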

> 
> Thanks,
> 
>         M.



