[PATCH v2 12/19] arm64: mm: Add definitions to support 5 levels of paging

Ryan Roberts ryan.roberts at arm.com
Mon Nov 28 08:17:00 PST 2022


On 24/11/2022 12:39, Ard Biesheuvel wrote:
> Add the required types and descriptor accessors to support 5 levels of
> paging in the common code. This is one of the prerequisites for
> supporting 52-bit virtual addressing with 4k pages.
> 
> Note that this does not cover the code that handles kernel mappings or
> the fixmap.
> 
> Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
> ---
>  arch/arm64/include/asm/pgalloc.h       | 41 +++++++++++
>  arch/arm64/include/asm/pgtable-hwdef.h | 22 +++++-
>  arch/arm64/include/asm/pgtable-types.h |  6 ++
>  arch/arm64/include/asm/pgtable.h       | 75 +++++++++++++++++++-
>  arch/arm64/mm/mmu.c                    | 31 +++++++-
>  arch/arm64/mm/pgd.c                    | 15 +++-
>  6 files changed, 181 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
> index 237224484d0f..cae8c648f462 100644
> --- a/arch/arm64/include/asm/pgalloc.h
> +++ b/arch/arm64/include/asm/pgalloc.h
> @@ -60,6 +60,47 @@ static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot)
>  }
>  #endif	/* CONFIG_PGTABLE_LEVELS > 3 */
>  
> +#if CONFIG_PGTABLE_LEVELS > 4
> +
> +static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t p4dp, pgdval_t prot)
> +{
> +	if (pgtable_l5_enabled())
> +		set_pgd(pgdp, __pgd(__phys_to_pgd_val(p4dp) | prot));
> +}
> +
> +static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
> +{
> +	pgdval_t pgdval = PGD_TYPE_TABLE;
> +
> +	pgdval |= (mm == &init_mm) ? PGD_TABLE_UXN : PGD_TABLE_PXN;
> +	__pgd_populate(pgdp, __pa(p4dp), pgdval);
> +}
> +
> +static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
> +{
> +	gfp_t gfp = GFP_PGTABLE_USER;
> +
> +	if (mm == &init_mm)
> +		gfp = GFP_PGTABLE_KERNEL;
> +	return (p4d_t *)get_zeroed_page(gfp);
> +}
> +
> +static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
> +{
> +	if (!pgtable_l5_enabled())
> +		return;
> +	BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
> +	free_page((unsigned long)p4d);
> +}
> +
> +#define __p4d_free_tlb(tlb, p4d, addr)  p4d_free((tlb)->mm, p4d)
> +#else
> +static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t p4dp, pgdval_t prot)
> +{
> +	BUILD_BUG();
> +}
> +#endif	/* CONFIG_PGTABLE_LEVELS > 4 */
> +
>  extern pgd_t *pgd_alloc(struct mm_struct *mm);
>  extern void pgd_free(struct mm_struct *mm, pgd_t *pgdp);
>  
> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
> index b91fe4781b06..b364b02e696b 100644
> --- a/arch/arm64/include/asm/pgtable-hwdef.h
> +++ b/arch/arm64/include/asm/pgtable-hwdef.h
> @@ -26,10 +26,10 @@
>  #define ARM64_HW_PGTABLE_LEVELS(va_bits) (((va_bits) - 4) / (PAGE_SHIFT - 3))
>  
>  /*
> - * Size mapped by an entry at level n ( 0 <= n <= 3)
> + * Size mapped by an entry at level n ( -1 <= n <= 3)
>   * We map (PAGE_SHIFT - 3) at all translation levels and PAGE_SHIFT bits
>   * in the final page. The maximum number of translation levels supported by
> - * the architecture is 4. Hence, starting at level n, we have further
> + * the architecture is 5. Hence, starting at level n, we have further
>   * ((4 - n) - 1) levels of translation excluding the offset within the page.
>   * So, the total number of bits mapped by an entry at level n is :
>   *

Is it necessary to represent the levels as (-1 - 3) in the kernel, or are you
open to switching to (0 - 4)?

There are a couple of other places where the translation level is used, which I
found and fixed up for the KVM LPA2 support work. Representing the levels using
the architectural range (-1 - 3) got a bit messy, so I ended up representing
them as (0 - 4). The main issue was that KVM represents the level as unsigned,
so switching to a signed range would have been quite a big change.
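
To make the two numbering schemes concrete, here is a rough sketch of the
conversion and of the per-level shift arithmetic from the comment quoted above.
It is for illustration only: the sketch_* names are made up and appear in
neither the patch nor KVM, and 4k pages are assumed.

```c
#include <assert.h>

/* Assumption: 4k pages, the configuration the patch targets for 52-bit VAs. */
#define SKETCH_PAGE_SHIFT	12

/* Architectural levels run from -1 (the new top level) to 3 (last level). */
#define SKETCH_ARCH_LEVEL_MIN	(-1)
#define SKETCH_ARCH_LEVEL_MAX	3

/* Hypothetical helper: architectural level (-1..3) to unsigned index (0..4). */
static inline unsigned int sketch_level_to_index(int arch_level)
{
	assert(arch_level >= SKETCH_ARCH_LEVEL_MIN &&
	       arch_level <= SKETCH_ARCH_LEVEL_MAX);
	return (unsigned int)(arch_level - SKETCH_ARCH_LEVEL_MIN);
}

/* Hypothetical helper: unsigned index (0..4) back to architectural level. */
static inline int sketch_index_to_level(unsigned int index)
{
	assert(index <= SKETCH_ARCH_LEVEL_MAX - SKETCH_ARCH_LEVEL_MIN);
	return (int)index + SKETCH_ARCH_LEVEL_MIN;
}

/*
 * Address bits mapped by one entry at architectural level n:
 * ((4 - n) - 1) further levels of (PAGE_SHIFT - 3) bits each, plus
 * PAGE_SHIFT bits of page offset, i.e. (PAGE_SHIFT - 3) * (4 - n) + 3.
 */
static inline unsigned int sketch_level_shift(int arch_level)
{
	assert(arch_level >= SKETCH_ARCH_LEVEL_MIN &&
	       arch_level <= SKETCH_ARCH_LEVEL_MAX);
	return (SKETCH_PAGE_SHIFT - 3) * (4 - arch_level) + 3;
}
```

With the signed form the shift formula stays as it is, but every unsigned
"level" variable has to become signed; with the 0-based form the types stay
put and the offset is applied wherever the architectural value is needed.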

Most of this is confined to KVM, and the only place it really crosses over with
the kernel is __tlbi_level(). That makes me think you might be missing some
required changes (I didn't notice these in your other patches):

Looking at the TLB management stuff, I think there are some places you will need
to fix up to correctly handle the extra level in the kernel (e.g.
tlb_get_level(), flush_tlb_range()).

There are some new encodings for the level in the FSC field of the ESR. You
might need to update the fault_info array in fault.c to represent these and
correctly handle user space faults for the new level?


> [...]

Thanks,
Ryan