[PATCH v7 5/7] arm64: vmemmap: Avoid base2 order of struct page size to dimension region

Mark Rutland mark.rutland at arm.com
Wed Dec 13 05:49:39 PST 2023


Hi Ard,

On Wed, Dec 13, 2023 at 09:40:30AM +0100, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb at kernel.org>
> 
> The placement and size of the vmemmap region in the kernel virtual
> address space is currently derived from the base2 order of the size of a
> struct page. This makes for nicely aligned constants with lots of
> leading 0xf and trailing 0x0 digits, but given that the actual struct
> pages are indexed as an ordinary array, this resulting region is
> severely overdimensioned when the size of a struct page is just over a
> power of 2.
> 
> This doesn't matter today, but once we enable 52-bit virtual addressing
> for 4k pages configurations, the vmemmap region may take up almost half
> of the upper VA region with the current struct page upper bound at 64
> bytes. And once we enable KMSAN or other features that push the size of
> a struct page over 64 bytes, we will run out of VMALLOC space entirely.
> 
> So instead, let's derive the region size from the actual size of a
> struct page, and place the entire region 1 GB from the top of the VA
> space, where it still doesn't share any lower level translation table
> entries with the fixmap.
> 
> Acked-by: Mark Rutland <mark.rutland at arm.com>
> Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
> ---
>  arch/arm64/include/asm/memory.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 2745bed8ae5b..b49575a92afc 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -30,8 +30,8 @@
>   * keep a constant PAGE_OFFSET and "fallback" to using the higher end
>   * of the VMEMMAP where 52-bit support is not available in hardware.
>   */
> -#define VMEMMAP_SHIFT	(PAGE_SHIFT - STRUCT_PAGE_MAX_SHIFT)
> -#define VMEMMAP_SIZE	((_PAGE_END(VA_BITS_MIN) - PAGE_OFFSET) >> VMEMMAP_SHIFT)
> +#define VMEMMAP_RANGE	(_PAGE_END(VA_BITS_MIN) - PAGE_OFFSET)
> +#define VMEMMAP_SIZE	((VMEMMAP_RANGE >> PAGE_SHIFT) * sizeof(struct page))
>  
>  /*
>   * PAGE_OFFSET - the virtual address of the start of the linear map, at the
> @@ -47,8 +47,8 @@
>  #define MODULES_END		(MODULES_VADDR + MODULES_VSIZE)
>  #define MODULES_VADDR		(_PAGE_END(VA_BITS_MIN))
>  #define MODULES_VSIZE		(SZ_2G)
> -#define VMEMMAP_START		(-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
> -#define VMEMMAP_END		(VMEMMAP_START + VMEMMAP_SIZE)
> +#define VMEMMAP_START		(VMEMMAP_END - VMEMMAP_SIZE)
> +#define VMEMMAP_END		(-UL(SZ_1G))
>  #define PCI_IO_START		(VMEMMAP_END + SZ_8M)
>  #define PCI_IO_END		(PCI_IO_START + PCI_IO_SIZE)
>  #define FIXADDR_TOP		(-UL(SZ_8M))

I realise I've acked this already, but my big concern here is still that it's
hard to see why these don't overlap (though the assert in fixmap.c will save
us). Usually we try to make that clear by construction, and I think we can do
that here with something like:

| #define GUARD_VA_SIZE          (UL(SZ_8M))
| 
| #define FIXADDR_TOP            (-GUARD_VA_SIZE)
| #define FIXADDR_SIZE_MAX       SZ_8M
| #define FIXADDR_START_MIN      (FIXADDR_TOP - FIXADDR_SIZE_MAX)
| 
| #define PCI_IO_END             (FIXADDR_START_MIN - GUARD_VA_SIZE)
| #define PCI_IO_START           (PCI_IO_END - PCI_IO_SIZE)
| 
| #define VMEMMAP_END            (ALIGN_DOWN(PCI_IO_START - GUARD_VA_SIZE, SZ_1G))
| #define VMEMMAP_START          (VMEMMAP_END - VMEMMAP_SIZE)

... and in fixmap.h have:

/* Ensure the estimate in memory.h was big enough */
static_assert(FIXADDR_SIZE_MAX > FIXADDR_SIZE);

I might be missing some reason why we can't do that; I locally tried the above
atop this series with defconfig+4K and defcconfig+64K, and both build and boot
without sisue.

Other than that, the series looks good to me.

If you're happy with the above I can go spin that as a patch to apply atop.

Mark.



More information about the linux-arm-kernel mailing list