[PATCH v7 5/7] arm64: vmemmap: Avoid base2 order of struct page size to dimension region

Ard Biesheuvel ardb at kernel.org
Wed Dec 13 06:09:54 PST 2023


On Wed, 13 Dec 2023 at 14:49, Mark Rutland <mark.rutland at arm.com> wrote:
>
> Hi Ard,
>
> On Wed, Dec 13, 2023 at 09:40:30AM +0100, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb at kernel.org>
> >
> > The placement and size of the vmemmap region in the kernel virtual
> > address space is currently derived from the base2 order of the size of a
> > struct page. This makes for nicely aligned constants with lots of
> > leading 0xf and trailing 0x0 digits, but given that the actual struct
> > pages are indexed as an ordinary array, this resulting region is
> > severely overdimensioned when the size of a struct page is just over a
> > power of 2.
> >
> > This doesn't matter today, but once we enable 52-bit virtual addressing
> > for 4k pages configurations, the vmemmap region may take up almost half
> > of the upper VA region with the current struct page upper bound at 64
> > bytes. And once we enable KMSAN or other features that push the size of
> > a struct page over 64 bytes, we will run out of VMALLOC space entirely.
> >
> > So instead, let's derive the region size from the actual size of a
> > struct page, and place the entire region 1 GB from the top of the VA
> > space, where it still doesn't share any lower level translation table
> > entries with the fixmap.
> >
> > Acked-by: Mark Rutland <mark.rutland at arm.com>
> > Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
> > ---
> >  arch/arm64/include/asm/memory.h | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> > index 2745bed8ae5b..b49575a92afc 100644
> > --- a/arch/arm64/include/asm/memory.h
> > +++ b/arch/arm64/include/asm/memory.h
> > @@ -30,8 +30,8 @@
> >   * keep a constant PAGE_OFFSET and "fallback" to using the higher end
> >   * of the VMEMMAP where 52-bit support is not available in hardware.
> >   */
> > -#define VMEMMAP_SHIFT        (PAGE_SHIFT - STRUCT_PAGE_MAX_SHIFT)
> > -#define VMEMMAP_SIZE ((_PAGE_END(VA_BITS_MIN) - PAGE_OFFSET) >> VMEMMAP_SHIFT)
> > +#define VMEMMAP_RANGE        (_PAGE_END(VA_BITS_MIN) - PAGE_OFFSET)
> > +#define VMEMMAP_SIZE ((VMEMMAP_RANGE >> PAGE_SHIFT) * sizeof(struct page))
> >
> >  /*
> >   * PAGE_OFFSET - the virtual address of the start of the linear map, at the
> > @@ -47,8 +47,8 @@
> >  #define MODULES_END          (MODULES_VADDR + MODULES_VSIZE)
> >  #define MODULES_VADDR                (_PAGE_END(VA_BITS_MIN))
> >  #define MODULES_VSIZE                (SZ_2G)
> > -#define VMEMMAP_START                (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
> > -#define VMEMMAP_END          (VMEMMAP_START + VMEMMAP_SIZE)
> > +#define VMEMMAP_START                (VMEMMAP_END - VMEMMAP_SIZE)
> > +#define VMEMMAP_END          (-UL(SZ_1G))
> >  #define PCI_IO_START         (VMEMMAP_END + SZ_8M)
> >  #define PCI_IO_END           (PCI_IO_START + PCI_IO_SIZE)
> >  #define FIXADDR_TOP          (-UL(SZ_8M))
>
> I realise I've acked this already, but my big concern here is still that it's
> hard to see why these don't overlap (though the assert in fixmap.c will save
> us). Usually we try to make that clear by construction, and I think we can do
> that here with something like:
>
> | #define GUARD_VA_SIZE          (UL(SZ_8M))
> |
> | #define FIXADDR_TOP            (-GUARD_VA_SIZE)
> | #define FIXADDR_SIZE_MAX       SZ_8M
> | #define FIXADDR_START_MIN      (FIXADDR_TOP - FIXADDR_SIZE_MAX)
> |
> | #define PCI_IO_END             (FIXADDR_START_MIN - GUARD_VA_SIZE)
> | #define PCI_IO_START           (PCI_IO_END - PCI_IO_SIZE)
> |
> | #define VMEMMAP_END            (ALIGN_DOWN(PCI_IO_START - GUARD_VA_SIZE, SZ_1G))
> | #define VMEMMAP_START          (VMEMMAP_END - VMEMMAP_SIZE)
>
> ... and in fixmap.h have:
>
> /* Ensure the estimate in memory.h was big enough */
> static_assert(FIXADDR_SIZE_MAX > FIXADDR_SIZE);
>
> I might be missing some reason why we can't do that; I locally tried the above
> atop this series with defconfig+4K and defcconfig+64K, and both build and boot
> without sisue.
>

I am really struggling to understand what the issue is that you are
trying to solve here.

What I am proposing is

#define VMEMMAP_END          (-UL(SZ_1G))
#define PCI_IO_START         (VMEMMAP_END + SZ_8M)
#define PCI_IO_END           (PCI_IO_START + PCI_IO_SIZE)
#define FIXADDR_TOP          (-UL(SZ_8M))

and in fixmap.c

static_assert(FIXADDR_TOT_START > PCI_IO_END);

(which sadly has to live outside of asm/memory.h for reasons of #include hell).

This leaves the top 1G for PCI I/O plus fixmap, and ensures that the
latter does not collide with the former.

> Other than that, the series looks good to me.
>

Thanks.

> If you're happy with the above I can go spin that as a patch to apply atop.
>

I won't object to that but I can't say I am convinced of the need.

Thanks,



More information about the linux-arm-kernel mailing list