[PATCH] arm64: kvm: handle 52-bit VA regions correctly under nVHE

Ard Biesheuvel ardb at kernel.org
Tue Mar 30 13:49:18 BST 2021


On Tue, 30 Mar 2021 at 14:44, Marc Zyngier <maz at kernel.org> wrote:
>
> On Tue, 30 Mar 2021 12:21:26 +0100,
> Ard Biesheuvel <ardb at kernel.org> wrote:
> >
> > Commit f4693c2716b35d08 ("arm64: mm: extend linear region for 52-bit VA
> > configurations") introduced a new layout for the 52-bit VA space, in
> > order to maximize the space available to the linear region. After this
> > change, the kernel VA space is no longer split 1:1 down the middle, and
> > as it turns out, this violates an assumption in the KVM init code when
> > it chooses the layout for the nVHE EL2 mapping.
> >
> > Given that EFI does not support 52-bit VA addressing (as it only
> > supports 4k pages), and that in general, loaders cannot assume that the
> > kernel being loaded supports 52-bit VA/PA addressing in the first place,
> > we can safely assume that the kernel, and therefore the .idmap section,
> > will be 48-bit addressable on 52-bit VA capable systems.
> >
> > So in this case, organize the nVHE EL2 address space as a 2^48 byte
> > window starting at address 0x0, containing the ID map and the
> > hypervisor's private mappings, followed by a contiguous 2^52 - 2^48 byte
> > linear region. (Note that EL1's linear region is 2^52 - 2^47 bytes in
> > size, so it is slightly larger, but this only matters on systems where
> > the DRAM footprint in the physical memory map exceeds 3840 TB)
>
> So if I have memory in the [2^52 - 2^48, 2^52 - 2^47] range, not
> necessarily because I have that much memory, but because my system has
> multiple memory banks, one of which lands on that spot, I cannot map
> such memory at EL2. We'll explode at run time.
>
> Can we keep the private mapping to 47 bits and restore the missing
> chunk to the linear mapping? Of course, it means that the linear map
> is now potentially not linear anymore, so we'd have to guarantee that
> the kernel lives in the first 2^47 bytes instead. Crap.
>

Yeah. The linear region needs to be contiguous. Alternatively, we
could restrict the upper address limit for loading the kernel to 47
bits.

> >
> > Fixes: f4693c2716b35d08 ("arm64: mm: extend linear region for 52-bit VA configurations")
> > Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
> > ---
> >  Documentation/arm64/booting.rst |  6 +++---
> >  arch/arm64/kvm/va_layout.c      | 18 ++++++++++++++----
> >  2 files changed, 17 insertions(+), 7 deletions(-)
> >
> > diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst
> > index 7552dbc1cc54..418ec9b63d2c 100644
> > --- a/Documentation/arm64/booting.rst
> > +++ b/Documentation/arm64/booting.rst
> > @@ -121,8 +121,8 @@ Header notes:
> >                         to the base of DRAM, since memory below it is not
> >                         accessible via the linear mapping
> >                       1
> > -                       2MB aligned base may be anywhere in physical
> > -                       memory
> > +                       2MB aligned base may be anywhere in the 48-bit
> > +                       addressable physical memory region
> >    Bits 4-63  Reserved.
> >    ============= ===============================================================
> >
> > @@ -132,7 +132,7 @@ Header notes:
> >    depending on selected features, and is effectively unbound.
> >
> >  The Image must be placed text_offset bytes from a 2MB aligned base
> > -address anywhere in usable system RAM and called there. The region
> > +address in 48-bit addressable system RAM and called there. The region
> >  between the 2 MB aligned base address and the start of the image has no
> >  special significance to the kernel, and may be used for other purposes.
> >  At least image_size bytes from the start of the image must be free for
> > diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
> > index 978301392d67..e9ab449de197 100644
> > --- a/arch/arm64/kvm/va_layout.c
> > +++ b/arch/arm64/kvm/va_layout.c
> > @@ -62,9 +62,19 @@ __init void kvm_compute_layout(void)
> >       phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
> >       u64 hyp_va_msb;
> >
> > -     /* Where is my RAM region? */
> > -     hyp_va_msb  = idmap_addr & BIT(vabits_actual - 1);
> > -     hyp_va_msb ^= BIT(vabits_actual - 1);
> > +     /*
> > +      * On LVA capable hardware, the kernel is guaranteed to reside
> > +      * in the 48-bit addressable part of physical memory, and so
> > +      * the idmap will be located there as well. Put the EL2 linear
> > +      * region right after it, where it can grow upward to fill the
> > +      * entire 52-bit VA region.
> > +      */
> > +     if (vabits_actual > VA_BITS_MIN) {
> > +             hyp_va_msb = BIT(VA_BITS_MIN);
> > +     } else {
> > +             hyp_va_msb  = idmap_addr & BIT(vabits_actual - 1);
> > +             hyp_va_msb ^= BIT(vabits_actual - 1);
> > +     }
> >
> >       tag_lsb = fls64((u64)phys_to_virt(memblock_start_of_DRAM()) ^
> >                       (u64)(high_memory - 1));
> > @@ -72,7 +82,7 @@ __init void kvm_compute_layout(void)
> >       va_mask = GENMASK_ULL(tag_lsb - 1, 0);
> >       tag_val = hyp_va_msb;
> >
> > -     if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && tag_lsb != (vabits_actual - 1)) {
> > +     if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && tag_lsb < (vabits_actual - 1)) {
> >               /* We have some free bits to insert a random tag. */
> >               tag_val |= get_random_long() & GENMASK_ULL(vabits_actual - 2, tag_lsb);
> >       }
>
> It seems __create_hyp_private_mapping() still refers to (VA_BITS - 1)
> to choose where to allocate the IO mappings, and
> __pkvm_create_private_mapping() relies on similar assumptions, based
> on what hyp_create_idmap() does.
>

That was probably broken already then, given that it should refer to
vabits_actual. I'll address that in a separate patch.
