[PATCH] arm64: kvm: handle 52-bit VA regions correctly under nVHE
Ard Biesheuvel
ardb at kernel.org
Tue Mar 30 13:49:18 BST 2021
On Tue, 30 Mar 2021 at 14:44, Marc Zyngier <maz at kernel.org> wrote:
>
> On Tue, 30 Mar 2021 12:21:26 +0100,
> Ard Biesheuvel <ardb at kernel.org> wrote:
> >
> > Commit f4693c2716b35d08 ("arm64: mm: extend linear region for 52-bit VA
> > configurations") introduced a new layout for the 52-bit VA space, in
> > order to maximize the space available to the linear region. After this
> > change, the kernel VA space is no longer split 1:1 down the middle, and
> > as it turns out, this violates an assumption in the KVM init code when
> > it chooses the layout for the nVHE EL2 mapping.
> >
> > Given that EFI does not support 52-bit VA addressing (as it only
> > supports 4k pages), and that in general, loaders cannot assume that the
> > kernel being loaded supports 52-bit VA/PA addressing in the first place,
> > we can safely assume that the kernel, and therefore the .idmap section,
> > will be 48-bit addressable on 52-bit VA capable systems.
> >
> > So in this case, organize the nVHE EL2 address space as a 2^48 byte
> > window starting at address 0x0, containing the ID map and the
> > hypervisor's private mappings, followed by a contiguous 2^52 - 2^48 byte
> > linear region. (Note that EL1's linear region is 2^52 - 2^47 bytes in
> > size, so it is slightly larger, but this only matters on systems where
> > the DRAM footprint in the physical memory map exceeds 3968 TB)
>
> So if I have memory in the [2^52 - 2^48, 2^52 - 2^47] range, not
> necessarily because I have that much memory, but because my system has
> multiple memory banks, one of which lands on that spot, I cannot map
> such memory at EL2. We'll explode at run time.
>
> Can we keep the private mapping to 47 bits and restore the missing
> chunk to the linear mapping? Of course, it means that the linear map
> is now potential no linear anymore, so we'd have to garantee that the
> kernel lines in the first 2^47 bits instead. Crap.
>
Yeah. The linear region needs to be contiguous. Alternatively, we
could restrict the upper address limit for loading the kernel to 47
bits.
> >
> > Fixes: f4693c2716b35d08 ("arm64: mm: extend linear region for 52-bit VA configurations")
> > Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
> > ---
> > Documentation/arm64/booting.rst | 6 +++---
> > arch/arm64/kvm/va_layout.c | 18 ++++++++++++++----
> > 2 files changed, 17 insertions(+), 7 deletions(-)
> >
> > diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst
> > index 7552dbc1cc54..418ec9b63d2c 100644
> > --- a/Documentation/arm64/booting.rst
> > +++ b/Documentation/arm64/booting.rst
> > @@ -121,8 +121,8 @@ Header notes:
> > to the base of DRAM, since memory below it is not
> > accessible via the linear mapping
> > 1
> > - 2MB aligned base may be anywhere in physical
> > - memory
> > + 2MB aligned base may be anywhere in the 48-bit
> > + addressable physical memory region
> > Bits 4-63 Reserved.
> > ============= ===============================================================
> >
> > @@ -132,7 +132,7 @@ Header notes:
> > depending on selected features, and is effectively unbound.
> >
> > The Image must be placed text_offset bytes from a 2MB aligned base
> > -address anywhere in usable system RAM and called there. The region
> > +address in 48-bit addressable system RAM and called there. The region
> > between the 2 MB aligned base address and the start of the image has no
> > special significance to the kernel, and may be used for other purposes.
> > At least image_size bytes from the start of the image must be free for
> > diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
> > index 978301392d67..e9ab449de197 100644
> > --- a/arch/arm64/kvm/va_layout.c
> > +++ b/arch/arm64/kvm/va_layout.c
> > @@ -62,9 +62,19 @@ __init void kvm_compute_layout(void)
> > phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
> > u64 hyp_va_msb;
> >
> > - /* Where is my RAM region? */
> > - hyp_va_msb = idmap_addr & BIT(vabits_actual - 1);
> > - hyp_va_msb ^= BIT(vabits_actual - 1);
> > + /*
> > + * On LVA capable hardware, the kernel is guaranteed to reside
> > + * in the 48-bit addressable part of physical memory, and so
> > + * the idmap will be located there as well. Put the EL2 linear
> > + * region right after it, where it can grow upward to fill the
> > + * entire 52-bit VA region.
> > + */
> > + if (vabits_actual > VA_BITS_MIN) {
> > + hyp_va_msb = BIT(VA_BITS_MIN);
> > + } else {
> > + hyp_va_msb = idmap_addr & BIT(vabits_actual - 1);
> > + hyp_va_msb ^= BIT(vabits_actual - 1);
> > + }
> >
> > tag_lsb = fls64((u64)phys_to_virt(memblock_start_of_DRAM()) ^
> > (u64)(high_memory - 1));
> > @@ -72,7 +82,7 @@ __init void kvm_compute_layout(void)
> > va_mask = GENMASK_ULL(tag_lsb - 1, 0);
> > tag_val = hyp_va_msb;
> >
> > - if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && tag_lsb != (vabits_actual - 1)) {
> > + if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && tag_lsb < (vabits_actual - 1)) {
> > /* We have some free bits to insert a random tag. */
> > tag_val |= get_random_long() & GENMASK_ULL(vabits_actual - 2, tag_lsb);
> > }
>
> It seems __create_hyp_private mapping() still refers to (VA_BITS - 1)
> to choose where to allocate the IO mappings, and
> __pkvm_create_private_mapping() relies on similar things based on what
> hyp_create_idmap().
>
That was probably broken already then, given that it should refer to
vabits_actual. I'll address that in a separate patch.
More information about the linux-arm-kernel
mailing list