[PATCH] arm64: kvm: handle 52-bit VA regions correctly under nVHE

Ard Biesheuvel ardb at kernel.org
Tue Mar 30 14:15:19 BST 2021


On Tue, 30 Mar 2021 at 15:04, Marc Zyngier <maz at kernel.org> wrote:
>
> On Tue, 30 Mar 2021 13:49:18 +0100,
> Ard Biesheuvel <ardb at kernel.org> wrote:
> >
> > On Tue, 30 Mar 2021 at 14:44, Marc Zyngier <maz at kernel.org> wrote:
> > >
> > > On Tue, 30 Mar 2021 12:21:26 +0100,
> > > Ard Biesheuvel <ardb at kernel.org> wrote:
> > > >
> > > > Commit f4693c2716b35d08 ("arm64: mm: extend linear region for 52-bit VA
> > > > configurations") introduced a new layout for the 52-bit VA space, in
> > > > order to maximize the space available to the linear region. After this
> > > > change, the kernel VA space is no longer split 1:1 down the middle, and
> > > > as it turns out, this violates an assumption in the KVM init code when
> > > > it chooses the layout for the nVHE EL2 mapping.
> > > >
> > > > Given that EFI does not support 52-bit VA addressing (as it only
> > > > supports 4k pages), and that in general, loaders cannot assume that the
> > > > kernel being loaded supports 52-bit VA/PA addressing in the first place,
> > > > we can safely assume that the kernel, and therefore the .idmap section,
> > > > will be 48-bit addressable on 52-bit VA capable systems.
> > > >
> > > > So in this case, organize the nVHE EL2 address space as a 2^48 byte
> > > > window starting at address 0x0, containing the ID map and the
> > > > hypervisor's private mappings, followed by a contiguous 2^52 - 2^48 byte
> > > > linear region. (Note that EL1's linear region is 2^52 - 2^47 bytes in
> > > > size, so it is slightly larger, but this only matters on systems where
> > > > the DRAM footprint in the physical memory map exceeds 3968 TB)
> > >
> > > So if I have memory in the [2^52 - 2^48, 2^52 - 2^47] range, not
> > > necessarily because I have that much memory, but because my system has
> > > multiple memory banks, one of which lands on that spot, I cannot map
> > > such memory at EL2. We'll explode at run time.
> > >
> > > Can we keep the private mapping to 47 bits and restore the missing
> > > chunk to the linear mapping? Of course, it means that the linear map
> > > is now potentially not linear anymore, so we'd have to guarantee that
> > > the kernel lives in the first 2^47 bytes instead. Crap.
> > >
> >
> > Yeah. The linear region needs to be contiguous. Alternatively, we
> > could restrict the upper address limit for loading the kernel to 47
> > bits.
>
> Is that something we can do retroactively? We could mandate it for
> LVA systems only, but that's a bit odd.
>

Yeah, especially given the fact that LVA systems will be VHE capable
and may therefore not care in the first place.

On systems that have memory that high, EFI is likely to load the
kernel there, as it usually allocates from the top down, and it tries
to avoid having to move it around unless asked to (via KASLR), in
which case it will currently randomize over the entire available
memory space.

So that would mean adding a special case for a corner^2 case, i.e., nVHE
on 52-bit/64k pages with more than 3968 TB distance between the start
and end of DRAM. Ugh.
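
For reference, the region sizes mentioned in this thread work out as
follows. This is only an arithmetic sketch in Python: the names are made
up for illustration and are not kernel symbols, and offsets are assumed
to be measured from the start of DRAM.

```python
# Sizes are in bytes; TIB converts to binary terabytes (2^40 bytes).
TIB = 1 << 40

# EL1 linear region after the 52-bit VA layout change (from the thread).
el1_linear_size = (1 << 52) - (1 << 47)

# EL2 linear region under the layout proposed in the patch (from the thread).
el2_linear_size = (1 << 52) - (1 << 48)

# The chunk that EL1 can map linearly but EL2 cannot.
missing_chunk = el1_linear_size - el2_linear_size


def fits_in_linear_region(offset, region_size):
    """True if a byte at 'offset' from the start of DRAM is covered."""
    return 0 <= offset < region_size


print(el1_linear_size // TIB)   # 3968
print(el2_linear_size // TIB)   # 3840
print(missing_chunk // TIB)     # 128
```

A memory bank whose offset falls between 3840 TiB and 3968 TiB passes
fits_in_linear_region() for the EL1 size but not for the EL2 size, which
is the case raised above.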

It seems to me that the only way to solve this is to permit the idmap
and the hyp linear region to overlap, and use the 2^47 byte window at
the top of the address space for the hyp private mappings instead of
the one at the bottom.
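
A rough sketch of what that layout could look like, again as plain
Python arithmetic. The exact boundaries are an assumption made for
illustration (in particular, that everything below the private window is
available to both the ID map and the linear region); this is not taken
from any actual patch.

```python
VA_SPACE_SIZE = 1 << 52        # 52-bit EL2 VA space
PRIVATE_WINDOW_SIZE = 1 << 47  # window for the hyp private mappings

# Assumed layout: private mappings occupy the top 2^47 bytes.
private_start = VA_SPACE_SIZE - PRIVATE_WINDOW_SIZE
private_end = VA_SPACE_SIZE

# Everything below is shared by the ID map and the hyp linear region,
# which are allowed to overlap.
shared_start = 0
shared_end = private_start


def in_private_window(va):
    return private_start <= va < private_end


def in_shared_range(va):
    return shared_start <= va < shared_end


# The shared range is as large as EL1's linear region (2^52 - 2^47).
shared_size = shared_end - shared_start

# Hypothetical ID map address: any 48-bit addressable location.
idmap_example = (1 << 48) - (1 << 16)
```

Under these assumptions, any 48-bit addressable ID map address lands in
the shared range and never collides with the private window.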


