[PATCH] KVM: arm64: Cap default IPA size to the host's own size

Marc Zyngier maz at kernel.org
Tue Mar 9 14:46:29 GMT 2021


On Tue, 09 Mar 2021 14:29:10 +0000,
Andrew Jones <drjones at redhat.com> wrote:
> 
> On Tue, Mar 09, 2021 at 01:43:40PM +0000, Marc Zyngier wrote:
> > Hi Andrew,
> > 
> > On Tue, 09 Mar 2021 13:20:21 +0000,
> > Andrew Jones <drjones at redhat.com> wrote:
> > > 
> > > Hi Marc,
> > > 
> > > On Mon, Mar 08, 2021 at 05:46:43PM +0000, Marc Zyngier wrote:
> > > > KVM/arm64 has forever used a 40bit default IPA space, partially
> > > > due to its 32bit heritage (where the only choice is 40bit).
> > > > 
> > > > However, there are implementations in the wild that have a *cough*
> > > > much smaller *cough* IPA space, which leads to a misprogramming of
> > > > VTCR_EL2, and a guest that is stuck on its first memory access
> > > > if userspace dares to ask for the default IPA setting (which most
> > > > VMMs do).
> > > > 
> > > > Instead, cap the default IPA size to what the host can actually
> > > > do, and spit out a one-off message on the console. The boot warning
> > > > is turned into a more meaningful message, and the new behaviour
> > > > is also documented.
> > > > 
> > > > Although this is a userspace ABI change, it doesn't really change
> > > > much for userspace:
> > > > 
> > > > - the guest couldn't run before this change, while it now has
> > > >   a chance to if the memory range fits the reduced IPA space
> > > > 
> > > > - a memory slot that was accepted because it did fit the default
> > > >   IPA space but didn't fit the HW constraints is now properly
> > > >   rejected
> > > 
> > > I'm not sure deferring the misconfiguration error until memslot
> > > request time is better than just failing to create a VM. If
> > > userspace doesn't use KVM_CAP_ARM_VM_IPA_SIZE to determine the
> > > limit (which it hasn't been obliged to do) and it is able to
> > > successfully create a VM, then it will assume up to 40-bit IPAs
> > > are supported. Later, when it tries to add memslots and fails
> > > it may be confused, especially if that later is much, much later
> > > with memory hotplug.
> > 
> > That's a fair point. However, no existing userspace will work on these
> > systems. Is that what we want to do? I don't care much, but having
> > non-usable defaults feels a bit... odd. I do spit out a warning, but I
> > agree this isn't great either.
> 
> I can send patches for QEMU, KVM selftests, and maybe even rust-vmm.
> Can you point me to something about these systems I can reference
> in my postings? Or I can just reference this mail thread.

The system of choice to see this is an Apple M1 box. Not supported in
mainline yet, but things are progressing pretty quickly.

> 
> > 
> > > > The other thing that's left to do is to convince userspace to
> > > > actually use the IPA space setting instead of relying on the
> > > > antiquated default.
> > > 
> > > Failing to create any VM which hasn't selected a valid IPA limit
> > > should be pretty convincing :-)
> > 
> > I'll make sure to redirect the reports your way! :D
> 
> What's the current error message when this occurs? Is it good enough, or
> should we improve it to help provide people hints? Please don't change
> it to "Invalid IPA limit, please mail Andrew Jones" :-)

Well, that's part of the problem. Currently, you don't get a message,
and the guest faults on its first memory access forever (level 0
translation fault), as the VTCR_EL2.T0SZ value is bogus.
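
To illustrate the arithmetic behind the bogus value: T0SZ encodes the
stage-2 input size as 64 minus the number of IPA bits, so the 40bit
default always yields T0SZ = 24, whatever the host can actually
translate. The following is a minimal sketch, not the kernel's code;
the helper names are hypothetical, and any host limit you pass in (for
example 36) is an assumption for illustration rather than a figure
taken from this thread.

```c
/* Default IPA size KVM/arm64 has historically assumed. */
#define DEFAULT_IPA_BITS 40u

/* T0SZ encodes the stage-2 input address size as 64 - IPA bits. */
static unsigned int ipa_bits_to_t0sz(unsigned int ipa_bits)
{
	return 64u - ipa_bits;
}

/*
 * Hypothetical helper: cap the default IPA size to what the host
 * reports, which is the behaviour the patch proposes.
 */
static unsigned int cap_default_ipa_bits(unsigned int host_ipa_bits)
{
	return host_ipa_bits < DEFAULT_IPA_BITS ? host_ipa_bits
						: DEFAULT_IPA_BITS;
}
```

With a host limit below 40 bits, ipa_bits_to_t0sz(DEFAULT_IPA_BITS)
describes a larger input range than the hardware implements, whereas
ipa_bits_to_t0sz(cap_default_ipa_bits(host_bits)) stays within it.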

I can change this patch to reject 40bit IPA when requested as a
default with something saying "Userspace using unsupported default IPA
limit, upgrade your VMM".

Now, there is another nit[1] which I just found with my kvmtool setup
that computes the optimal IPA space for a given VM. And that one is
even more problematic...
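
For readers wondering what "computing the optimal IPA space" can look
like on the VMM side, here is a minimal sketch. It is not kvmtool's
implementation; pick_ipa_bits() is a hypothetical helper. It assumes
the VMM has already obtained the host limit from
KVM_CHECK_EXTENSION(KVM_CAP_ARM_VM_IPA_SIZE) and would pass the result
to KVM_CREATE_VM through KVM_VM_TYPE_ARM_IPA_SIZE().

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the smallest IPA size (in bits) that
 * covers highest_gpa, the highest guest physical address the VMM
 * intends to map (inclusive). Returns -1 if the capability is absent
 * (max_ipa_bits below 32) or the guest layout does not fit.
 */
static int pick_ipa_bits(int max_ipa_bits, uint64_t highest_gpa)
{
	int bits = 32;	/* smallest IPA size KVM accepts */

	if (max_ipa_bits < 32)
		return -1;

	/* Grow until no address bit at or above 'bits' is set. */
	while (bits < 64 && (highest_gpa >> bits) != 0)
		bits++;

	return bits <= max_ipa_bits ? bits : -1;
}
```

Failing early in such a helper gives userspace a clear error at VM
creation time rather than at memslot registration or memory hotplug.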

Thanks,

	M.

[1] https://lore.kernel.org/r/87lfawxv40.wl-maz@kernel.org

-- 
Without deviation from the norm, progress is not possible.


