[RFC PATCH 0/6] KVM: arm64: Errata management for VM Live migration
Shameerali Kolothum Thodi
shameerali.kolothum.thodi at huawei.com
Fri Oct 11 03:57:10 PDT 2024
> -----Original Message-----
> From: Marc Zyngier <maz at kernel.org>
> Sent: Friday, October 11, 2024 11:37 AM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi at huawei.com>
> Cc: kvmarm at lists.linux.dev; oliver.upton at linux.dev;
> catalin.marinas at arm.com; will at kernel.org; mark.rutland at arm.com;
> cohuck at redhat.com; eric.auger at redhat.com; yuzenghui
> <yuzenghui at huawei.com>; Wangzhou (B) <wangzhou1 at hisilicon.com>;
> jiangkunkun <jiangkunkun at huawei.com>; Jonathan Cameron
> <jonathan.cameron at huawei.com>; Anthony Jebson
> <anthony.jebson at huawei.com>; linux-arm-kernel at lists.infradead.org;
> Linuxarm <linuxarm at huawei.com>
> Subject: Re: [RFC PATCH 0/6] KVM: arm64: Errata management for VM Live
> migration
>
> Hi Shameer,
>
> Thanks for getting the ball rolling on this one, much appreciated.
>
> On Fri, 11 Oct 2024 08:50:47 +0100,
> Shameer Kolothum <shameerali.kolothum.thodi at huawei.com> wrote:
> >
> > Hi,
> >
> > On ARM64 platforms most of the errata workarounds are based on CPU
> > MIDR/REVIDR values and a number of these workarounds need to be
> > implemented by the Guest kernel as well. This creates a problem when
> > Guest needs to be migrated to a platform that differs in these
> > MIDR/REVIDR values even if the VMM can come up with a common minimum
> > feature list for the Guest using the recently introduced "Writable
> > ID registers" support.
> >
> > (This is roughly based on a discussion I had with Marc and Oliver
> > at KVM forum. Marc outlined his idea for a solution and this is an
> > attempt to implement it. Thanks to both and I take all the blame
> > if this is nowhere near what is intended/required)
> >
> > This RFC proposes a solution to handle the above issue by introducing
> > the following,
> >
> > 1. A new VM IOCTL,
> > KVM_ARM_SET_MIGRN_TARGET_CPUS _IOW(KVMIO, 0xb7, struct kvm_arm_migrn_cpus)
> > This can be used by the userspace(VMM) to set the target CPUs the
> > Guest will run in its lifetime. See patch #2
> > 2. Add hypercall support for Guest kernel to retrieve any migration
> > errata bitmap(ARM_SMCCC_VENDOR_HYP_KVM_MIGRN_ERRATA)
> > The above will return the bitmaps in R0-R3 registers. See patch #4
> > 3. The "capability" field in struct arm64_cpu_capabilities is a generated
> > one at present and may get renumbered or reordered. Hence, we can't use
> > this directly for migration errata bitmaps. Instead, introduced
> > "migartion_safe_cap", which has to be set statically for any
> > erratum that needs to be enabled and is safe for migration
> > purposes. See patches 3 & 6.
> > 4. Rest of the patches includes the plumbing required to populate the
> > errata bitmap based on the target CPUs set by the VMM and update the
> > system_cap based on it.
> >
> > ToDos:-
> > -We still need a way to handle the error in setting the invariant
> > registers(MIDR/REVIDR/AIDR) during Guest migration. Perhaps we can
> > handle it in userspace?
> > - Possibly we could do better to avoid the additional "migartion_safe_cap" use.
> > Suggestions welcome.
> > -There are errata that require more than MIDR/REVIDR, eg: CTR_EL0.
> > How to handle those?
> > -Check for locking requirements if any.
> >
> > This is lightly tested on a HiSilicon ARM64 platform.
> >
> > Please take a look and let me know your thoughts.
>
> Having eyeballed this very superficially, I think we can do something
> simpler, and maybe more future-proof:
Thanks Marc for taking a look and the quick feedback.
>
> - I don't think KVM should be concerned about the description of the
> target CPUs. The hypercall you defined is the right thing to do,
> but the VMM should completely handle it. That's an implementation
> detail, but it would make things much simpler.
Ok. So does that mean the hypercall will use some sort of shared memory
to retrieve the list of target CPUs from the VMM?
> - I don't think the "errata bitmap" works. That's a construct that is
> specific to Linux, and that cannot be supported for other OSs. It
> also limits the described issues to those the host knows, instead of
> the guest. The host doesn't have a clue what the guest really wants.
> Really, the guest should have enough information to decide what to
> do based on its own view of the ID registers and the list of CPUs it
> runs on.
Yes, the "errata bitmap" is specific to Linux. So if we go with the above
hypercall-->VMM path and get the target CPU list, the Guest can use that
list directly.
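
To make the "Guest uses the target CPU list directly" idea concrete, here is
a rough, untested sketch of the guest-side check. It is only an illustration:
struct target_cpu, struct erratum_range and both helpers are hypothetical
names and not part of this series. The only fixed assumption is the
architectural MIDR_EL1 field layout. REVIDR is carried in the list but not
examined here; a real check would need it for errata fixed by a REVIDR bit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MIDR_EL1 field layout (architectural). */
#define MIDR_REVISION(m)    ((uint32_t)(m) & 0xf)
#define MIDR_PARTNUM(m)     (((uint32_t)(m) >> 4) & 0xfff)
#define MIDR_VARIANT(m)     (((uint32_t)(m) >> 20) & 0xf)
#define MIDR_IMPLEMENTER(m) (((uint32_t)(m) >> 24) & 0xff)

/* Hypothetical: one entry per CPU type the VMM says the guest may run on. */
struct target_cpu {
	uint64_t midr;
	uint64_t revidr;	/* not examined in this sketch */
};

/* Hypothetical: affected model plus an inclusive variant/revision range. */
struct erratum_range {
	uint8_t  implementer;
	uint16_t partnum;
	uint8_t  var_min, rev_min;
	uint8_t  var_max, rev_max;
};

/* True if this one CPU falls inside the erratum's affected range. */
static bool cpu_in_range(const struct target_cpu *cpu,
			 const struct erratum_range *e)
{
	uint32_t vr, lo, hi;

	if (MIDR_IMPLEMENTER(cpu->midr) != e->implementer ||
	    MIDR_PARTNUM(cpu->midr) != e->partnum)
		return false;

	/* Compare variant and revision as one 8-bit "rXpY" value. */
	vr = (MIDR_VARIANT(cpu->midr) << 4) | MIDR_REVISION(cpu->midr);
	lo = ((uint32_t)e->var_min << 4) | e->rev_min;
	hi = ((uint32_t)e->var_max << 4) | e->rev_max;

	return vr >= lo && vr <= hi;
}

/* The workaround is needed if ANY possible target CPU is affected. */
static bool erratum_needed(const struct target_cpu *cpus, size_t nr_cpus,
			   const struct erratum_range *e)
{
	size_t i;

	for (i = 0; i < nr_cpus; i++) {
		if (cpu_in_range(&cpus[i], e))
			return true;
	}
	return false;
}
```

The point is that the Guest would enable a workaround when any CPU in the
list matches, rather than only when the CPU it booted on does.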
>
> - To answer your question about CTR_EL0: KVM should (and does)
> sanitise that register by trapping it. This should be the default
> behaviour for things that need to be mitigated outside of
> MIDR/REVIDR.
Ok. That makes sense and simplifies things.
Please let me know whether my understanding of the hypercall<->VMM path is
correct. I can take a look at that route.
Thanks,
Shameer