[RFC PATCH 0/6] KVM: arm64: Errata management for VM Live migration

Shameerali Kolothum Thodi shameerali.kolothum.thodi at huawei.com
Fri Oct 11 03:57:10 PDT 2024



> -----Original Message-----
> From: Marc Zyngier <maz at kernel.org>
> Sent: Friday, October 11, 2024 11:37 AM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi at huawei.com>
> Cc: kvmarm at lists.linux.dev; oliver.upton at linux.dev;
> catalin.marinas at arm.com; will at kernel.org; mark.rutland at arm.com;
> cohuck at redhat.com; eric.auger at redhat.com; yuzenghui
> <yuzenghui at huawei.com>; Wangzhou (B) <wangzhou1 at hisilicon.com>;
> jiangkunkun <jiangkunkun at huawei.com>; Jonathan Cameron
> <jonathan.cameron at huawei.com>; Anthony Jebson
> <anthony.jebson at huawei.com>; linux-arm-kernel at lists.infradead.org;
> Linuxarm <linuxarm at huawei.com>
> Subject: Re: [RFC PATCH 0/6] KVM: arm64: Errata management for VM Live
> migration
> 
> Hi Shameer,
> 
> Thanks for getting the ball rolling on this one, much appreciated.
> 
> On Fri, 11 Oct 2024 08:50:47 +0100,
> Shameer Kolothum <shameerali.kolothum.thodi at huawei.com> wrote:
> >
> > Hi,
> >
> > On ARM64 platforms most of the errata workarounds are based on CPU
> > MIDR/REVIDR values and a number of these workarounds need to be
> > implemented by the Guest kernel as well. This creates a problem when
> > Guest needs to be migrated to a platform that differs in these
> > MIDR/REVIDR values even if the VMM can come up with a common minimum
> > feature list for the Guest using the recently introduced "Writable
> > ID registers" support.
> >
> > (This is roughly based on a discussion I had with Marc and Oliver
> > at KVM forum. Marc outlined his idea for a solution and this is an
> > attempt to implement it. Thanks to both and I take all the blame
> > if this is nowhere near what is intended/required)
> >
> > This RFC proposes a solution to handle the above issue by introducing
> > the following,
> >
> > 1. A new VM IOCTL,
> >    KVM_ARM_SET_MIGRN_TARGET_CPUS  _IOW(KVMIO,  0xb7, struct kvm_arm_migrn_cpus)
> >    This can be used by the userspace(VMM) to set the target CPUs the
> >    Guest will run in its lifetime. See patch #2
> > 2. Add hypercall support for Guest kernel to retrieve any migration
> >    errata bitmap(ARM_SMCCC_VENDOR_HYP_KVM_MIGRN_ERRATA)
> >    The above will return the bitmaps in R0-R3 registers. See patch #4
> > 3. The "capability" field in struct arm64_cpu_capabilities is a generated
> >    one at present and may get renumbered or reordered. Hence, we can't use
> >    this directly for migration errata bitmaps. Instead, this series introduces
> >    "migartion_safe_cap", which has to be set statically for any
> >    erratum that needs to be enabled and is safe for migration
> >    purposes. See patches 3 & 6.
> > 4. The rest of the patches include the plumbing required to populate the
> >    errata bitmap based on the target CPUs set by the VMM and update the
> >    system_cap based on it.
> >
> > ToDos:-
> >   -We still need a way to handle the error in setting the invariant
> >    registers(MIDR/REVIDR/AIDR) during Guest migration. Perhaps we can
> >    handle it in userspace?
> >   -Possibly we could do better to avoid the additional
> >    "migartion_safe_cap" use. Suggestions welcome.
> >   -There are errata that require more than MIDR/REVIDR, eg: CTR_EL0.
> >    How to handle those?
> >   -Check for locking requirements if any.
> >
> > This is lightly tested on a HiSilicon ARM64 platform.
> >
> > Please take a look and let me know your thoughts.
> 
> Having eyeballed this very superficially, I think we can do something
> simpler, and maybe more future-proof:

Thanks Marc for taking a look and the quick feedback.

> 
> - I don't think KVM should be concerned about the description of the
>   target CPUs. The hypercall you defined is the right thing to do,
>   but the VMM should completely handle it. That's an implementation
>   detail, but it would make things much simpler.

Ok. So does that mean the hypercall will use some sort of shared memory
to retrieve the list of target CPUs from the VMM?
 
> - I don't think the "errata bitmap" works. That's a construct that is
>   specific to Linux, and that cannot be supported for other OSs. It
>   also limits the described issues to those the host knows, instead of
>   the guest. The host doesn't have a clue what the guest really wants.
>   Really, the guest should have enough information to decide what to
>   do based on its own view of the ID registers and the list of CPUs it
>   runs on.

Yes, the "errata bitmap" is specific to Linux. So if we go with the above
hypercall-->VMM path and get the target CPU list, the Guest can use that
directly.
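
For illustration, the Guest-side check could look roughly like the C
sketch below. It assumes the Guest has already obtained the target CPU
list as an array of MIDR/REVIDR pairs. struct target_cpu, struct
erratum_match and both helpers are hypothetical names, not part of the
posted series, and REVIDR-based "fixed" bits are ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MIDR_EL1 field layout (architectural). */
#define MIDR_REVISION(m)    ((uint32_t)(m) & 0xf)
#define MIDR_PARTNUM(m)     (((uint32_t)(m) >> 4) & 0xfff)
#define MIDR_VARIANT(m)     (((uint32_t)(m) >> 20) & 0xf)
#define MIDR_IMPLEMENTER(m) (((uint32_t)(m) >> 24) & 0xff)

/* Hypothetical: one entry of the target CPU list handed to the Guest. */
struct target_cpu {
	uint32_t midr;
	uint32_t revidr;	/* unused in this sketch */
};

/* Hypothetical: an erratum keyed on implementer, part and an rNpM range. */
struct erratum_match {
	uint8_t  implementer;
	uint16_t partnum;
	uint8_t  min_var, min_rev;	/* first affected rNpM, inclusive */
	uint8_t  max_var, max_rev;	/* last affected rNpM, inclusive */
};

static bool midr_is_affected(uint32_t midr, const struct erratum_match *e)
{
	uint32_t vr, lo, hi;

	if (MIDR_IMPLEMENTER(midr) != e->implementer ||
	    MIDR_PARTNUM(midr) != e->partnum)
		return false;

	/* Compare variant:revision as one 8-bit value. */
	vr = (MIDR_VARIANT(midr) << 4) | MIDR_REVISION(midr);
	lo = ((uint32_t)e->min_var << 4) | e->min_rev;
	hi = ((uint32_t)e->max_var << 4) | e->max_rev;

	return vr >= lo && vr <= hi;
}

/* The workaround is needed if any CPU the Guest may ever run on is affected. */
static bool erratum_needed(const struct target_cpu *cpus, size_t n,
			   const struct erratum_match *e)
{
	size_t i;

	assert(cpus != NULL || n == 0);

	for (i = 0; i < n; i++)
		if (midr_is_affected(cpus[i].midr, e))
			return true;

	return false;
}
```

With this shape the decision stays entirely in the Guest: the list only
carries raw register values, so a non-Linux Guest can apply its own
erratum tables to the same data.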

> 
> - To answer your question about CTR_EL0: KVM should (and does)
>   sanitise that register by trapping it. This should be the default
>   behaviour for things that need to be mitigated outside of
>   MIDR/REVIDR.

Ok. That makes sense and simplifies things.
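
As a rough sketch of what "sanitising" could mean for the line-size
fields of CTR_EL0, the C below folds several per-CPU values into one,
keeping the smallest IminLine (bits [3:0]) and DminLine (bits [19:16]).
It assumes that the smaller line size is the safe choice and takes all
other fields from the first CPU. ctr_sanitise_line_sizes() is a
hypothetical helper, not the code KVM actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CTR_IMINLINE_SHIFT 0	/* CTR_EL0.IminLine, bits [3:0] */
#define CTR_DMINLINE_SHIFT 16	/* CTR_EL0.DminLine, bits [19:16] */
#define CTR_FIELD_MASK     0xfULL

/* Return the smaller of the two 4-bit fields at 'shift', back in place. */
static uint64_t ctr_field_min(uint64_t a, uint64_t b, unsigned int shift)
{
	uint64_t fa = (a >> shift) & CTR_FIELD_MASK;
	uint64_t fb = (b >> shift) & CTR_FIELD_MASK;

	return (fa < fb ? fa : fb) << shift;
}

/*
 * Hypothetical helper: combine per-CPU CTR_EL0 values so that the result
 * advertises the smallest I-cache and D-cache line sizes seen anywhere.
 */
static uint64_t ctr_sanitise_line_sizes(const uint64_t *ctr, size_t n)
{
	const uint64_t line_mask = (CTR_FIELD_MASK << CTR_IMINLINE_SHIFT) |
				   (CTR_FIELD_MASK << CTR_DMINLINE_SHIFT);
	uint64_t out;
	size_t i;

	assert(ctr != NULL && n > 0);

	out = ctr[0];
	for (i = 1; i < n; i++) {
		uint64_t lines =
			ctr_field_min(out, ctr[i], CTR_IMINLINE_SHIFT) |
			ctr_field_min(out, ctr[i], CTR_DMINLINE_SHIFT);

		out = (out & ~line_mask) | lines;
	}

	return out;
}
```

Each field encodes log2 of the line size in 4-byte words, so a Guest that
does cache maintenance with the smallest advertised stride stays correct
on every CPU in the set.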

Please let me know whether my understanding of the hypercall<->VMM path
is correct. I can take a look at that route.

Thanks,
Shameer



