[PATCH 4/4] KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU

Alexandru Elisei alexandru.elisei at arm.com
Mon Nov 22 06:43:17 PST 2021


Hi Marc,

On Mon, Nov 22, 2021 at 02:21:00PM +0000, Marc Zyngier wrote:
> On Mon, 22 Nov 2021 12:12:17 +0000,
> Alexandru Elisei <alexandru.elisei at arm.com> wrote:
> > 
> > Hi Marc,
> > 
> > On Sun, Nov 21, 2021 at 07:35:13PM +0000, Marc Zyngier wrote:
> > > On Mon, 15 Nov 2021 16:50:41 +0000,
> > > Alexandru Elisei <alexandru.elisei at arm.com> wrote:
> > > > 
> > > > Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
> > > > device ioctl. If the VCPU is scheduled on a physical CPU which has a
> > > > different PMU, the perf events needed to emulate a guest PMU won't be
> > > > scheduled in and the guest performance counters will stop counting. Treat
> > > > it as a userspace error and refuse to run the VCPU in this situation.
> > > > 
> > > > The VCPU is flagged as being scheduled on the wrong CPU in vcpu_load(), but
> > > > the flag is cleared when the KVM_RUN enters the non-preemptible section
> > > > instead of in vcpu_put(); this has been done on purpose so the error
> > > > condition is communicated as soon as possible to userspace, otherwise
> > > > vcpu_load() on the wrong CPU followed by a vcpu_put() could clear the flag.
> > > 
> > > Can we make this something orthogonal to the PMU, and get userspace to
> > > pick an affinity mask independently of instantiating a PMU? I can
> > > imagine this would also be useful for SPE on asymmetric systems.
> > 
> > I actually went this way for the latest version of the SPE series [1] and
> > dropped the explicit userspace ioctl in favor of this mechanism.
> > 
> > The expectation is that userspace already knows which CPUs are associated
> > with the chosen PMU (or SPE) when setting the PMU for the VCPU, and having
> > userspace set it explicitly via an ioctl looks like an unnecessary step to
> > me. I don't see other usecases of an explicit ioctl outside of the above
> > two situations (if userspace wants a VCPU to run only on specific CPUs, it
> > can use thread affinity for that), so I decided to drop it.
> 
> My problem with that is that if you have (for whatever reason) a set
> of affinities that are not strictly identical for both PMU and SPE,
> and expose both of these to a guest, what do you choose?
> 
> As long as you have a single affinity set to take care of, you're
> good. It is when you have several ones that it becomes ugly (as with
> anything involving asymmetric CPUs).

I thought about this when I decided to do it this way. My solution was to do
a cpumask_and() with the existing VCPU cpumask when setting a VCPU feature
that requires it, and to return an error if the result is an empty cpumask,
because in that case userspace is requesting a combination of VCPU features
that is not supported by the hardware.

Going with the other solution (userspace sets the cpumask via an ioctl), KVM
would still have to check it against certain combinations of VCPU features.
For SPE the check is mandatory, so KVM doesn't trigger an undefined
exception; we could skip the check for the PMU, but then what do we gain
from the ioctl if KVM doesn't check that the cpumask matches the PMU? So I
don't think we lose anything by going with the implicit cpumask.

What do you think?

Thanks,
Alex

> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.
