KVM/arm64: Guest ABI changes do not appear rollback-safe

Tue Feb 8 01:46:16 PST 2022

Huh, somewhat missed that email, apologies for the delay.

On Tue, 25 Jan 2022 17:29:13 +0000,
Oliver Upton <oupton at google.com> wrote:
> 
> Hi Marc,
> 
> On Tue, Jan 25, 2022 at 12:46 AM Marc Zyngier <maz at kernel.org> wrote:
> > > If I understand correctly, the original motivation for going with
> > > pseudo-registers was to comply with QEMU, which uses KVM_GET_REG_LIST
> > > and KVM_[GET|SET]_ONE_REG interface, but I'm guessing the VMMs doing
> > > save/restore across migration might write the same values for every
> > > vCPU.
> >
> > KVM currently restricts the vcpu features to be unified across vcpus,
> > but that's only an implementation choice.
> 
> But that implementation choice has become ABI, no? How could support
> for asymmetry be added without requiring userspace opt-in or breaking
> existing VMMs that depend on feature unification?

Of course, you'd need some sort of advertising of this new behaviour.

One thing I would like to add to the current state of things is an
indication of whether the effects of a sysreg being written from
userspace are global or local to a vcpu. You'd need a new capability,
and an extra flag added to the encoding of each register.
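
For illustration only, a minimal sketch of how such a scope flag could sit
in the 64-bit register ID. No such flag or capability exists in the KVM
UAPI: the name HYP_REG_SCOPE_VM and the choice of bit 48 (assumed unused
by arm64 register IDs) are hypothetical. The two masks mirror the generic
ID layout from <linux/kvm.h> and are redefined locally so the snippet
builds anywhere.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors KVM_REG_ARCH_MASK / KVM_REG_SIZE_MASK from <linux/kvm.h>. */
#define REG_ARCH_MASK    0xff00000000000000ULL
#define REG_SIZE_MASK    0x00f0000000000000ULL

/*
 * HYPOTHETICAL flag, not part of any UAPI: when set in an ID reported to
 * userspace, a write to this register affects the whole VM rather than
 * just the vCPU it was issued on. Bit 48 is an arbitrary pick.
 */
#define HYP_REG_SCOPE_VM (1ULL << 48)

/* True if the ID carries the (hypothetical) VM-scope flag. */
static inline bool reg_is_vm_scoped(uint64_t id)
{
	return (id & HYP_REG_SCOPE_VM) != 0;
}

/* Strip the flag to recover the ID as it is encoded today. */
static inline uint64_t reg_base_id(uint64_t id)
{
	return id & ~HYP_REG_SCOPE_VM;
}
```

A VMM that does not know about the flag would never see it, since the
flagged encoding would only be advertised once the (equally hypothetical)
capability has been enabled.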

> 
> > The ARM architecture doesn't
> > mandate that these registers are all the same, and it isn't impossible
> > that we'd allow for the feature set to become per-vcpu at some point
> > in time. So this argument doesn't really hold.
> 
> Accessing per-VM state N times is bound to increase VM blackout time
> during migrations ~linearly as the number of vCPUs in a VM increases,
> since a VM scoped lock is necessary to serialize guest accesses. It
> could be tolerable at present scale, but seems like in the future it
> could become a real problem.

I don't disagree. But I doubt switching to a different API altogether
is the solution to this.

> 
> > Furthermore, compatibility with QEMU's save/restore model is
> > essential, and AFAICT, there is no open source alternative.
> 
> Agree fundamentally, but I believe it is entirely reasonable to
> require a userspace change to adopt a new KVM feature. Otherwise, we
> may be trying to shoehorn new features into existing UAPI that may not
> be a precise fit.

But the very purpose of this API is to support discoverability. If we
can't support it, then we might as well declare the whole API
deprecated and restart from scratch.

No, I'm not seriously suggesting this :-/.

> In order to cure the serialization mentioned above, two options are
> top of mind: accessing the VM state with the VM FD or informing
> userspace that a set of registers need only be written once for an
> entire VM. If we add support for asymmetry later down the road, that
> would become an opt-in such that userspace will do the access
> per-vCPU.

It is the latter that I'm suggesting.
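
A rough sketch of what "written once for an entire VM" could mean for a
VMM restore loop. Everything here is an assumption made for illustration:
the HYP_REG_SCOPE_VM flag is hypothetical, the same saved register list is
reused for every vCPU to keep the example short, and set_reg stands in for
whatever wraps the KVM_SET_ONE_REG ioctl in a real VMM.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL marker for VM-wide registers; not part of any UAPI. */
#define HYP_REG_SCOPE_VM (1ULL << 48)

struct saved_reg {
	uint64_t id;
	uint64_t val;
};

/* Callback that writes one register on one vCPU; returns 0 on success. */
typedef int (*set_reg_fn)(int vcpu, const struct saved_reg *reg);

/* Stand-in callback that accepts every write; useful for counting only. */
static int dry_run_set_reg(int vcpu, const struct saved_reg *reg)
{
	(void)vcpu;
	(void)reg;
	return 0;
}

/*
 * Restore registers on all vCPUs, writing VM-scoped ones through vCPU 0
 * only. Returns the number of writes issued, or -1 on the first failure.
 */
static long restore_regs(int nr_vcpus, const struct saved_reg *regs,
			 size_t nr_regs, set_reg_fn set_reg)
{
	long writes = 0;

	for (int vcpu = 0; vcpu < nr_vcpus; vcpu++) {
		for (size_t i = 0; i < nr_regs; i++) {
			bool vm_scoped = (regs[i].id & HYP_REG_SCOPE_VM) != 0;

			/* VM-wide state was already written via vCPU 0. */
			if (vm_scoped && vcpu != 0)
				continue;
			if (set_reg(vcpu, &regs[i]) != 0)
				return -1;
			writes++;
		}
	}
	return writes;
}
```

With this shape, the number of writes to VM-wide state stays constant as
the vCPU count grows, while per-vCPU registers are still written once per
vCPU. If asymmetry were supported later, a VMM opting in would simply stop
skipping those registers.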

>
> > A device means yet another configuration and migration API. Don't you
> > think we have enough of those? The complexity of KVM/arm64 userspace
> > API is already insane, and extremely fragile. Adding to it will be a
> > validation nightmare (it already is, and I don't see anyone actively
> > helping with it).
> 
> It seems equally fragile to introduce VM-wide serialization to vCPU
> UAPI that we know is in the live migration critical path for _any_
> VMM. Without requiring userspace changes for all the new widgets under
> discussion we're effectively forcing VMMs to do something suboptimal.

I'm perfectly happy to start with suboptimal, and then find ways to
make it better once we have a clear path forward. I just don't want to
conflate the two.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.