[PATCH RFC] KVM: arm64: allow ID_MMFR4_EL1 to be writable

Wed May 8 05:06:36 PDT 2024

On Thu, May 02 2024, Oliver Upton <oliver.upton at linux.dev> wrote:

> On Thu, May 02, 2024 at 03:40:38PM +0100, Russell King (Oracle) wrote:
>> On Wed, May 01, 2024 at 06:59:17PM +0000, Oliver Upton wrote:
>> > On Wed, May 01, 2024 at 07:08:05PM +0100, Russell King (Oracle) wrote:
>> > > Yes, it did strike me as odd, since the description seems to imply that
>> > > XNX affects EL2, which the VM wouldn't have access to. So I'm not sure
>> > > why we don't just force it to zero.
>> > 
>> > Probably because we failed to catch it in the first place and setting to
>> > 0 now would be even more UAPI breakage. Meh :-/ I don't see any immediate
>> > issues with the patch, especially since it is fixing a genuine UAPI
>> > breakage in KVM.
>> 
>> I think the only two ways around this would be to:
>> 
>> 1) teach QEMU about the contents of these registers, with which fields
>>    in these registers can be ignored when reloading a VM's context.
>> 
>> 2) allow userspace to write to the XNX field such that it can be set
>>    to values seen with previous kernels (thus allowing at least one-
>>    way migration.)
>> 
>> (1) has the advantage that reloading a VM state on older vs newer
>> kernels can work in either direction, whereas (2) would only work
>> for state saved on an older kernel loaded onto a newer kernel.
>
> Yeah, so this is something that has affected my employer as well.
>
> I think (1) should only be expected of VMMs that want rollback safety,
> i.e. the ability to migrate state back to an older kernel. Our userspace
> initializes vCPUs from a fixed set of feature ID register values that
> prevents VMs on new kernels from picking up new CPU features.
>
> It is quite tedious, but necessary as rollback safety is very much a
> non-goal of the KVM UAPI.

Depending on your use case, rollback safety might be quite
important... Have we ever stated exactly which guarantees the KVM UAPI
gives? IOW, can someone implementing a VMM look at a doc and see
"oh, if I want to be able to go backwards, I need to be able to deal
with x, y, and z coming up on the new kernel"?
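
To make option (1) quoted above a bit more concrete, here is a rough
sketch of a VMM-side check that ignores selected fields when comparing a
saved ID register value against the destination's value. This is not
QEMU code: the helper name and the IGNORED_FIELDS table are made up for
illustration, and I am assuming XNX sits in bits [11:8] of ID_MMFR4_EL1.

```python
# Hypothetical VMM-side helper; none of these names are QEMU or KVM APIs.
XNX_SHIFT = 8                      # assumed: ID_MMFR4_EL1.XNX is bits [11:8]
XNX_MASK = 0xF << XNX_SHIFT

# Per-register masks of fields the VMM has decided it can ignore on restore.
IGNORED_FIELDS = {"ID_MMFR4_EL1": XNX_MASK}


def compatible(reg_name, saved, host):
    """Return True if saved and host values match outside the ignored fields."""
    ignore = IGNORED_FIELDS.get(reg_name, 0)
    return (saved & ~ignore) == (host & ~ignore)
```

With such a table, a saved ID_MMFR4_EL1 that differs from the
destination only in XNX would be accepted in either migration direction,
while a difference in any other field would still be rejected.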

>
> OTOH, in cases where KVM screws up and breaks UAPI, the kernel needs to
> do something special to accept the previously-advertised state even if
> it were nonsensical.
>
> For example, there was a bug where KVM advertised an IMP DEF PMU to VMs
> even though the only thing KVM virtualizes is PMUv3. We fixed it in
> commit f90f9360c3d7 ("KVM: arm64: Rewrite IMPDEF PMU version as NI") by
> accepting the old value in the ioctl and changing the field to NI
> internally.
>
> I dislike this sort of hack, but when we're caught between upholding
> UAPI and the architecture it seems to be the best option. I wonder if an
> approach similar to this would be sufficient to address the SPE change
> that you noticed.
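
For readers who have not looked at that commit, the idea described above
can be modelled roughly as below. This is only a sketch of the logic, not
the kernel code: the function name is invented, and it assumes PMUVer is
bits [11:8] of ID_AA64DFR0_EL1 with 0xF meaning IMPLEMENTATION DEFINED
and 0x0 meaning not implemented (NI).

```python
# Illustrative model only; the real handling lives in the kernel's C code.
PMUVER_SHIFT = 8                   # assumed: ID_AA64DFR0_EL1.PMUVer, bits [11:8]
PMUVER_MASK = 0xF << PMUVER_SHIFT
PMUVER_NI = 0x0                    # PMU not implemented
PMUVER_IMP_DEF = 0xF               # IMPLEMENTATION DEFINED PMU


def normalize_dfr0(user_value):
    """Accept a legacy IMP DEF PMUVer from userspace but store it as NI."""
    pmuver = (user_value & PMUVER_MASK) >> PMUVER_SHIFT
    if pmuver == PMUVER_IMP_DEF:
        # Clear the field, then write NI; other fields are left untouched.
        return (user_value & ~PMUVER_MASK) | (PMUVER_NI << PMUVER_SHIFT)
    return user_value
```

The point is that the write of the previously advertised value succeeds,
so state saved on an older kernel still loads, while the guest never
sees the nonsensical value again.
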
>
>> I've been bitten before with KVM differences between kernel versions
>> in the past - where the number of registers that userspace sees has
>> changed despite being on the same hardware.
>
> This is intended behavior, as VMs are initialized to the maximum feature
> set KVM is able to support. Forward-compatibility for the set of exposed
> registers is tested, see the get-reg-list selftest.

I've seen this problem come up as well. If it is clear that KVM expects
the VMM to handle this where needed, that is fine. Should we then
consider "it's tested in a selftest" as the canonical indicator of
"this is what KVM supports compatibility-wise"?
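
As a sketch of what "forward-compatibility for the set of exposed
registers" means from the VMM side: state saved on one kernel can only
be restored if every saved register ID is also exposed on the
destination. The helpers below are hypothetical; in practice the IDs
would come from KVM_GET_REG_LIST on each side, and the integers used
with them here are placeholders.

```python
def missing_registers(saved_ids, host_ids):
    """Register IDs present in the saved state but not exposed by the host."""
    return sorted(set(saved_ids) - set(host_ids))


def can_restore(saved_ids, host_ids):
    """True if the host exposes every register contained in the saved state."""
    return not missing_registers(saved_ids, host_ids)
```

If a newer kernel exposes a superset of the older kernel's list,
can_restore() holds going old to new but not necessarily new to old,
which is exactly the rollback case a VMM would have to handle itself.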