[PATCH RFC] KVM: arm64: allow ID_MMFR4_EL1 to be writable

Fri May 10 08:11:09 PDT 2024

On Wed, May 08 2024, Oliver Upton <oliver.upton at linux.dev> wrote:

> Hi Cornelia,
>
> On Wed, May 08, 2024 at 02:06:36PM +0200, Cornelia Huck wrote:
>> On Thu, May 02 2024, Oliver Upton <oliver.upton at linux.dev> wrote:
>> > I think (1) should only be expected of VMMs that want rollback safety,
>> > i.e. the ability to migrate state back to an older kernel. Our userspace
>> > initializes vCPUs from a fixed set of feature ID register values that
>> > prevents VMs on new kernels from picking up new CPU features.
>> >
>> > It is quite tedious, but necessary as rollback safety is very much a
>> > non-goal of the KVM UAPI.
>> 
>> Depending on your use case, rollback safety might be quite
>> important... have we ever stated exactly which guarantees the KVM UAPI
>> is giving? IOW, can someone implementing a VMM look at a doc and see
>> "oh, if I want to be able to go backwards, I need to be able to deal
>> with x, y, and z coming up on the new kernel"?
>
> The behavior of KVM/arm64 has always been that new VMs get the maximum
> set of vCPU features supported by KVM / hardware modulo the ones we
> require explicit opt-in from userspace (e.g. SVE, vPMU). This behavior
> is described in the arm64 vCPU feature documentation [1].
>
> The biggest benefit of this approach is that new vCPU features can be
> added without a VMM change, as userspace can just treat the registers in
> KVM_GET_REG_LIST as an opaque blob of data that needs to be migrated.

It also needs to actively turn off everything it does not know how to
handle, if migration between different hardware is supposed to be
supported.
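
To make that a bit more concrete, here is a rough sketch of the check a
VMM would need (Python only for brevity; this is not taken from any
existing VMM). It assumes the register IDs come from KVM_GET_REG_LIST,
that the constants match the arm64 uapi header as I read it, and that
the feature ID space is op0==3, op1=={0,1,3}, CRn==0, CRm=={0-7} as the
API documentation describes it. The helper names are made up.

```python
# Assumed to mirror the arm64 KVM uapi header; verify before relying on them.
KVM_REG_ARCH_MASK = 0xFF00000000000000
KVM_REG_ARM64 = 0x6000000000000000
KVM_REG_ARM_COPROC_MASK = 0x000000000FFF0000
KVM_REG_ARM64_SYSREG = 0x0013 << 16


def decode_sysreg(reg_id):
    """Return (op0, op1, crn, crm, op2), or None if not an arm64 sysreg ID."""
    if (reg_id & KVM_REG_ARCH_MASK) != KVM_REG_ARM64:
        return None
    if (reg_id & KVM_REG_ARM_COPROC_MASK) != KVM_REG_ARM64_SYSREG:
        return None
    return (
        (reg_id >> 14) & 0x3,  # op0
        (reg_id >> 11) & 0x7,  # op1
        (reg_id >> 7) & 0xF,   # CRn
        (reg_id >> 3) & 0xF,   # CRm
        reg_id & 0x7,          # op2
    )


def in_feature_id_space(reg_id):
    """True if the register ID falls into the feature ID encoding space."""
    enc = decode_sysreg(reg_id)
    if enc is None:
        return False
    op0, op1, crn, crm, _op2 = enc
    return op0 == 3 and op1 in (0, 1, 3) and crn == 0 and crm <= 7


def unknown_id_regs(reg_list, known):
    """Feature ID registers the kernel exposes that the VMM has no value for."""
    return [r for r in reg_list if in_feature_id_space(r) and r not in known]
```

Anything returned by unknown_id_regs() is a register the VMM cannot
treat as an opaque blob if it cares about the destination: it has to
either know a safe value for it or refuse the configuration.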

>
> I'm willing to wager that the set of users who want to migrate state
> from kernel N -> (N - 1) know the exact CPU definition they want to
> expose to the guest, and in that case should be using a static set of
> feature register values matching their intent.

I think the trouble starts when we introduce additional ranges of
registers that can be controlled via that interface -- old userspace will
be able to figure out that there are more ranges available than what it
is aware of, but will have no idea how to handle those additional ranges
to get into a defined state (what is the actual range, for example?)

>
> Beyond the CPU architecture, KVM presents hypercall features to the VM
> which userspace can _opt-out_ of on a per-feature basis using the
> feature bitmap registers described in [2]. Like the feature ID
> registers, we've preallocated a range of indices to be used for
> hypercall bitmaps. So if an unexpected bitmap register appears on a new
> kernel, userspace should write 0 to it to prevent new features from
> silently creeping in.

Hm, the doc says: "The features for the registers that are untouched,
probably because userspace isn’t aware of them, will be exposed as is to
the guest." Seems to indicate that it is not too hard to get this wrong :)
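
If I understand the suggestion correctly, userspace would have to do
something like the following for every vCPU. This is only a sketch
(Python for brevity); the constants are my reading of the arm64 uapi
header and should be double-checked, and plan_bitmap_writes() and its
arguments are invented names.

```python
# Assumed to mirror the arm64 KVM uapi header; verify before relying on them.
KVM_REG_ARCH_MASK = 0xFF00000000000000
KVM_REG_ARM64 = 0x6000000000000000
KVM_REG_ARM_COPROC_MASK = 0x000000000FFF0000
KVM_REG_ARM_FW_FEAT_BMAP = 0x0016 << 16


def is_fw_feat_bmap(reg_id):
    """True if reg_id lies in the range reserved for hypercall bitmaps."""
    return (
        (reg_id & KVM_REG_ARCH_MASK) == KVM_REG_ARM64
        and (reg_id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW_FEAT_BMAP
    )


def plan_bitmap_writes(reg_list, wanted):
    """Map each bitmap register in reg_list to the value to write.

    reg_list: register IDs as returned by KVM_GET_REG_LIST.
    wanted:   {reg_id: bitmap} for the registers this VMM knows about.
    Bitmap registers missing from `wanted` get 0, so that features the
    VMM has never heard of are not exposed to the guest.
    """
    writes = {}
    for reg_id in reg_list:
        if is_fw_feat_bmap(reg_id):
            writes[reg_id] = wanted.get(reg_id, 0)
    return writes
```

The resulting values would then be written with KVM_SET_ONE_REG before
the vCPU runs for the first time. A VMM that skips the "write 0 to the
unknown ones" part gets exactly the behaviour the doc describes.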

>
> Prescribing the exact combination of these UAPIs to achieve a
> rollback-safe feature set is beyond the scope of the KVM documentation
> and should be determined based on the minimum kernel version that needs
> to work.
>
> [1]: https://docs.kernel.org/virt/kvm/arm/vcpu-features.html#the-id-registers
> [2]: https://docs.kernel.org/virt/kvm/arm/hypercalls.html
>
>> >> I've been bitten before with KVM differences between kernel versions
>> >> in the past - where the number of registers that userspace sees has
>> >> changed despite being on the same hardware.
>> >
>> > This is intended behavior, as VMs are initialized to the maximum feature
>> > set KVM is able to support. Forward-compatibility for the set of exposed
>> > registers is tested, see the get-reg-list selftest.
>> 
>> I've seen this problem come up as well; if it is clear that this is
>> something that KVM expects the VMM to handle if needed, that is fine;
>> should we consider "it's tested in a selftest" as a canonical indicator
>> of "this is what KVM supports compatibility wise"?
>
> It certainly is the level of compatibility that gets actively tested :)

:)

>
> The canonical reason for this behavior, though, is that KVM/arm64
> defaults to the maximum-possible feature set as discussed above.

/me contemplating "reverse" features, but too tired to think this
through on a Friday afternoon.