[PATCH 5/5] KVM: arm64: Exclude FP ownership from kvm_vcpu_arch

Mark Brown broonie at kernel.org
Thu Mar 7 06:26:30 PST 2024


On Thu, Mar 07, 2024 at 11:10:40AM +0000, Marc Zyngier wrote:
> Mark Brown <broonie at kernel.org> wrote:
> > On Wed, Mar 06, 2024 at 09:43:13AM +0000, Marc Zyngier wrote:
> > > Mark Brown <broonie at kernel.org> wrote:
> > > > On Sat, Mar 02, 2024 at 11:19:35AM +0000, Marc Zyngier wrote:

> > > > The SME patch series proposes adding an additional state to this
> > > > enumeration which would say if the registers are stored in a format
> > > > suitable for exchange with userspace, that would make this state part of
> > > > the vCPU state.  With the addition of SME we can have two vector lengths
> > > > in play so the series proposes picking the larger to be the format for
> > > > userspace registers.

> > > What does this addition have anything to do with the ownership of the
> > > physical register file? Not a lot, it seems.

> > > Specially as there better be no state resident on the CPU when
> > > userspace messes up with it.

> > If we have a situation where the state might be stored in memory in
> > multiple formats it seems reasonable to consider the metadata which
> > indicates which format is currently in use as part of the state.

> There is no reason why the state should be in multiple formats
> *simultaneously*. All the FP/SIMD/SVE/SME state is largely
> overlapping, and we only need to correctly invalidate the state that
> isn't relevant to writes from userspace.

I agree that we don't want to store multiple copies, the proposed
implementation for SME does that - when converting between guest native
and userspace formats we discard the original format after conversion
and updates the metadata which says which format is stored.  That
metadata is modeled as adding a new owner.

> > > > We could store this separately to fp_state/owner but it'd still be a
> > > > value stored in the vCPU.

> > > I totally disagree.

> > Where would you expect to see the state stored?

> Sorry, that came out wrong. I expect *some* vcpu state to describe the
> current use of the FP/vector registers, and that's about it. Not the
> ownership information.

Ah, I think the distinction here is that you're modeling the state
tracking updated by this patch as purely tracking the registers in which
case yes, we should just add a separate variable in the vCPU for
tracking the format of the data.  I was modeling the state tracking as
covering both the state in the registers and the state of the storage
backing them (since the guest state being in the registers means that
the state in memory is invalid).

> > > > Storing in a format suitable for userspace
> > > > usage all the time when we've got SME would most likely result in
> > > > performance overhead

> > > What performance overhead? Why should we care?
> > 
> > Since in situations where we're not using the larger VL we would need to
> > load and store the registers using a vector length other than the

...

> > and would instead need to manually compute the memory locations where
> > values are stored.  As well as the extra instructions when using the
> > smaller vector length we'd also be working with sparser data likely over
> > more cache lines.

> Are you talking about a context switch? or userspace accesses? I don't
> give a damn about the latter, as it statistically never happens. The
> former is of course of interest, but you still don't explain why the
> above is a problem.

Well, there will be a cost in any rewriting so I'm trying to shift it
away from context switch to userspace access since as you say they are
negligably frequent.

> Nothing prevent you from storing the registers using the *current* VL,
> since there is no data sharing between the SVE registers and the
> streaming-SVE ones. All you need to do is to make sure you don't mix
> the two.

You seem to be suggesting modeling the streaming mode registers as a
completely different set of registers in the KVM ABI.  That would
certainly be another option, though it's a bit unclear from an
architecture point of view and it didn't seem to entirely fit with the
handling of the FPSIMD registers when SVE is in use.  From an
architecture point of view the streaming mode registers aren't really a
separate set of registers.  When in streaming mode it is not possible to
observe the non-streaming register state and vice versa, and entering or
exiting streaming mode zeros the register state so functionally you just
have floating point registers.

You'd need to handle what happens with access to the inactive register
set from userspace, the simplest thing to implement would be to read the
logical value of zero and either discard or return an error on writes.

> > We would also need to consider if we need to zero the holes in the data
> > when saving, we'd only potentially be leaking information from the guest
> > but it might cause nasty surprises given that transitioning to/from
> > streaming mode is expected to zero values.  If we do need to zero then
> > that would be additional work that would need doing.

> The zeroing is mandated by the architecture, AFAIU. That's not optional.

Yes, we need to zero at some point - we could just do it on userspace
read though I think rather than having to do it when saving.

> > Spending more effort to implement something which also has more runtime
> > performance overhead for the case of saving and restoring guest state
> > which I expect to be vastly more common than the VMM accessing the guest
> > registers just doesn't seem like an appealing choice.

> I don't buy the runtime performance aspect at all. As long as you have
> the space to dump the largest possible VL, you can always dump it in
> the native format.

Sure, my point was that you appeared to be asking to dump in a
non-native format.  

> > If we were storing the data in the native format for the guest then that
> > format will change if streaming mode is changed via a write to SVCR.
> > This would mean that the host would need to understand that when writing
> > values SVCR needs to be written before the Z and P registers.  To be
> > clear I don't think this is a good idea.

> The architecture is crystal clear: you flip SVCR.SM, you loose all
> data in both Z and P regs. If userspace doesn't understand the
> architecture, that's their problem. The only thing we need to provide
> is a faithful emulation of the architecture.

It is not clear to me that the intention behind the KVM register ABI is
that it should have all the ordering requirements that the architecture
has, and decisions like hiding the V registers when SVE is active do
take us away from just a natural architectural point of view.  My
understanding was that it was intended that userspace should be able to
do something like just enumerate all the registers, save them and then
later on restore them without really understanding them.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20240307/388fcb3d/attachment.sig>


More information about the linux-arm-kernel mailing list