[PATCH v4 0/4] KVM: arm64: Fix underallocation of storage for SVE state

Mon Jul 8 08:30:30 PDT 2024

Hi all,

On Fri, Jul 05, 2024 at 06:18:50PM +0100, Mark Brown wrote:
> On Fri, Jul 05, 2024 at 02:20:05PM +0100, Marc Zyngier wrote:
> > Mark Brown <broonie at kernel.org> wrote:
> 
> > > As observed during review the pKVM support for saving host SVE state is
> > > broken if an asymmetric system has VLs larger than the maximum shared
> > > VL, fix this by discovering then using the maximum VL for allocations
> > > and using RDVL during the save/restore process.
> 
> > I really don't see why we need such complexity here.

The first patch is orthogonal cleanup, and the rest doesn't really add
complexity IIUC.

> > Fuad did post something[1] that did the trick with a far less invasive
> > change, and it really feels like we are putting the complexity at the
> > wrong place.
> 
> > So what's wrong with that approach? I get that you want to shout about
> > secondary CPUs, but that's an orthogonal problem.
> 
> As I've said from a clarity/fragility point of view I'm not happy with
> configuring the vector length to one value then immediately doing things
> that assume another value, even if everything is actually lined up
> in a way that works.  Having uncommented code where you have to go and
> check correctness when you see it isn't great, seeing an inconsistency
> just raises alarm bells.  It is much clearer to write the code in a way
> that makes it obvious that the VL we are using is the one the hardware
> is using, for the host save/restore reading the actual VL back seemed
> like the most straightforward way to do that.
> 
> A similar thing applies with the enumeration code - like I said in reply
> to one of Fuad's postings I originally wrote something that's basically
> the same as the patch Faud posted but because it is not consistent with
> the surrounding code in how it approaches things it was just raising
> questions about if the new code was missing something, or if there was
> some problem that should be addressed in the existing code.  Rather than
> write an extensive changelog and/or comments covering these
> considerations it seemed more better to just write the code in a
> consistent manner so the questions aren't prompted.  Keeping the
> approach consistent is a bit more code right now but makes the result
> much easier to reason about.
> 
> The late CPUs thing is really just an answer to the initial "why is this
> different, what might we have missed?" question rather than a particular
> goal itself.  Adding a warning is as much about documenting the handling
> of late CPUs as it is about highlighting any unfortunate implementation
> choices we run into.
> 
> Basically it's maintainability concerns, especially with the enumeration
> code.

I tend to agree here.

It's probably best to stick to one convention everywhere about how the
SVE regs are laid out for a given VL.  There's nothing wrong with Fuad's
fixed sve_ffr_offset(), but it's different from the VL-dependent offset
already used elsewhere and so risks causing confusion further down the
line.

One thing confuses me:

The host could never use over-max VLs except in non-preemptible kernel
code, since code doing that would be non-migratable to other physical
CPUs.  This is done to probe SVE only, and the extra bits in the vector
registers are never used at all.

Can't pKVM just hide the non symmetrically supported VLs using ZCR_EL2,
just as regular KVM does for the guest?

(I may be making bad assumptions about pKVM's relationship with the host
kernel.)

Cheers
---Dave