[PATCH 2/3] arm64: fpsimd: preserve/restore kernel mode NEON at context switch

Ard Biesheuvel ardb at kernel.org
Wed Nov 22 01:30:15 PST 2023


On Tue, 21 Nov 2023 at 21:25, Mark Brown <broonie at kernel.org> wrote:
>
> On Tue, Nov 21, 2023 at 12:45:48PM -0500, Ard Biesheuvel wrote:
> > On Tue, 21 Nov 2023 at 11:56, Mark Brown <broonie at kernel.org> wrote:
> > > This all works but I'm having a hard time loving the fact that it's
> > > not integrated with the cleverness that we're doing with KVM to switch
> > > out where a FP save gets written to, we've open coded a FPSIMD only
> > > register save and we'll have to extend that to support SVE or SME (which
> > > I'm pretty sure we'll want to do eventually).
>
> > How so? KVM only deals with user mode (guest) FPSIMD state, and never
> > uses the FPSIMD unit in kernel mode.
>
> Given that we only have one copy of the FP registers which is shared
> between all ELs I'm just thinking of this in terms of the FP registers,
> what owns them and where we write their contents to rather than that
> they're particularly user registers - the only thing that's specifically
> user mode about the existing fpsimd_save() is that it's got the check
> for TIF_FOREIGN_FPSTATE in there.

The user mode state is preserved eagerly but selectively (i.e., we
only preserve the pieces that were actually live), and restored
lazily, so that we never blow away useful user context and replace it
with something that is never used (either because the task never
returns to user space before being switched out, or because it is a
kernel thread which has no user space to begin with)

All the complexity related to user return, KVM etc is to handle with
these policies.

Kernel mode FPSIMD state is always preserved eagerly and always
restored eagerly, and always involves the contents of the NEON
registers only. So what do we gain by forcing ourselves to use the
complicated logic for a very simple use case that is completely
disjoint?

>  It is also called fpsimd_save() so it
> is where I'd expect to find all the code that saves FP things.

I can rename fpsimd_save() to fpsimd_save_user_state() and add a
fpsimd_save_kernel_state() that encapsulates the fpsimd_save_state()
call if that makes things clearer.

>  I would
> also expect that anything that takes ownership of the registers for use
> in a preemptable context where we might need to save would be using
> bind_state_to_cpu().

Why? This is part of the lazy restore logic that we have no need for.

>  Adding new, independent paths for managing the
> live register state feels like it's making things harder to reason
> about.
>
> To me what this change does is add another potential owner of the live
> FP state, I wouldn't expect the fact that this user is for kernel mode
> to change how we keep track of who owns that state.
>

The difference is that kernel mode FPSIMD state will always be
reloaded eagerly when switching to the task. None of the load/save
logic that reasons about who owns what, whether the current state is
the most recent one or whether it is SVE or SME is needed here.

> > Also, I don't think it makes lots of sense to complicate things now in
> > anticipation of kernel mode SVE/SME, which doesn't seem that
> > compelling to me. (The primary use case for kernel mode NEON is the
> > crypto extensions, which are only accessible via NEON registers to
> > begin with, and which provide a 5x-15x performance boost compared to
> > scalar code. SVE may support those crypto extensions too, but that
> > doesn't necessarily imply they will actually be faster than plain
> > NEON.)
>
> The preparation for kernel mode SVE/SME is as much a nice fallout that I
> noticed as it is a specific goal, my thinking here is more about
> consistency in how we work with the FP state.
>
> I do suspect we might see some useful crypto stuff for them at some
> point as the architecture and implementations advance, but I very much
> agree that it's premature to actually do anything since we've not yet
> seen any concrete examples.
>

Exactly. And if there are any data points you can share, I'm happy to
look into this from the crypto side.

> > Modulo future support for kernel mode SVE/SME, I don't think there is
> > a need for all of this. But of course, I could be the one missing
> > something here.
>
> I'm not sure it's a huge refactor, it's mostly just the rename?  It
> seemed proportionate to the change anyway.  I've not actually *done* it
> though so it might be much more involved than I'm imagining.  I might
> have a look tomorrow if I get time, I need to poke at that code anyway
> for some other things.
>

OK.

> Oh, and BTW I should mention my series for the 2023 DPISA:
>
>    https://lore.kernel.org/linux-arm-kernel/20231114-arm64-2023-dpisa-v2-0-47251894f6a8@kernel.org/
>
> Once that's merged there'll need to be some consideration of FPMR,
> though as far as I'm aware nothing in there impacts any crypto stuff so
> it should't cause any immediate problems to ignore it.

Thanks for the head's up. Agree that this seems only relevant to user space.



More information about the linux-arm-kernel mailing list