[PATCH 11/18] arm64: fpsimd: Split FPSR/FPCR from SVE save/restore

Mark Rutland mark.rutland at arm.com
Wed May 27 06:51:13 PDT 2026


On Tue, May 26, 2026 at 05:28:21PM +0100, Mark Brown wrote:
> On Thu, May 21, 2026 at 02:25:49PM +0100, Mark Rutland wrote:
> > Regardless of whether the vector registers are saved in FPSIMD or SVE
> > format, we store FPSR and FPCR in user_fpsimd_state::{fpsr,fpcr}.
> 
> ...
> 
> > Note that the SVE assembly sequence for restoring FPCR uses an
> > unconditional write to FPCR. The plain FPSIMD assembly sequence has used
> > a conditional write to FPCR since 2014 in commit:
> 
> >   5959e25729a5 ("arm64: fpsimd: avoid restoring fpcr if the contents haven't change")
> 
> > ... but this was not followed for the SVE restore assembly implemented
> > in 2017 in commit:
> 
> >   1fc5dce78ad1 ("arm64/sve: Low-level SVE architectural state manipulation functions")
> 
> > ... so I've assumed that this doesn't actually matter in practice, and
> > implemented the C version matching the existing SVE assembly.
> 
> > For the moment, fpsimd_save_state() and fpsimd_load_state() are left
> > as-is with their own logic to save/restore FPSR and FPCR. This will be
> > unified in subsequent patches.
> 
> There is a possibility that it only matters for older, FPSIMD only CPUs
> or just that nobody got round to benchmarking this on physical CPUs with
> SVE and in fact a similar optimisation is also useful there.

All of that might be true, but that doesn't change my assessment that
this doesn't seem to matter in practice, and given that the overall goal
of this series is to *simplify* things, I'd much rather err towards that
than hypothetical performance concerns.

> I'm a bit wary of dropping the optimisation without any verification
> of the performance impact, but equally I'm not aware of a specific
> benchmark that showed the impact or even if there was one in the first
> place.  The changelog sounds like the optimisation might've been
> written based on inspection alone, I don't know if anyone will
> remember more than a decade later.

From what I remember, the changes in commit 5959e25729a5 were made based
on intuition, inspired by a contemporary retrospective change to the
architecture that made FPCR self-synchronizing. Previously the
architecture required a context synchronization event for the write to
take effect, but implementations happened to be stronger.

The conditional write isn't necessarily a win, because the cost of
recovering from a branch mispredict can be much larger than the cost of
micro-architectural mechanisms to ensure that FPCR is
self-synchronizing.
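
To make the comparison concrete, here is a rough sketch of the two restore
strategies. It is illustrative only and is not the code from this patch: the
model_* names are hypothetical, and a plain variable stands in for the real
FPCR system register so that the sketch can be built and run on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for FPCR; real code would use MRS/MSR accessors. */
static uint64_t model_fpcr;
/* Counts register writes so the two strategies can be compared. */
static unsigned int model_fpcr_writes;

static uint64_t model_read_fpcr(void)
{
	return model_fpcr;
}

static void model_write_fpcr(uint64_t val)
{
	model_fpcr = val;
	model_fpcr_writes++;
}

/* Unconditional restore: always write, no branch. */
static void restore_fpcr_unconditional(uint64_t saved)
{
	model_write_fpcr(saved);
}

/* Conditional restore: read first, write only if the value differs. */
static void restore_fpcr_conditional(uint64_t saved)
{
	if (model_read_fpcr() != saved)
		model_write_fpcr(saved);
}

/* Both strategies must leave the register holding the saved value. */
static void check_restores_agree(uint64_t initial, uint64_t saved)
{
	uint64_t after_unconditional;

	model_fpcr = initial;
	restore_fpcr_unconditional(saved);
	after_unconditional = model_fpcr;

	model_fpcr = initial;
	restore_fpcr_conditional(saved);
	assert(model_fpcr == after_unconditional);
}
```

The two variants are functionally equivalent. The conditional form trades a
register write for a read plus a branch, and that branch is where a
mispredict could cost more than the write it avoids.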

> Having said all that given that a conditional update is simple to
> implement in C it seems safer to add one in the SVE path than to drop
> it from the FPSIMD path.

I agree that if we need to, it would be simple to add this.

For now I'm going to leave this as-is given the rationale I originally
provided. This patch specifically doesn't change the existing behaviour.
I don't think this matters in practice; we haven't consistently applied
this approach to FPCR (or other similar registers), and omitting this
makes the code simpler.

Mark.


