[PATCH 12/18] arm64: fpsimd: Move fpsimd save/restore inline

Mark Rutland mark.rutland at arm.com
Thu May 28 09:15:14 PDT 2026


On Tue, May 26, 2026 at 05:44:26PM +0100, Mark Brown wrote:
> On Thu, May 21, 2026 at 02:25:50PM +0100, Mark Rutland wrote:
> 
> > Note that I've used the SVE sequence for restoring FPCR, which uses an
> > unconditional write to FPCR. The plain FPSIMD assembly sequence used a
> > conditional write to FPCR since 2014 in commit:
> 
> >   5959e25729a5 ("arm64: fpsimd: avoid restoring fpcr if the contents haven't change")
> 
> > ... but this was not followed for the SVE assembly implemented in 2017
> > in commit:
> 
> >   1fc5dce78ad1 ("arm64/sve: Low-level SVE architectural state manipulation functions")
> 
> > ... so I've assumed that this doesn't actually matter in practice, and
> > I've erred in favour of the simpler sequence.
> 
> As I said on the earlier patch I'm a bit nervous about assuming this
> doesn't matter for anyone without verifying (though I wouldn't be
> surprised if that turned out to be the case) but that's internal to that
> patch and this is obviously a great improvement so:
> 
> Reviewed-by: Mark Brown <broonie at kernel.org>

Based on that discussion on the last patch, I've updated the commit
message for this patch to say:

  I've used the SVE sequence for restoring FPCR, which uses an
  unconditional write to FPCR, rather than the conditional write used by
  the FPSIMD assembly sequence. I believe that in practice, this doesn't
  matter to a real workload, and given it's possible for the
  mis-predicted branch to cost more than the necessary
  micro-architectural synchronization, I strongly suspect any
  performance impact is within the noise. 
  
  Looking at the history, the FPSIMD assembly sequence was changed to
  use a conditional write to FPCR in 2014 in commit:
  
    5959e25729a5 ("arm64: fpsimd: avoid restoring fpcr if the contents haven't change")
  
  ... as described in the commit message, this was based on an
  expectation of implementation style, and was not based on
  benchmarking.
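
For anyone skimming the thread, the difference between the two restore
styles can be sketched in plain C as below. This is only an illustrative
model, not the kernel code: `struct fake_fpcr` and the `fake_*()` and
`restore_fpcr_*()` helpers are hypothetical names standing in for the
real `mrs`/`msr` accesses to FPCR, and the write counter exists purely
to make the difference observable.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the FPCR system register, so that the
 * sketch builds and runs on any host. Real code would access the
 * register with "mrs"/"msr" rather than a struct in memory.
 */
struct fake_fpcr {
	uint64_t value;		/* current register contents */
	unsigned int writes;	/* number of simulated register writes */
};

static uint64_t fake_read_fpcr(const struct fake_fpcr *reg)
{
	return reg->value;
}

static void fake_write_fpcr(struct fake_fpcr *reg, uint64_t val)
{
	reg->value = val;
	reg->writes++;
}

/* SVE-style restore: always write the saved value back. */
static void restore_fpcr_unconditional(struct fake_fpcr *reg, uint64_t saved)
{
	fake_write_fpcr(reg, saved);
}

/*
 * Older FPSIMD-style restore: read the register, compare, and only
 * write when the saved value differs. This avoids the write in the
 * common case at the cost of a read and a branch.
 */
static void restore_fpcr_conditional(struct fake_fpcr *reg, uint64_t saved)
{
	if (fake_read_fpcr(reg) != saved)
		fake_write_fpcr(reg, saved);
}
```

In this model the conditional variant skips the write when the value is
unchanged, while the unconditional variant always performs it. Which of
the two is cheaper on real hardware depends on the cost of the branch
versus the cost of the register write, and the sketch says nothing
about that.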

Mark.


